Data Preparation
This workflow prepares the data for the next workflow ("My first Data Model") and uses some of the most common nodes for data preparation:
Applying different strategies for missing values (Missing Value node)
Creating subsets of the data (Row Sampler and Table Partitioner nodes)
Shuffling (Shuffle node)
Concatenating data sets (Concatenate node)
Normalizing data (Normalizer and Normalizer (Apply) nodes)
After preprocessing, the workflow writes the two subsets to .csv files: one for the training set (top partition) and one for the test set (bottom partition).
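For readers who think in code, the following is a minimal Python sketch of the same ideas: filling missing values, shuffling and partitioning, and normalizing the test set with the parameters learned on the training set (the role of the Normalizer and Normalizer (Apply) pair). It is not how the KNIME nodes are implemented. The column name `age`, the toy values, the mean-imputation strategy, the 80/20 split, and min-max scaling are all assumptions chosen for illustration.

```python
import random
from statistics import mean


def fill_missing(rows, column):
    """Replace None in `column` with the mean of the observed values."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = mean(observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


def partition(rows, train_fraction, seed):
    """Shuffle a copy of the rows, then split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_min_max(rows, column):
    """Learn the normalization parameters (min, max) from one table."""
    values = [r[column] for r in rows]
    return min(values), max(values)


def apply_min_max(rows, column, model):
    """Apply previously learned (min, max) parameters to any table."""
    lo, hi = model
    span = (hi - lo) or 1.0  # avoid division by zero for constant columns
    return [{**r, column: (r[column] - lo) / span} for r in rows]


# Hypothetical toy data with two missing values.
rows = [{"age": a} for a in [20, None, 40, 60, 30, None, 50, 70, 80, 10]]

rows = fill_missing(rows, "age")
train, test = partition(rows, train_fraction=0.8, seed=42)

# Fit on the training set only, then reuse the same parameters for the
# test set. Test values may therefore fall slightly outside [0, 1].
model = fit_min_max(train, "age")
train_norm = apply_min_max(train, "age", model)
test_norm = apply_min_max(test, "age", model)
```

Fitting the normalization on the training partition alone keeps information about the test data out of the preprocessing, which is why the sketch separates the fit step from the apply step.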