This workflow demonstrates four basic data preparation steps: type conversion (to number and to category), missing value imputation, normalization, and SMOTE oversampling. Note the (Apply) nodes in the testing part of the workflow: they reuse the transformations fitted on the training set, which avoids data leakage.
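The idea behind the (Apply) nodes can be sketched outside KNIME as well. The following is a minimal plain-Python illustration, not the workflow's actual implementation: the function names, the two numeric columns, and the toy values are all hypothetical, and the oversampling step is only a simplified SMOTE-style interpolation. It covers imputation, normalization, and oversampling; type conversion is omitted. Preparation parameters are learned on the training rows only and then applied unchanged to the test rows.

```python
import random
import statistics

def fit_preparation(train_rows):
    """Learn imputation and normalization parameters from the training rows only."""
    params = []
    for j in range(len(train_rows[0])):
        observed = [row[j] for row in train_rows if row[j] is not None]
        params.append({
            "median": statistics.median(observed),
            "min": min(observed),
            "max": max(observed),
        })
    return params

def apply_preparation(rows, params):
    """Impute missing values and min-max normalize with previously learned parameters."""
    prepared = []
    for row in rows:
        new_row = []
        for value, p in zip(row, params):
            if value is None:
                value = p["median"]
            span = p["max"] - p["min"]
            new_row.append((value - p["min"]) / span if span else 0.0)
        prepared.append(new_row)
    return prepared

def oversample_minority(rows, n_new, k=2, seed=0):
    """Simplified SMOTE-style step: interpolate between a minority row
    and one of its k nearest minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(rows)
        neighbors = sorted(
            (r for r in rows if r is not base),
            key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)),
        )[:k]
        other = rng.choice(neighbors)
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic

# Hypothetical toy data: two numeric columns, None marks a missing value.
train_X = [[120.0, 1.0], [None, 3.0], [200.0, None],
           [80.0, 0.0], [160.0, 5.0], [240.0, 4.0]]
train_y = [0, 0, 0, 0, 1, 1]          # 1 = churn (minority class)
test_X = [[None, 2.0], [300.0, 1.0]]

params = fit_preparation(train_X)                  # fitted on training data only
train_prepared = apply_preparation(train_X, params)
test_prepared = apply_preparation(test_X, params)  # "Apply": no refitting on test data

# Oversample the minority class in the training set only.
minority = [row for row, label in zip(train_prepared, train_y) if label == 1]
n_new = train_y.count(0) - train_y.count(1)
synthetic = oversample_minority(minority, n_new)
balanced_X = train_prepared + synthetic
balanced_y = train_y + [1] * len(synthetic)
```

Because the minimum and maximum come from the training set, a test value can fall outside the [0, 1] range (here the second test row does). That is expected: refitting the normalization on the test set would leak information about it into the evaluation.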
The workflow trains a logistic regression model for a binary classification problem, churn prediction, using the telco dataset. Any other classification algorithm could be used in place of logistic regression; the Learner-Predictor construct is common to all supervised algorithms.
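The Learner-Predictor construct can be illustrated with a short plain-Python sketch. This is an assumption-laden toy, not what the KNIME nodes do internally: the function names, the single normalized feature, the data values, and the gradient-descent settings are all hypothetical. The learner produces a model from the training set, and the predictor applies that model to rows it has not seen.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def learn_logistic_regression(X, y, lr=0.5, epochs=2000):
    """Learner: fit weights and bias by batch gradient descent on the training set."""
    n_features = len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for row, target in zip(X, y):
            score = sum(w * v for w, v in zip(weights, row)) + bias
            error = sigmoid(score) - target
            for j, v in enumerate(row):
                grad_w[j] += error * v
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return {"weights": weights, "bias": bias}

def predict(model, X, threshold=0.5):
    """Predictor: apply an already fitted model to new rows."""
    predictions = []
    for row in X:
        score = sum(w * v for w, v in zip(model["weights"], row)) + model["bias"]
        predictions.append(1 if sigmoid(score) >= threshold else 0)
    return predictions

# Hypothetical toy data: one normalized feature, 1 = churn.
X_train = [[0.1], [0.2], [0.3], [0.7], [0.8], [0.9]]
y_train = [0, 0, 0, 1, 1, 1]
X_test = [[0.05], [0.95]]

model = learn_logistic_regression(X_train, y_train)  # Learner
test_predictions = predict(model, X_test)            # Predictor
```

Swapping in a different classifier would change only the learner and the model it returns; the split between fitting on training data and predicting on test data stays the same.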
Workflow
Four basic steps in Data Preparation before Training a Churn Predictor
Used extensions & nodes
Created with KNIME Analytics Platform version 4.3.2