This workflow shows an example of four basic data preparation steps: conversion to number and to category, missing value imputation, normalization, and SMOTE oversampling. Notice also the (Apply) nodes in the testing part of the workflow: they reuse the transformations learned on the training set instead of re-estimating them on the test set, which avoids data leakage. The workflow trains a logistic regression for the binary classification problem of churn prediction using the telco dataset. Any other classification algorithm could be used instead of the logistic regression, since the Learner-Predictor construct is common to all supervised algorithms.
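The idea behind the (Apply) nodes can be sketched outside KNIME in a few lines of Python. This is only an illustration under stated assumptions, not the workflow's actual implementation: the function names `fit_preparation` and `apply_preparation` are hypothetical, mean imputation and min-max normalization are assumed choices, and the numbers are made-up toy values rather than the telco data.

```python
from statistics import mean


def fit_preparation(train_values):
    """Learn imputation and normalization parameters from training data only.

    Assumption: mean imputation and min-max normalization.
    """
    observed = [v for v in train_values if v is not None]
    return {"fill": mean(observed), "min": min(observed), "max": max(observed)}


def apply_preparation(values, params):
    """Apply previously learned parameters; nothing is re-estimated here."""
    # Guard against a constant column, where max == min.
    span = (params["max"] - params["min"]) or 1.0
    prepared = []
    for v in values:
        v = params["fill"] if v is None else v  # impute missing values
        prepared.append((v - params["min"]) / span)  # normalize
    return prepared


# Toy values, invented for illustration only.
train = [10.0, None, 30.0, 20.0]
test = [None, 40.0]

params = fit_preparation(train)                  # "learn" step, training set only
train_prepared = apply_preparation(train, params)
test_prepared = apply_preparation(test, params)  # "apply" step on the test set
```

Because the test set is transformed with the training parameters, a test value can fall outside the [0, 1] range. That is expected, and it shows that no information from the test set entered the preparation step.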
Created with KNIME Analytics Platform version 4.3.2