This workflow illustrates four basic data preparation steps: conversion to number and to category, missing value imputation, normalization, and SMOTE. Note the (Apply) nodes in the testing part of the workflow: they reuse the transformations learned on the training data, which avoids data leakage. The workflow trains a logistic regression model for a binary classification problem, churn prediction on the telco dataset. Any other classification algorithm could be used in place of logistic regression, since the Learner-Predictor construct is common to all supervised algorithms.
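Outside KNIME, the same "learn on training data, apply to test data" pattern can be sketched in plain Python. This is only an illustration under stated assumptions, not the workflow's implementation: the function names, the toy numbers, mean imputation, min-max scaling, and the simplified SMOTE-style interpolation are all hypothetical choices made for this example. The type conversion step and the logistic regression learner are omitted.

```python
import random
from statistics import mean

def fit_imputer(rows):
    """Learn per-column means from the training rows, skipping missing values (None)."""
    return [mean(v for v in col if v is not None) for col in zip(*rows)]

def apply_imputer(rows, means):
    """Fill missing values with means learned earlier (the 'Apply' step)."""
    return [[means[j] if v is None else v for j, v in enumerate(row)] for row in rows]

def fit_minmax(rows):
    """Learn per-column (min, max) from the training rows."""
    return [(min(col), max(col)) for col in zip(*rows)]

def apply_minmax(rows, bounds):
    """Scale with bounds learned earlier; test values may fall outside [0, 1]."""
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, (lo, hi) in zip(row, bounds)] for row in rows]

def smote_like(minority, n_new, k=2, seed=0):
    """Simplified SMOTE-style oversampling: interpolate between a minority
    point and one of its k nearest minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbors)
        t = rng.random()
        synthetic.append([a + t * (b - a) for a, b in zip(base, other)])
    return synthetic

# Toy data (hypothetical values); None marks a missing value, label 1 is the minority class.
train_X = [[1.0, 10.0], [2.0, None], [3.0, 30.0], [4.0, 40.0], [None, 50.0], [6.0, 60.0]]
train_y = [0, 0, 0, 0, 1, 1]
test_X = [[None, 20.0], [5.0, None]]

# Parameters are learned from the training partition only.
means = fit_imputer(train_X)
train_imp = apply_imputer(train_X, means)
bounds = fit_minmax(train_imp)
train_norm = apply_minmax(train_imp, bounds)

# The test partition only has the learned parameters applied to it.
test_norm = apply_minmax(apply_imputer(test_X, means), bounds)

# Oversampling is applied to the training partition only, never to the test data.
minority = [x for x, y in zip(train_X and train_norm, train_y) if y == 1]
n_new = train_y.count(0) - train_y.count(1)
synthetic = smote_like(minority, n_new)
balanced_X = train_norm + synthetic
balanced_y = train_y + [1] * len(synthetic)
```

The point of the split between the `fit_*` and `apply_*` functions is that no statistic is ever computed from the test rows, which is the role the (Apply) nodes play in the workflow.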
Created with KNIME Analytics Platform version 4.3.2