Sampling Strategies Comparison

Workflow


Experiment with:
- simple random sampling
- stratified random sampling (Partitioning node)
- undersampling (Equal Size Sampling node)
- oversampling (Bootstrap Sampling node and SMOTE node)

The workflow draws on the Kaggle Stroke Prediction Dataset, which contains 5110 rows with 11 clinical features such as body mass index, smoking status, age, gender, and glucose level. The task is to predict stroke (yes/no), a binary classification problem. We chose to build a Random Forest model.
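For readers who want to see the sampling ideas outside the workflow editor, here is a minimal Python sketch of simple random, stratified, undersampling, and bootstrap oversampling. It is not how the nodes are implemented. The function names, the (row_id, label) row layout, and the 1000-row dataset with 50 positive cases are all made up for illustration and do not reflect the actual class balance of the stroke data. SMOTE is left out because it needs nearest-neighbour interpolation on the feature values.

```python
import random
from collections import Counter


def group_by_class(rows):
    """Group (row_id, label) pairs by their label."""
    groups = {}
    for row in rows:
        groups.setdefault(row[1], []).append(row)
    return groups


def simple_random_split(rows, train_fraction, rng):
    """Shuffle all rows and cut once, ignoring class labels."""
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def stratified_split(rows, train_fraction, rng):
    """Split each class separately so both partitions keep the class ratio."""
    train, test = [], []
    for members in group_by_class(rows).values():
        part_train, part_test = simple_random_split(members, train_fraction, rng)
        train.extend(part_train)
        test.extend(part_test)
    return train, test


def undersample(rows, rng):
    """Draw from every class, without replacement, the size of the smallest class."""
    groups = group_by_class(rows)
    target = min(len(members) for members in groups.values())
    sampled = []
    for members in groups.values():
        sampled.extend(rng.sample(members, target))
    return sampled


def bootstrap_oversample(rows, rng):
    """Top up smaller classes by drawing their rows with replacement."""
    groups = group_by_class(rows)
    target = max(len(members) for members in groups.values())
    sampled = []
    for members in groups.values():
        sampled.extend(members)
        sampled.extend(rng.choices(members, k=target - len(members)))
    return sampled


def class_counts(rows):
    """Count rows per label."""
    return Counter(label for _, label in rows)


# Hypothetical imbalanced data: 50 positives (label 1) out of 1000 rows.
rng = random.Random(0)
rows = [(i, 1 if i < 50 else 0) for i in range(1000)]

train, test = stratified_split(rows, 0.8, rng)
under = undersample(train, rng)
over = bootstrap_oversample(train, rng)

print("train:", class_counts(train))
print("undersampled:", class_counts(under))
print("oversampled:", class_counts(over))
```

With these made-up inputs, the stratified training partition keeps 40 positives and 760 negatives, undersampling reduces both classes to 40 rows, and bootstrap oversampling raises both to 760. The sketch rebalances only the training partition, a common choice so that the test partition still shows the original class ratio.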

