Machine Learning Meta Collection (with KNIME)

Tags: Knime, MachineLearning, Model, Score (+9)
Author: mlauber71
Draft. Latest edits on May 6, 2024 5:37 PM
This meta collection is about machine learning. It contains links to examples demonstrating several types of machine learning, mostly with KNIME, and some links on how to learn machine learning (again mostly with KNIME). It is not a complete collection of ML methods and algorithms and is far from answering all questions or covering all topics. It is more of a quick practical overview of some aspects, always with a focus on Minimal Viable Examples you could try at home.

Please note that these examples do not substitute for a deeper understanding of your business problems and the various (statistical) implications to consider when using such models. In other words: terms and conditions *do* apply.

--------------- Learning Machine Learning (with KNIME) ---------

How to learn machine learning with KNIME
https://forum.knime.com/t/knime-based-machine-learning-course/21876/2?u=mlauber71

[L1-DS] - KNIME Analytics Platform for Data Scientists: Basics - Lesson 4. Machine Learning & Data Export
https://www.knime.com/self-paced-course/l1-ds-knime-analytics-platform-for-data-scientists-basics/lesson4?u=mlauber71

-----------------------------------------------------------------
Links to types of prediction models
https://forum.knime.com/t/how-to-find-the-optimal-process-parameter-based-on-quality-defects/20846/6?u=mlauber71

----- 1) Models for binary classifications - 0/1 or Yes/No Targets
https://forum.knime.com/t/looking-for-options-to-evaluate-a-decision-tree/11384/2?u=mlauber71

Understand metrics like AUC and Gini (and use H2O.ai)
https://forum.knime.com/t/random-forest-model-not-working/12738/3?u=mlauber71
https://forum.knime.com/t/help-choosing-analytics-algorithm/11404/3?u=mlauber71

11 Important Model Evaluation Metrics for Machine Learning Everyone should know
https://www.analyticsvidhya.com/blog/2019/08/11-important-model-evaluation-error-metrics/

----- 2) Model for Multiclass Targets (and explanation of Log Loss statistics)
https://forum.knime.com/t/any-advice-to-improve-the-performance-of-a-classification-model/12801/10?u=mlauber71
https://forum.knime.com/t/metrics-in-multiclass-classification/11193/3?u=mlauber71

Score Documents with multiple Classes?
https://forum.knime.com/t/urgent-what-is-wrong-with-my-decision-tree-predictor-for-new-data/13292/10?u=mlauber71

----- 3) Regression models (numeric Target)
https://forum.knime.com/t/predictive-analytics-for-sales/12858/3?u=mlauber71
https://forum.knime.com/t/forecasting-sales-per-customer-for-the-next-360-days/13221/4?u=mlauber71
https://forum.knime.com/t/evaluate-a-linear-regression-model/13305/2?u=mlauber71
https://forum.knime.com/t/how-to-identify-the-top-100-features-selected-from-mlp-model/11371/2?u=mlauber71

Regression collection (Time Series)
https://forum.knime.com/t/prediction-based-on-multi-variables/20184/5?u=mlauber71

Predict how many future visitors a restaurant will receive (with H2O.ai)
https://www.knime.com/blog/solving-a-kaggle-challenge-using-the-combined-power-of-knime-analytics-platform-h2o?u=mlauber71

------------------------------------------------------------
PMML Models with numeric scores
https://forum.knime.com/t/export-pmml-that-outputs-class-probabilities/13244/2?u=mlauber71

-----------------------------------------------------------------
Data preparation steps

[preparation] Techniques for Dimensionality Reduction
https://hub.knime.com/knime/spaces/Examples/latest/04_Analytics/01_Preprocessing/02_Techniques_for_Dimensionality_Reduction/02_Techniques_for_Dimensionality_Reduction~7PBv1kGifxCng2qo

[preparation] Three New Techniques for Data Dimensionality Reduction in Machine Learning
https://www.knime.com/blog/three-new-techniques-for-data-dimensionality-reduction-in-machine-learning

[preparation] Use R's vtreat to automatically prepare data for classification and regression tasks
https://forum.knime.com/t/is-artificial-intelligence-used-for-data-cleansing-techniques-used-by-knime/36209/6?u=mlauber71

[preparation] Spark Label Encoding, remove highly correlated variables - prepare the data in local Big Data environment
https://hub.knime.com/mlauber71/spaces/Public/latest/kn_example_bigdata_h2o_automl_spark/s_401_spark_label_encoder~mF4g6HTMX7J4m27Q
Prepare data in a big data environment:
- label encode string variables
- transform numbers into Double format (Spark ML likes that)
- remove highly correlated data
- remove NaN variables
- remove continuous variables
- optional: normalize the data

-----------------------------------------------------------------
How to handle missing values

Basic missing value handling
https://hub.knime.com/knime/spaces/Examples/latest/02_ETL_Data_Manipulation/04_Transformation/01_Handling_Missing_Values

Some more advanced approaches to missing values
https://hub.knime.com/knime/spaces/Education/latest/Courses/L4-ML%20Introduction%20to%20Machine%20Learning%20Algorithms/Session_4/02_Solutions/02_Missing_Value_Handling_solution

Multiple Imputation for Missing Values
https://hub.knime.com/kathrin/spaces/Missing%20Value%20Imputation/latest/Mulitple%20Imputation%20for%20Missing%20Values

Comparing Missing Value Handling Methods
https://hub.knime.com/kathrin/spaces/Missing%20Value%20Imputation/latest/Comparing%20Missing%20Value%20Handling%20Methods

Employ R's Amelia package to replace missing values
https://hub.knime.com/mlauber71/spaces/Public/latest/kn_example_r_amelia/m_001_missing_values_amelia

-----------------------------------------------------------------
About unbalanced Targets
https://forum.knime.com/t/xgboost-predictor/23960/5?u=mlauber71

About unbalanced data and evaluation metrics (AUCPR)
https://forum.knime.com/t/problem-with-unbalanced-data-with-examples-attached/26227/4?u=mlauber71

Another thread about how to handle imbalanced data
https://forum.knime.com/t/knime-fraud-detection-autoencoder/28859/17?u=mlauber71

--------------- KNIME and H2O.ai ----------

H2O.ai models and KNIME in general
https://www.knime.com/nodeguide/analytics/h2o-machine-learning?u=mlauber71

Simple example of how to use H2O.ai models in a Big Data environment
https://hub.knime.com/mlauber71/spaces/Public/latest/kn_example_h2o_sparkling_water?u=mlauber71

H2O.ai AutoML in KNIME for classification problems
https://forum.knime.com/t/h2o-ai-automl-in-knime-for-classification-problems/20923?u=mlauber71

H2O.ai AutoML in KNIME for regression problems
https://forum.knime.com/t/h2o-ai-automl-in-knime-for-regression-problems/20924?u=mlauber71

„Sparkling Predictions and Encoded Labels – Developing and Deploying Predictive Models on a Big Data Cluster with KNIME, Spark and H2O.ai“ (talk in German, slides in English)
https://www.youtube.com/watch?v=k8MsxzwEVrk&t=4335s

--------------- KNIME and Python ----------

Use Python and KNIME to make a random forest (quick basic example)
https://hub.knime.com/mlauber71/spaces/Public/latest/kn_example_python_iris?u=mlauber71

Python Installation (the very short story)
https://forum.knime.com/t/problem-with-setting-a-python-deep-learning-environment/19477/2?u=mlauber71
https://forum.knime.com/t/installing-a-new-library-in-python/25365/4?u=mlauber71

Python KNIME official installation
https://docs.knime.com/2020-07/python_installation_guide/index.html?u=mlauber71

Python and Deep Learning
https://docs.knime.com/latest/deep_learning_installation_guide/index.html?u=mlauber71

Python and Anaconda versions / Python and Keras
https://forum.knime.com/t/python-extension-not-recognizing-anaconda-environment-in-knime-3-7/12978/3?u=mlauber71
https://forum.knime.com/t/python-extension-not-recognizing-anaconda-environment-in-knime-3-7/12978/9?u=mlauber71

--------------- Special ----------

Rule Induction with Weka Rule Nodes and Yacaree Associator
https://hub.knime.com/mlauber71/spaces/Public/latest/kn_example_rule_induction_weka_hotspot_and_yacaree_rules?u=mlauber71

Not strictly a KNIME thing but very helpful books and blogs about ML and Python
https://machinelearningmastery.com/

Clustering Algorithms (small collection in KNIME)
https://forum.knime.com/t/ml-techniques-which-one-can-i-use-to-predict-sales-in-a-particular-country/28783/5?u=mlauber71
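
To make the data preparation steps listed for the Spark label-encoding example more concrete, here is a minimal sketch in plain Python. It is not the linked workflow (which runs in Spark) and not a KNIME API. The function names, the dict-of-columns table layout and the 0.95 correlation threshold are assumptions chosen for illustration. It covers label encoding, conversion to floating point, dropping columns that contain NaN (one possible reading of "remove NaN variables") and dropping highly correlated columns. The remaining steps are left out.

```python
import math


def label_encode(values):
    """Map each distinct string to a numeric code, in order of first appearance."""
    codes = {}
    return [float(codes.setdefault(v, len(codes))) for v in values]


def pearson(xs, ys):
    """Pearson correlation of two equally long numeric lists (0.0 if one is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def prepare(table, threshold=0.95):
    """table: dict of column name -> list of values. Returns a cleaned copy.

    The threshold is a hypothetical default, not taken from the workflow.
    """
    # Step 1 and 2: label encode string columns, cast everything else to float.
    numeric = {}
    for name, values in table.items():
        if all(isinstance(v, str) for v in values):
            numeric[name] = label_encode(values)
        else:
            numeric[name] = [float(v) for v in values]

    kept = {}
    for name, values in numeric.items():
        # Step 3: drop columns containing NaN (assumed meaning of "remove NaN variables").
        if any(math.isnan(v) for v in values):
            continue
        # Step 4: drop a column if it is highly correlated with one already kept.
        if any(abs(pearson(values, other)) > threshold for other in kept.values()):
            continue
        kept[name] = values
    return kept


table = {
    "color": ["red", "blue", "red", "green"],
    "size": [1, 2, 3, 4],
    "size_cm": [10, 20, 30, 40],            # perfectly correlated with "size"
    "broken": [1.0, float("nan"), 2.0, 3.0],
}
prepared = prepare(table)
print(list(prepared))
```

Columns are processed in insertion order, so of two highly correlated columns the first one encountered is kept. A real pipeline would probably want a more deliberate rule for choosing which one to drop.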

External resources

  • [preparation] use R's vtreat to automatically prepare data for classification and regression tasks
  • [preparation] Spark Label Encoding, remove highly correlated variables - prepare the data in local Big Data environment
  • [preparation] Three New Techniques for Data Dimensionality Reduction in Machine Learning
  • [preparation] Techniques for Dimensionality Reduction
  • [H2O.ai] „Sparkling Predictions and Encoded Labels – Developing and Deploying Predictive Models on a Big Data Cluster with KNIME, Spark and H2O.ai“ (talk in German, slides in English)
  • [H2O.ai] AutoML in KNIME for regression problems
  • [H2O.ai] AutoML in KNIME for classification problems
  • [H2O.ai] simple example how to use H2O.ai models in a Big Data environment
  • [H2O.ai] models and KNIME in general
  • [unbalanced] another thread about how to handle imbalanced data
  • [unbalanced] about unbalanced data and evaluation metrics (AUCPR)
  • [unbalanced] about unbalanced Targets
  • [missings] Employ R's Amelia package to replace missing values
  • [missings] Comparing Missing Value Handling Methods
  • [missings] Multiple Imputation for Missing Values
  • [missings] some more advanced approaches to missing values
  • [missings] Basic missing value handling
  • PMML Models with numeric scores
  • [regression] predict how many future visitors a restaurant will receive (with H2O.ai)
  • [regression] Regression collection (Time Series)
  • [regression] Regression models (numeric Target) (4)
  • [regression] Regression models (numeric Target) (3)
  • [regression] Regression models (numeric Target) (2)
  • [regression] Regression models (numeric Target) (1)
  • [multiclass] Score Documents with multiple Classes?
  • [multiclass] Model for Multiclass Targets (and explanation of Log Loss statistics) (2)
  • [multiclass] Model for Multiclass Targets (and explanation of Log Loss statistics) (1)
  • [binary] 11 Important Model Evaluation Metrics for Machine Learning Everyone should know
  • [binary] Understand metrics like AUC and Gini (and use H2O.ai) (2)
  • [binary] Understand metrics like AUC and Gini (and use H2O.ai) (1)
  • [binary] Models for binary classifications - 0/1 or Yes/No Targets
  • Links to types of prediction models
  • [L1-DS] - KNIME Analytics Platform for Data Scientists: Basics - Lesson 4. Machine Learning & Data Export
  • How to learn machine learning with KNIME
  • Rule Induction with Weka Rule Nodes and Yacaree Associator
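
Several of the [binary] links above discuss AUC and Gini. As a small illustration of how the two relate, the sketch below computes AUC as the share of (positive, negative) pairs in which the positive case gets the higher score (ties count half) and derives Gini as 2 * AUC - 1. The function names and the toy data are assumptions for illustration. This is not how KNIME or H2O.ai compute these metrics internally, and the pairwise loop only suits small samples.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # A pair counts 1 if the positive is ranked higher, 0.5 on a tie, 0 otherwise.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def gini(labels, scores):
    """Gini coefficient of a binary classifier, derived from AUC."""
    return 2.0 * auc(labels, scores) - 1.0


labels = [0, 0, 1, 1]            # toy 0/1 target
scores = [0.1, 0.4, 0.35, 0.8]   # toy model scores
print(auc(labels, scores), gini(labels, scores))
```

A model that ranks at random has an AUC of about 0.5 and a Gini of about 0. A perfect ranking gives 1.0 for both.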

Used extensions & nodes

All required extensions are part of the default installation of KNIME Analytics Platform version 4.7.8
