
H2O Isolation Forest for Outlier Detection

H2O Machine learning Isolation Forest Outlier Anomaly

This tutorial shows how to train an H2O model in KNIME. We train an Isolation Forest model to detect fraud, i.e. outliers or anomalies, in a credit card dataset (https://www.kaggle.com/mlg-ulb/creditcardfraud).

  1. Prepare: Load the data and import the resulting KNIME table into H2O.
  2. Learn: Train the Isolation Forest model with the H2O Isolation Forest Learner node. We ask H2O to build 100 trees; all other model parameters are H2O's defaults.
  3. Predict: Make predictions on the same data using your model(s). The output contains the predictions (normalized anomaly score) and the mean lengths of the predicted decision tree paths. For further processing, we convert the H2O Frame back to a KNIME table.
  4. Evaluate: Convert the "Class" column to a string column so that it can be used as the class column in the ROC Curve (JavaScript) node. Its view gives us a visualization of the performance of our model(s).
  5. Classify: If we know that about 5 percent of our data rows are anomalies, we can calculate the 95th percentile (the 0.95 quantile). The Rule Engine node can use this quantile as a threshold to classify each row as either an anomaly or not.

For legal reasons we are not allowed to ship the dataset from Kaggle with our workflow. To get access to the data, you have to sign in to Kaggle and accept the conditions of participation for the competition. Afterwards you can download the data, unzip it, adjust the path pointing to the files in the node dialog of the CSV Reader node, and run the workflow. You can find the Kaggle challenge here: https://www.kaggle.com/mlg-ulb/creditcardfraud.
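The Evaluate and Classify steps can be illustrated outside KNIME with a minimal Python sketch. It is not the workflow's implementation: the function names (`quantile`, `classify_by_quantile`, `roc_auc`) are hypothetical, the quantile uses linear interpolation (the KNIME nodes may use a different estimator), and it assumes that a higher score means "more anomalous", as with a normalized anomaly score.

```python
from typing import List, Sequence


def quantile(values: Sequence[float], q: float) -> float:
    """Return the q-quantile (0 <= q <= 1) using linear interpolation."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be in [0, 1]")
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)          # fractional index into the sorted data
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


def classify_by_quantile(
    scores: Sequence[float], expected_anomaly_rate: float = 0.05
) -> List[str]:
    """Label rows whose score exceeds the (1 - rate)-quantile as anomalies.

    Assumption: higher score = more anomalous.
    """
    threshold = quantile(scores, 1.0 - expected_anomaly_rate)
    return ["anomaly" if s > threshold else "normal" for s in scores]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve as the share of (anomaly, normal) pairs in
    which the anomaly scores higher; ties count as half.

    Quadratic in the number of rows, so meant for illustration only.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up scores purely for demonstration; not taken from the dataset.
    demo_scores = [i / 100 for i in range(100)]
    demo_labels = classify_by_quantile(demo_scores, expected_anomaly_rate=0.05)
    print(demo_labels.count("anomaly"), "rows flagged as anomalies")
```

With an expected anomaly rate of 5 percent, the threshold is the 0.95 quantile of the scores, so roughly the top 5 percent of rows are flagged. If the mean path length were used as the score instead, the direction would flip, because shorter paths indicate anomalies.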

External resources

  • H2O Isolation Forest documentation

Used extensions & nodes

Created with KNIME Analytics Platform version 4.1.0
  • KNIME Core (trusted extension)
    KNIME AG, Zurich, Switzerland
    Version 4.1.0

  • KNIME H2O Machine Learning Integration (trusted extension)
    KNIME AG, Zurich, Switzerland
    Version 4.1.0

  • KNIME JavaScript Views (trusted extension)
    KNIME AG, Zurich, Switzerland
    Version 4.1.0

Legal

By using or downloading the workflow, you agree to our terms and conditions.

KNIME
Open for Innovation

KNIME AG
Hardturmstrasse 66
8005 Zurich, Switzerland
© 2022 KNIME AG. All rights reserved.