Gradient Boosted Trees Learner

Learner

Learns Gradient Boosted Trees for classification. The algorithm builds an ensemble of very shallow regression trees using a special form of boosting. The implementation follows the algorithm in section 4.6 of the paper "Greedy Function Approximation: A Gradient Boosting Machine" by Jerome H. Friedman (1999).
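The general idea can be illustrated with a minimal sketch. It is not the node's implementation: it assumes a binary target, a single numeric attribute, depth-1 trees (stumps) and a Newton-style leaf update in the spirit of Friedman's logistic variant. All function names and parameters (fit_stump, fit_gbt, learning_rate, ...) are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sum_squared_error(values):
    """Squared error of the values around their mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def leaf_value(residuals, hessians):
    """Newton-style leaf estimate: sum of residuals / sum of p * (1 - p)."""
    return sum(residuals) / max(sum(hessians), 1e-12)

def fit_stump(xs, residuals, hessians):
    """Fit a depth-1 regression tree to the residuals on one numeric attribute.

    Returns (threshold, left_value, right_value).
    """
    values = sorted(set(xs))
    best = None
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2.0
        left = [i for i, x in enumerate(xs) if x <= thr]
        right = [i for i, x in enumerate(xs) if x > thr]
        err = (sum_squared_error([residuals[i] for i in left])
               + sum_squared_error([residuals[i] for i in right]))
        if best is None or err < best[0]:
            best = (
                err,
                thr,
                leaf_value([residuals[i] for i in left], [hessians[i] for i in left]),
                leaf_value([residuals[i] for i in right], [hessians[i] for i in right]),
            )
    if best is None:
        # Only one distinct attribute value: no split is possible.
        return values[0], 0.0, 0.0
    return best[1], best[2], best[3]

def fit_gbt(xs, ys, n_trees=20, learning_rate=0.5):
    """Boost stumps on the gradient of the logistic loss (ys must be 0 or 1)."""
    scores = [0.0] * len(xs)
    trees = []
    for _ in range(n_trees):
        probs = [sigmoid(s) for s in scores]
        residuals = [y - p for y, p in zip(ys, probs)]   # negative gradient
        hessians = [p * (1.0 - p) for p in probs]
        thr, left_val, right_val = fit_stump(xs, residuals, hessians)
        left_val *= learning_rate
        right_val *= learning_rate
        trees.append((thr, left_val, right_val))
        scores = [s + (left_val if x <= thr else right_val)
                  for s, x in zip(scores, xs)]
    return trees

def predict_proba(trees, x):
    """Probability of class 1: sigmoid of the summed tree outputs."""
    return sigmoid(sum(lv if x <= thr else rv for thr, lv, rv in trees))

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
model = fit_gbt(xs, ys)
print(predict_proba(model, 2.0), predict_proba(model, 7.0))
```

Each tree is fit to the residuals of the current ensemble rather than to the class labels, which is why regression trees serve as base learners even though the task is classification.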

The base learner for this ensemble method is a simple regression tree, as used in the Tree Ensemble, Random Forest and Simple Regression Tree nodes. By default, a tree is built using binary splits for numeric and nominal attributes (the latter can be changed to multiway splits). The built-in missing value handling determines the best direction for missing values by testing each possible direction and selecting the one that yields the best result (i.e. the largest gain).
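The missing value handling described above can be sketched as follows. This is an illustration under assumptions, not the node's code: it assumes a binary split on one numeric attribute, missing values encoded as None, and a sum-of-squares gain; the gain criterion the node actually uses is not stated here. The names split_gain and best_missing_direction are hypothetical.

```python
def split_gain(left_targets, right_targets):
    """Sum-of-squares gain of a split (assumed criterion, for illustration)."""
    def score(targets):
        return (sum(targets) ** 2) / len(targets) if targets else 0.0
    parent = left_targets + right_targets
    return score(left_targets) + score(right_targets) - score(parent)

def best_missing_direction(xs, targets, threshold):
    """Try sending all missing values left, then right; keep the larger gain."""
    left = [t for x, t in zip(xs, targets) if x is not None and x <= threshold]
    right = [t for x, t in zip(xs, targets) if x is not None and x > threshold]
    missing = [t for x, t in zip(xs, targets) if x is None]

    gain_left = split_gain(left + missing, right)    # missing values go left
    gain_right = split_gain(left, right + missing)   # missing values go right
    if gain_left >= gain_right:
        return "left", gain_left
    return "right", gain_right

# Rows with a missing attribute value have targets resembling the left branch.
xs = [1.0, 2.0, None, 8.0, 9.0, None]
targets = [-1.0, -1.0, -0.9, 1.0, 1.0, -1.1]
direction, gain = best_missing_direction(xs, targets, threshold=5.0)
print(direction, gain)
```

In this example the rows with missing values have targets close to those of the left branch, so sending them left gives the larger gain.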

Sampling

This node supports row sampling (bagging) and attribute sampling (attribute bagging), similar to the Random Forest and Tree Ensemble nodes. Gradient boosting with sampling is usually referred to as Stochastic Gradient Boosted Trees. The respective settings can be found in the Advanced Options tab.
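As a rough sketch of what the two kinds of sampling mean, the following draws a fresh subset of rows and attributes for each tree. It assumes sampling without replacement and a fixed number of attributes per tree; the options the node actually offers are in the Advanced Options tab and may differ. The function names and the example attribute names are hypothetical.

```python
import random

def sample_rows(n_rows, fraction, rng):
    """Draw a fraction of the row indices without replacement."""
    k = max(1, int(round(fraction * n_rows)))
    return sorted(rng.sample(range(n_rows), k))

def sample_attributes(attribute_names, n_attributes, rng):
    """Draw a subset of attribute names without replacement."""
    k = min(n_attributes, len(attribute_names))
    return rng.sample(attribute_names, k)

rng = random.Random(42)  # fixed seed so that runs are reproducible
attributes = ["age", "income", "city", "score"]

# Each tree in the ensemble would be trained on its own sample.
for tree_index in range(3):
    rows = sample_rows(n_rows=10, fraction=0.7, rng=rng)
    attrs = sample_attributes(attributes, n_attributes=2, rng=rng)
    print(tree_index, rows, attrs)
```

Training each tree on a different sample adds randomness to the ensemble, which is the "stochastic" part of the name.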

Input Ports

  1. Type: Data
    The data to learn from. It must contain at least one nominal target column and either a fingerprint (bit/byte/double vector) column or another numeric or nominal column.

Output Ports

  1. Type: Gradient Boosting Model
    The trained model.

Extension

This node is part of the extension KNIME Core (v4.0.0).
