TD_DecisionForest is an ensemble algorithm that is widely used for classification and regression predictive modeling problems. It extends bootstrap aggregation (bagging) of decision trees. In bagging, a number of decision trees are built, each from a different bootstrap sample of the training dataset. A bootstrap sample is drawn with replacement, so the same row may appear in it more than once. In addition to bagging, the algorithm selects a random subset of input features (columns or variables) at each split point while constructing a tree. Constructing a decision tree normally involves evaluating every input variable in the data to select a split point. Restricting each split point to a random subset of features forces the trees in the ensemble to differ more from one another.

The TD_DecisionForest function uses a training dataset to create a predictive model. You can input the model to the TD_DecisionForestPredict function, which uses it to make predictions. For a regression problem, the prediction is the average of the predictions across the trees in the ensemble. For a classification problem, the prediction is the majority vote for the class label across the trees in the ensemble.
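The three ideas described above (bootstrap sampling, random feature subsets at a split point, and aggregating per-tree predictions) can be sketched in plain Python. This is only an illustration of the general technique, not the TD_DecisionForest implementation: the helper names (`bootstrap_sample`, `random_feature_subset`, `aggregate_regression`, `aggregate_classification`), the sample rows, and the feature names are all hypothetical.

```python
import random
from collections import Counter
from statistics import mean


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement, so a row may appear more than once."""
    return [rng.choice(rows) for _ in rows]


def random_feature_subset(features, k, rng):
    """Pick k distinct candidate features to consider at one split point."""
    return rng.sample(features, k)


def aggregate_regression(tree_predictions):
    """Regression: average the predictions of the individual trees."""
    return mean(tree_predictions)


def aggregate_classification(tree_predictions):
    """Classification: return the class label predicted by the most trees.

    In this sketch, ties go to the label that was encountered first.
    """
    return Counter(tree_predictions).most_common(1)[0][0]


# Hypothetical training rows and feature names, for illustration only.
rng = random.Random(0)
rows = [(5.1, 3.5, "a"), (4.9, 3.0, "a"), (6.2, 2.9, "b"), (5.9, 3.2, "b")]
features = ["f1", "f2", "f3", "f4"]

sample = bootstrap_sample(rows, rng)                   # training data for one tree
candidates = random_feature_subset(features, 2, rng)   # features tried at one split
```

Each tree in an ensemble would get its own `bootstrap_sample`, and a fresh `random_feature_subset` would be drawn at every split point. The two `aggregate_*` helpers mirror the averaging and majority-vote rules described above.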
- Type: DB Session; Teradata Connection: Connection to a Teradata Database Instance.
- Type: Table; inputtable: Specifies the table containing the input data.