This node applies an Isolation Forest MOJO to an incoming Spark DataFrame/RDD to detect anomalies (outliers). The output consists of the input data plus, depending on the settings, one or two appended columns. The first is the prediction, which contains the normalized anomaly score: the higher the score, the more likely the observation is an anomaly. The second, optional column contains the mean length of the decision tree paths across the forest for each observation: the shorter the mean path, the more likely the observation is an anomaly.
- Type: MOJO. MOJO (AnomalyDetection): The MOJO. Its model category must be anomaly detection.
- Type: Spark Data. Input Spark DataFrame/RDD: Spark DataFrame/RDD for prediction. Missing values will be treated as NA.
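
The two appended columns point in opposite directions: a short mean path length corresponds to a high normalized score. The following Python sketch illustrates that relationship with a simple min-max normalization. It is an assumption made for illustration, not the node's confirmed internal formula; the function name `normalized_scores` and the `min_len`/`max_len` bounds are hypothetical.

```python
from typing import List


def normalized_scores(mean_lengths: List[float],
                      min_len: float,
                      max_len: float) -> List[float]:
    """Map mean path lengths to scores in [0, 1].

    Assumption (illustrative only): the score is a min-max
    normalization in which the shortest path gets score 1.0 and
    the longest path gets score 0.0.
    """
    if max_len <= min_len:
        raise ValueError("max_len must be greater than min_len")
    span = max_len - min_len
    # Shorter path -> larger numerator -> higher anomaly score.
    return [(max_len - length) / span for length in mean_lengths]


# Hypothetical mean path lengths for three observations.
lengths = [2.0, 5.0, 8.0]
scores = normalized_scores(lengths, min_len=2.0, max_len=8.0)
for length, score in zip(lengths, scores):
    print(f"mean length {length:.1f} -> score {score:.2f}")
```

In this sketch, the observation isolated after the shortest mean path receives the highest score, matching the interpretation of both columns given in the description.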