
Spark Random Forest Learner


A random forest* is an ensemble of decision trees. Learning a random forest model means training a set of independent decision trees in parallel. This node uses the spark.ml random forest implementation to train a classification model in Spark. The target column must be nominal, whereas the feature columns can be either nominal or numerical.
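
The ensemble idea described above can be illustrated with a minimal pure-Python sketch. This is not the spark.ml implementation: each "tree" here is a one-split stump trained on a bootstrap sample using one randomly chosen numeric feature, and the forest predicts by majority vote. All names (`train_stump`, `train_forest`, `predict_forest`, `ROWS`, `LABELS`) and the toy data are assumptions made for illustration.

```python
import random
from collections import Counter


def majority(labels):
    """Return the most frequent label (ties resolved by first occurrence)."""
    return Counter(labels).most_common(1)[0][0]


def train_stump(rows, labels, rng):
    """Train a one-split 'tree' on a bootstrap sample and one random feature."""
    n = len(rows)
    sample = [rng.randrange(n) for _ in range(n)]   # draw rows with replacement
    feature = rng.randrange(len(rows[0]))           # pick one feature at random
    xs = [rows[i][feature] for i in sample]
    ys = [labels[i] for i in sample]
    fallback = majority(ys)
    best = (len(ys) + 1, None, fallback, fallback)  # (errors, threshold, left, right)
    distinct = sorted(set(xs))
    # Try every midpoint between neighbouring values; keep the split with fewest errors.
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x < threshold]
        right = [y for x, y in zip(xs, ys) if x >= threshold]
        l_label, r_label = majority(left), majority(right)
        errors = sum(y != l_label for y in left) + sum(y != r_label for y in right)
        if errors < best[0]:
            best = (errors, threshold, l_label, r_label)
    _, threshold, l_label, r_label = best
    return feature, threshold, l_label, r_label


def predict_stump(stump, row):
    feature, threshold, l_label, r_label = stump
    if threshold is None or row[feature] < threshold:
        return l_label
    return r_label


def train_forest(rows, labels, n_trees=25, seed=42):
    """Train the stumps one after another; none depends on any other,
    which is what makes parallel training possible in a real implementation."""
    rng = random.Random(seed)
    return [train_stump(rows, labels, rng) for _ in range(n_trees)]


def predict_forest(forest, row):
    """Combine the individual predictions by majority vote."""
    return majority(predict_stump(stump, row) for stump in forest)


# Toy training data (hypothetical): two numeric features, one nominal target.
ROWS = [(1.0, 1.2), (0.8, 0.9), (1.1, 1.0), (0.9, 1.1),
        (5.0, 5.2), (4.8, 5.1), (5.1, 4.9), (5.2, 5.0)]
LABELS = ["a", "a", "a", "a", "b", "b", "b", "b"]

forest = train_forest(ROWS, LABELS)
```

Calling `predict_forest(forest, (1.0, 1.0))` then returns the label most stumps agree on. The sketch only handles numerical features; the node itself also accepts nominal feature columns.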

Use the Spark Predictor (Classification) node to apply the learned model to unseen data.

Please refer to the Spark documentation for a full description of the underlying algorithm.

This node requires at least Apache Spark 2.0.

(*) RANDOM FORESTS is a registered trademark of Minitab, LLC and is used with Minitab’s permission.

Node details

Input ports
  1. Type: Spark Data
    Input data
    Input Spark DataFrame with training data.
Output ports
  1. Type: Table
    Feature importance measures
    Table with estimates of the importance of each feature. The features are listed in order of decreasing importance, and the importance values are normalized to sum to 1.
  2. Type: Spark ML Model
    Spark ML Random Forest model (classification)
    The trained Spark ML random forest classification model, which can be applied with the Spark Predictor (Classification) node.
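
The layout of the feature importance table (values normalized to sum to 1, rows in order of decreasing importance) can be sketched in plain Python as follows. The raw scores, the feature names, and the helper `importance_table` are hypothetical; the node computes the actual importance estimates in Spark.

```python
def importance_table(raw_scores):
    """Normalize raw importance scores to sum to 1 and sort them in decreasing order."""
    total = sum(raw_scores.values())
    if total <= 0:
        raise ValueError("raw importance scores must have a positive sum")
    normalized = {name: score / total for name, score in raw_scores.items()}
    # Most important feature first.
    return sorted(normalized.items(), key=lambda item: item[1], reverse=True)


# Hypothetical raw scores for three feature columns.
table = importance_table({"age": 2.0, "income": 6.0, "city": 1.0})
for name, importance in table:
    print(f"{name}: {importance:.3f}")
```

Because the values are normalized, each one can be read as that feature's share of the total importance.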

© 2025 KNIME AG. All rights reserved.