Spark Correlation Matrix


This node computes the correlation matrix for the selected input columns using the MLlib Statistics package.
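
As a rough illustration of what a correlation matrix contains, the following is a minimal plain-Python sketch. It is not the node's implementation: the node delegates to MLlib on the Spark cluster, whereas this runs locally on small lists. It assumes the Pearson method (MLlib's Statistics.corr also supports Spearman), and the names pearson, correlation_matrix, and data are hypothetical, chosen only for this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined for a constant column.
        return float("nan")
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Square matrix (dict of dicts) with one row and one column per input column."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical sample data standing in for the selected input columns.
data = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.0],
    "c": [4.0, 3.0, 2.0, 1.0],
}
matrix = correlation_matrix(data)
```

The result is square and symmetric, with one row and one column per selected input column. In this sample, "b" is a positive multiple of "a" and "c" runs in the opposite direction, so the corresponding entries sit at the two extremes of the correlation range.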

Input Ports

  1. Type: Spark Data. Spark DataFrame/RDD for which the correlation matrix is computed.

Output Ports

  1. Type: Spark Data. Spark DataFrame/RDD representing the correlation matrix.
  2. Type: Data. The correlation values of the selected columns as a square matrix.
  3. Type: Correlation. A model containing the correlation measures. This model can be read by the Spark Correlation Filter node.

Find here

Tools & Services > Apache Spark > Statistics

Make sure to have this extension installed:

KNIME Extension for Apache Spark

Update site for KNIME Analytics Platform 3.7:
KNIME Analytics Platform 3.7 Update Site
