Spark Statistics


This node computes summary statistics for the selected input columns using the MLlib Statistics package.

Computed statistics:

  • Minimum value
  • Maximum value
  • Sample mean
  • Sample variance
  • L1 norm
  • L2 norm
  • Number of nonzero elements
  • Number of zero elements
  • Row count
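
To make the definitions above concrete, here is a minimal plain-Python sketch of the same statistics for a single numeric column. It does not use Spark and is not the node's implementation; the function name `column_statistics` and the sample data are hypothetical. In Spark itself, MLlib's `Statistics.colStats` returns a summary exposing comparable values (`min`, `max`, `mean`, `variance`, `normL1`, `normL2`, `numNonzeros`, `count`). Deriving the number of zeros as row count minus nonzero count is an assumption made here for illustration.

```python
import math


def column_statistics(values):
    """Compute the statistics listed above for one numeric column (illustrative only)."""
    n = len(values)
    if n == 0:
        raise ValueError("column must contain at least one value")

    mean = sum(values) / n
    # Sample variance uses the n - 1 denominator. For a single row this sketch
    # reports 0.0, which is an assumption rather than documented node behavior.
    variance = sum((v - mean) ** 2 for v in values) / (n - 1) if n > 1 else 0.0
    nonzeros = sum(1 for v in values if v != 0)

    return {
        "min": min(values),
        "max": max(values),
        "mean": mean,
        "variance": variance,
        "l1_norm": sum(abs(v) for v in values),            # sum of absolute values
        "l2_norm": math.sqrt(sum(v * v for v in values)),  # Euclidean norm
        "nonzeros": nonzeros,
        "zeros": n - nonzeros,  # assumption: zeros = row count - nonzeros
        "count": n,
    }


# Hypothetical sample column
stats = column_statistics([3.0, 0.0, -4.0, 0.0, 1.0])
for name, value in stats.items():
    print(f"{name}: {value}")
```

The node reports these values per selected column, so a multi-column input would correspond to calling such a function once per column.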

Input Ports

  1. Type: Spark Data. Spark DataFrame/RDD to compute statistics for.

Output Ports

  1. Type: Data. Table with numeric values.

Find here

Tools & Services > Apache Spark > Statistics

Make sure to have this extension installed:

KNIME Extension for Apache Spark

Update site for KNIME Analytics Platform 3.7:
KNIME Analytics Platform 3.7 Update Site