Node / Manipulator

Spark PCA

Tools & Services / Apache Spark / Mining / Dimensionality Reduction

This node performs a principal component analysis (PCA) on the given data using the Apache Spark implementation. The input data is projected from its original feature space into a space of (possibly) lower dimension with minimal loss of information.
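To make the projection step concrete, here is a minimal plain-Python sketch of PCA on a tiny in-memory table. It is not the Spark implementation and not the node's code. The function names (`covariance_matrix`, `top_components`, `project`), the use of power iteration, and the choice to center the data before projecting are all assumptions made for illustration; the Spark implementation may differ in these details.

```python
import math

def covariance_matrix(rows):
    """Return (sample covariance matrix, column means) for a list of equal-length rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for r in rows:
        for i in range(d):
            for j in range(d):
                cov[i][j] += (r[i] - means[i]) * (r[j] - means[j])
    return [[c / (n - 1) for c in row] for row in cov], means

def top_components(cov, k, iterations=200):
    """Approximate the k leading eigenvectors by power iteration with deflation.

    Illustrative only: assumes the start vector is not orthogonal to the
    eigenvector being sought.
    """
    d = len(cov)
    work = [row[:] for row in cov]
    components = []
    for _ in range(k):
        v = [1.0 / math.sqrt(d)] * d
        for _ in range(iterations):
            w = [sum(work[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue that belongs to v.
        eigenvalue = sum(
            v[i] * sum(work[i][j] * v[j] for j in range(d)) for i in range(d)
        )
        components.append(v)
        # Deflate: remove the found direction so the next pass finds the next one.
        for i in range(d):
            for j in range(d):
                work[i][j] -= eigenvalue * v[i] * v[j]
    return components

def project(rows, components, means):
    """Project centered rows onto the given components (one output value per component)."""
    return [
        [sum((r[j] - means[j]) * c[j] for j in range(len(r))) for c in components]
        for r in rows
    ]

# Two perfectly correlated features (y = 2x) reduced to one dimension.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
cov, means = covariance_matrix(data)
components = top_components(cov, k=1)
projected = project(data, components, means)
```

Because the two features in `data` are perfectly correlated, a single component captures all of the variance, so reducing from two dimensions to one loses no information in this toy case.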

Node details

Input ports
  1. Type: Spark Data
    Spark DataFrame/RDD
    Input Spark DataFrame/RDD
Output ports
  1. Type: Spark Data
    Projected Input DataFrame/RDD
    The input DataFrame/RDD projected onto the principal components. Input columns that were not included in the principal component analysis are retained.
  2. Type: Spark Data
    Principal Component Matrix
    A DataFrame/RDD with the principal components matrix.
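The relationship between the two outputs can be sketched in plain Python, under stated assumptions. The helper `project_table`, the output column names `pc_0`, `pc_1`, ..., and the hand-picked matrix are hypothetical and not taken from the node. The sketch also assumes that the columns included in the PCA are replaced by the projected columns and that the principal component matrix has one row per input feature and one column per component.

```python
def project_table(rows, pca_columns, pc_matrix):
    """Project the selected columns of each row and keep all other columns.

    rows        -- list of dicts, one per table row
    pca_columns -- names of the columns included in the PCA
    pc_matrix   -- list of lists, shape (len(pca_columns), k); assumed layout
    """
    k = len(pc_matrix[0])
    result = []
    for row in rows:
        features = [row[name] for name in pca_columns]
        # Columns that were not part of the PCA are carried over unchanged.
        new_row = {name: value for name, value in row.items()
                   if name not in pca_columns}
        for comp in range(k):
            # Hypothetical output column name, chosen for this sketch only.
            new_row[f"pc_{comp}"] = sum(
                features[i] * pc_matrix[i][comp] for i in range(len(features))
            )
        result.append(new_row)
    return result

# Hypothetical input: "id" is not part of the PCA, "x" and "y" are.
table = [
    {"id": "a", "x": 1.0, "y": 2.0},
    {"id": "b", "x": 3.0, "y": 4.0},
]
pc_matrix = [[0.6], [0.8]]  # one component, picked by hand for illustration
projected_table = project_table(table, ["x", "y"], pc_matrix)
```

In this sketch each output row keeps its `id` value and gains one `pc_0` value in place of `x` and `y`, mirroring the description of the first output port.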

