Spark Entropy Scorer

Scores clustering results against a reference clustering. Connect a Spark DataFrame/RDD that contains one column with the reference cluster IDs and one column with the clustering results to the input port. The respective columns can be selected in the dialog. After successful execution, the view shows entropy values (the smaller, the better) and a quality value in [0,1], where 1 is the best possible value, as used in Fuzzy Clustering in Parallel Universes, section 6: "Experimental results".
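The description above does not spell out the formulas, so the following is only a minimal sketch of one common way to compute such scores, written in plain Python rather than Spark. It assumes that the entropy of a cluster is the Shannon entropy (base 2) of the reference labels inside it, that the overall entropy is the size-weighted average over clusters, and that the quality is one minus the overall entropy divided by log2 of the number of reference classes. The function names `cluster_entropies` and `score` are hypothetical and are not part of the node.

```python
from collections import Counter, defaultdict
from math import log2


def cluster_entropies(reference, predicted):
    """Return {cluster_id: (size, entropy)} for each predicted cluster.

    Assumption: entropy is the base-2 Shannon entropy of the reference
    labels that fall into the cluster.
    """
    members = defaultdict(Counter)
    for ref, clu in zip(reference, predicted):
        members[clu][ref] += 1

    result = {}
    for clu, counts in members.items():
        size = sum(counts.values())
        # p * log2(1/p) summed over the reference labels in this cluster
        entropy = sum((c / size) * log2(size / c) for c in counts.values())
        result[clu] = (size, entropy)
    return result


def score(reference, predicted):
    """Return (overall_entropy, quality) under the assumptions stated above."""
    per_cluster = cluster_entropies(reference, predicted)
    total = len(reference)
    n_classes = len(set(reference))

    # Size-weighted average of the per-cluster entropies.
    overall = sum(size / total * h for size, h in per_cluster.values())

    # Normalize by the largest possible entropy so quality lies in [0, 1].
    max_entropy = log2(n_classes) if n_classes > 1 else 1.0
    quality = 1.0 - overall / max_entropy
    return overall, quality


reference = ["a", "a", "b", "b"]
print(score(reference, [0, 0, 1, 1]))  # clusters match the reference exactly
print(score(reference, [0, 1, 0, 1]))  # every cluster mixes both classes evenly
```

Under these assumptions, a clustering that reproduces the reference has entropy 0 and quality 1, while clusters that mix the reference classes evenly push the entropy up and the quality toward 0, which matches the "smaller is better" reading of the entropy values.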

Node details

Input ports
  1. Type: Spark Data
    Spark DataFrame/RDD
    Arbitrary input Spark DataFrame/RDD with at least two columns: one containing the reference clustering and one containing the clustering to be scored.
Output ports
  1. Type: Table
    Quality Table
    Table containing entropy values for each cluster. The last row contains statistics on the entire clustering. It corresponds to the table shown in the Statistics View.
