Topic Scorer (Labs)

This component computes several metrics for topics created by the Topic Extractor (Parallel LDA) node and the Topic Extractor (STM) component. Given a table of pre-processed documents and a table of weighted terms for each topic, it can compute the metrics listed below. You can provide the topics of a single model or of multiple models. The example workflows at the bottom of this page show how to concatenate topics from different models trained on the same corpus of documents, or how to add a ‘model ID’ to the output of the Topic Extractor (Parallel LDA) node.

DISCLAIMER: this verified component is currently marked as part of KNIME Labs (knime.com/knime-labs). Provide feedback at upskilling@knime.com

Topic Semantic Coherence score: This component calculates a semantic coherence score for each topic. Semantic coherence measures how coherent a topic is by checking whether its top terms tend to appear together in the same documents. This experimental implementation is based on the paper by Mimno et al. (2011) [dl.acm.org/doi/10.5555/2145432.2145462].

Topic Exclusivity score: This component calculates the exclusivity of topics. Exclusivity is computed using an experimental implementation of the FREX function by Bischof and Airoldi (2012) [dl.acm.org/doi/10.5555/3042573.3042578]. FREX takes into consideration not only how exclusive (unique) terms are across different topics (top terms table), but also how rare those terms are in documents of the same topic (documents table). When multiple models are compared, different models can assign the same document to different topics, so exclusivity can be computed only from how unique terms are in the topics' top terms table. Read more in the description of the “Ignore Assigned Topic Column” setting.

Topic Neighbor Distance score: This component computes an experimental distance between topics within the same model or across several models. To do this, the top terms by topic table is pivoted so that each topic is represented by a normalized vector of term weights. A cosine distance is then computed between topic vectors. For each topic, the distance is used to show the closest and the farthest topic within one model or across several models.
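As an illustration of the Topic Semantic Coherence score described above, the following minimal Python sketch follows the coherence formula from Mimno et al. (2011): for each ordered pair of top terms it adds log((D(v_m, v_l) + 1) / D(v_l)), where D counts documents. It is not the component's actual implementation. The function name `semantic_coherence`, the representation of documents as sets of terms, and the toy corpus are all assumptions made for this example.

```python
import math


def semantic_coherence(top_terms, documents):
    """Coherence of one topic in the style of Mimno et al. (2011).

    top_terms: terms ordered from highest to lowest weight (hypothetical input).
    documents: list of sets, one set of terms per pre-processed document.
    """
    # D(v): number of documents that contain term v.
    doc_freq = {t: sum(1 for d in documents if t in d) for t in top_terms}
    score = 0.0
    for m in range(1, len(top_terms)):
        for l in range(m):
            v_m, v_l = top_terms[m], top_terms[l]
            if doc_freq[v_l] == 0:
                # Assumption: terms absent from the corpus are skipped.
                continue
            # D(v_m, v_l): documents that contain both terms.
            co_freq = sum(1 for d in documents if v_m in d and v_l in d)
            score += math.log((co_freq + 1) / doc_freq[v_l])
    return score


# Toy corpus (invented for illustration only).
docs = [
    {"cat", "dog"},
    {"cat", "dog", "fish"},
    {"car", "road"},
    {"car", "road", "fuel"},
]
coherent = semantic_coherence(["cat", "dog"], docs)  # terms co-occur
mixed = semantic_coherence(["cat", "car"], docs)     # terms never co-occur
print(coherent, mixed)
```

In this toy corpus the topic whose terms always co-occur scores higher than the topic whose terms never share a document, which is the behavior the metric is meant to capture.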
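The Topic Exclusivity score described above can be illustrated with a FREX-style sketch that uses only the top terms table, which corresponds to the multi-model case where the documents table is ignored. The sketch combines two within-topic rank scores with a weighted harmonic mean: one for how exclusive a term is to the topic, and one for how heavily the topic weights it. This is an assumption-laden illustration, not the component's implementation: the function names, the nested dictionary layout, the weight parameter `w`, and the example weights are all hypothetical, and all term weights are assumed to be positive.

```python
def ecdf_scores(values):
    """Map each key to the fraction of values that are <= its own value."""
    n = len(values)
    return {
        key: sum(1 for other in values.values() if other <= value) / n
        for key, value in values.items()
    }


def frex_scores(topic_term_weights, topic, w=0.5):
    """FREX-style score per term of one topic (sketch, positive weights assumed).

    topic_term_weights: {topic_id: {term: weight}} (hypothetical layout).
    w: weight given to exclusivity versus frequency (assumed parameter).
    """
    weights = topic_term_weights[topic]
    # Exclusivity: share of a term's total weight that belongs to this topic.
    exclusivity = {}
    for term, weight in weights.items():
        total = sum(tw.get(term, 0.0) for tw in topic_term_weights.values())
        exclusivity[term] = weight / total
    excl_rank = ecdf_scores(exclusivity)
    freq_rank = ecdf_scores(weights)
    # Weighted harmonic mean of the two rank scores.
    return {
        term: 1.0 / (w / excl_rank[term] + (1.0 - w) / freq_rank[term])
        for term in weights
    }


# Invented example: "data" is shared by both topics, so it is less exclusive.
topics = {
    "topic_0": {"cat": 0.5, "dog": 0.3, "data": 0.2},
    "topic_1": {"car": 0.5, "road": 0.3, "data": 0.2},
}
frex = frex_scores(topics, "topic_0")
print(sorted(frex, key=frex.get, reverse=True))
```

The harmonic mean penalizes a term that does poorly on either criterion, so a term shared across topics ranks low even when its weight within the topic is not negligible.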
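The Topic Neighbor Distance steps above (pivot the top terms table, normalize each topic vector, compute cosine distances, pick the closest and farthest topic) can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions and not the component's workflow: the row layout `(topic_id, term, weight)`, the function names, and the topic IDs with a model prefix are hypothetical, and every topic is assumed to have at least one positive weight.

```python
import math


def topic_vectors(rows):
    """Pivot (topic_id, term, weight) rows into unit-length topic vectors."""
    vocab = sorted({term for _, term, _ in rows})
    index = {term: i for i, term in enumerate(vocab)}
    raw = {}
    for topic, term, weight in rows:
        raw.setdefault(topic, [0.0] * len(vocab))[index[term]] += weight
    vectors = {}
    for topic, vec in raw.items():
        norm = math.sqrt(sum(x * x for x in vec))
        vectors[topic] = [x / norm for x in vec]
    return vectors


def cosine_distance(a, b):
    """Cosine distance between two vectors that are already unit length."""
    return 1.0 - sum(x * y for x, y in zip(a, b))


def neighbors(vectors):
    """For each topic, return (closest topic, farthest topic)."""
    result = {}
    for topic, vec in vectors.items():
        dists = {
            other: cosine_distance(vec, other_vec)
            for other, other_vec in vectors.items()
            if other != topic
        }
        result[topic] = (min(dists, key=dists.get), max(dists, key=dists.get))
    return result


# Invented top terms from two hypothetical models, "A" and "B".
rows = [
    ("A_0", "cat", 0.6), ("A_0", "dog", 0.4),
    ("A_1", "car", 0.7), ("A_1", "road", 0.3),
    ("B_0", "cat", 0.5), ("B_0", "dog", 0.3),
    ("B_0", "fish", 0.1), ("B_0", "road", 0.1),
]
vectors = topic_vectors(rows)
nearest = neighbors(vectors)
print(nearest)
```

Because the vectors are normalized first, the cosine distance reduces to one minus a dot product, and topics that share no terms sit at the maximum distance of 1.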

Component details

KNIME Base nodes

KNIME Data Generation

KNIME Distance Matrix

KNIME Javasnippet

KNIME Math Expression (JEP)

KNIME Quick Forms

KNIME Textprocessing

KNIME Timeseries nodes
