The Topic Scorer (Labs) verified component computes experimental scores for the semantic coherence, exclusivity, and similarity/distance of the topics of one or more models.
Trends in these scores as K (the number of topics) increases can be used to decide on the optimal K. The choice is usually a tradeoff between these metrics. Read more in the description of the Verified Component.
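To make the coherence score more concrete, here is a minimal Python sketch of the semantic coherence measure defined by Mimno et al. (2011), listed under External resources. It is not the component's actual implementation, which is not shown here. The function name `umass_coherence`, its parameters, and the toy corpus are assumptions made for illustration only.

```python
import math


def umass_coherence(top_words, documents):
    """Semantic coherence of one topic, following Mimno et al. (2011).

    top_words: the topic's top words, most probable first (hypothetical input).
    documents: the corpus as a list of token lists (hypothetical input).
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain all of the given words.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            d_l = doc_freq(top_words[l])
            if d_l == 0:
                # Assumption: skip words that never occur in the corpus
                # to avoid dividing by zero.
                continue
            co_freq = doc_freq(top_words[m], top_words[l])
            # The +1 smoothing keeps the logarithm defined when the
            # two words never co-occur.
            score += math.log((co_freq + 1) / d_l)
    return score


# Toy corpus, for illustration only.
docs = [["cat", "dog"], ["cat", "dog"], ["cat", "fish"]]
coherent = umass_coherence(["cat", "dog"], docs)
incoherent = umass_coherence(["dog", "fish"], docs)
```

Scores closer to zero indicate words that co-occur more often. To choose K, one would average such a score over the topics of each candidate model and compare the trend across values of K, alongside exclusivity and topic distance.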
Workflow
Topic Modeling with LDA: Optimizing K via a Verified Component
External resources
- Optimizing semantic coherence in topic models - Mimno et al. (2011), Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing
- Summarizing topical content with word frequency and exclusivity - Bischof and Airoldi (2012), Proceedings of the 29th International Conference on Machine Learning
- Topic Scorer (Labs) - KNIME Community Hub
- Verified Components project - knime.com
Used extensions & nodes
Created with KNIME Analytics Platform version 4.7.4