This workflow applies the Topic Extractor (Parallel LDA) node to detect 10 topics and describe each of them with 5 keywords. LDA (Latent Dirichlet Allocation) is a generative probabilistic model, used here as an unsupervised algorithm, that discovers n topics and describes each one by its m most relevant keywords. It represents documents as random mixtures over latent topics, where each topic is characterized by a distribution over words (Blei, Ng and Jordan, 2003). In KNIME Analytics Platform, LDA is implemented by the Topic Extractor (Parallel LDA) node, available in the Text Processing extension. The workflow as a whole covers model training: in addition to the Topic Extractor (Parallel LDA) node, it includes steps for importing, cleaning up, and transforming the data.
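To make the idea of "n topics described by m keywords" concrete, the following is a minimal sketch of LDA fitted with a plain collapsed Gibbs sampler in pure Python. It is not the implementation used by the Topic Extractor (Parallel LDA) node; the function name `lda_gibbs`, its parameters (`alpha`, `beta`, `n_iter`, `seed`), and the toy documents are all assumptions made for illustration.

```python
import random
from collections import Counter


def lda_gibbs(docs, n_topics, n_keywords, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Toy collapsed Gibbs sampler for LDA (hypothetical helper, illustration only).

    docs is a list of token lists. Returns (keywords per topic, topic mixture per document).
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    # Count tables: topic counts per document, word counts per topic, token totals per topic.
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics
    # Start from a random topic assignment for every token.
    assign = []
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample: weight of topic k is (topic share in doc) * (word share in topic).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    # Each topic is summarized by its most frequent words (the "m keywords").
    keywords = [
        [w for w, c in topic_word[k].most_common(n_keywords) if c > 0]
        for k in range(n_topics)
    ]
    # Each document is a smoothed mixture over topics that sums to 1.
    mixtures = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in doc_topic[d]]
        for d, doc in enumerate(docs)
    ]
    return keywords, mixtures


# Made-up toy corpus; a real run would use the cleaned, tokenized documents.
docs = [
    "cat dog pet vet cat dog".split(),
    "dog pet leash walk dog".split(),
    "stock market trade price stock".split(),
    "price market bank trade bank".split(),
]
keywords, mixtures = lda_gibbs(docs, n_topics=2, n_keywords=3)
for k, words in enumerate(keywords):
    print(f"topic {k}: {words}")
```

With the settings described in this workflow, the equivalent call would use `n_topics=10` and `n_keywords=5`. The per-document mixtures correspond to the "random mixtures over latent topics" in the description, and the per-topic keyword lists to the topic descriptions.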
Used extensions & nodes
Created with KNIME Analytics Platform version 4.5.0