After pre-processing and cleaning the text in the documents, we can now create their bag of words.
All nodes preceding the Bag of Words part have been encapsulated in components to make the workflow more readable.
Comparing terms and lemmatized terms. Here we have two bags of words: one is created directly from the text in the documents, while the second is created from the lemmatized version of the same documents. The original terms and the corresponding lemmatized terms are then joined together.
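Outside of KNIME, the same comparison can be sketched in a few lines of Python. This is only an illustration of the idea, not the workflow's actual implementation: the `TOY_LEMMAS` lookup is a hypothetical stand-in for a real lemmatizer, and the function names `tokenize`, `lemmatize`, and `compare_terms` are invented for this example.

```python
import re
from collections import Counter

# Hypothetical toy lemma lookup (assumption for illustration only);
# a real pipeline would use a proper lemmatizer instead.
TOY_LEMMAS = {"cats": "cat", "mats": "mat", "sat": "sit", "sitting": "sit"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def lemmatize(token):
    """Return the lemma from the toy lookup, or the token itself if unknown."""
    return TOY_LEMMAS.get(token, token)


def compare_terms(text):
    """Build both bags of words for one document and join terms to lemmas."""
    tokens = tokenize(text)
    # Bag of words built directly from the document text.
    original_bag = Counter(tokens)
    # Bag of words built from the lemmatized tokens of the same document.
    lemma_bag = Counter(lemmatize(t) for t in tokens)
    # Join each original term with its corresponding lemmatized term.
    joined = sorted((term, lemmatize(term)) for term in original_bag)
    return original_bag, lemma_bag, joined


doc = "The cats sat on the mats. A cat is sitting on a mat."
original_bag, lemma_bag, joined = compare_terms(doc)

for term, lemma in joined:
    print(f"{term:10s} -> {lemma:10s} "
          f"(term count: {original_bag[term]}, lemma count: {lemma_bag[lemma]})")
```

Because several surface forms collapse onto one lemma, the lemmatized bag usually holds fewer distinct entries than the original one, which is what the joined view makes visible.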
Workflow
04_Bags of words and terms
External resources
Used extensions & nodes
Created with KNIME Analytics Platform version 4.5.0
Legal
By using or downloading the workflow, you agree to our terms and conditions.