Solution to an exercise from the L4-TP self-paced course. Create a bag of words of a document, then calculate document frequencies (DF), term frequencies (TF), inverse document frequencies (IDF), and TF-IDF scores.
CHECK YOUR ANSWERS:
- The word "text" occurs 4 times in the agenda of the L4-TP course, more often than any other word
- 28 words occur in both agendas
- The words with the highest TF-IDF scores are time (0.038), series (0.038), and text (0.025)
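For readers who want to see how these quantities relate outside of KNIME, the steps can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the workflow's implementation: the two short texts are made-up placeholders rather than the course agendas, the function names `bag_of_words` and `tf_idf` are hypothetical, TF is taken as the relative frequency within a document, and IDF as log10(N / DF). The KNIME nodes offer their own TF and IDF variants, so this sketch is not expected to reproduce the scores listed above.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split on whitespace, strip surrounding punctuation, count terms."""
    tokens = (tok.strip(".,;:!?()\"'") for tok in text.lower().split())
    return Counter(tok for tok in tokens if tok)


def tf_idf(docs):
    """Return bags of words, DF, IDF, and TF-IDF for a dict of {name: text}."""
    bags = {name: bag_of_words(text) for name, text in docs.items()}
    n_docs = len(bags)

    # DF: number of documents that contain each term.
    df = Counter()
    for bag in bags.values():
        df.update(bag.keys())

    # IDF (assumed variant): log10(N / DF). A term present in every document gets 0.
    idf = {term: math.log10(n_docs / count) for term, count in df.items()}

    # TF (assumed variant): term count divided by the document's total term count.
    scores = {}
    for name, bag in bags.items():
        total = sum(bag.values())
        scores[name] = {term: (count / total) * idf[term] for term, count in bag.items()}
    return bags, df, idf, scores


# Placeholder documents, not the real course agendas.
docs = {
    "agenda_a": "text mining and text processing",
    "agenda_b": "time series analysis and forecasting",
}
bags, df, idf, scores = tf_idf(docs)

# Terms that occur in both documents have a DF equal to the number of documents.
shared = sorted(term for term, count in df.items() if count == len(docs))
```

Here `bags["agenda_a"].most_common(1)` gives the most frequent term of the first placeholder document, and `shared` lists the terms common to both, which mirrors the first two checks above. With this plain IDF variant, any term that appears in every document scores 0.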
Workflow
05 Bag of Words and Frequencies - Solution
Created with KNIME Analytics Platform version 4.4.0