Term Document Entropy
This node computes the informational entropy of each term in each document. The node requires a bag of words table as input and appends an additional column containing the entropy values to the output table. If a term occurs once in every document, its entropy for each document is 0. Any other combination of frequencies yields an entropy weight between 0 and 1. Note that the computational complexity of the entropy calculation is greater than the number of terms times the number of documents, so for large bag of words input tables the computation can be quite time-consuming.
- Type: Data. The input table, which contains terms and documents.
- Type: Data. The output table with terms, documents, and a corresponding entropy value.
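The behaviour described above (0 for a term that occurs once in every document, otherwise a value between 0 and 1) matches the normalized entropy weight commonly used in log-entropy term weighting. The following is a minimal Python sketch of that standard formula, not the node's confirmed implementation. The function name `entropy_weights` and the `(term, document, count)` row layout are assumptions made for illustration, and the sketch yields one weight per term; how the node derives its per-document values is not specified here.

```python
import math
from collections import defaultdict


def entropy_weights(rows):
    """Return a normalized entropy weight in [0, 1] for each term.

    `rows` is an iterable of (term, document_id, count) tuples, a
    hypothetical stand-in for a bag of words table.

    Assumed formula (standard log-entropy global weight):
        weight(t) = 1 + sum_d p(t, d) * log p(t, d) / log(N)
    where p(t, d) = count(t, d) / total count of t, and N is the
    number of documents.
    """
    rows = list(rows)
    documents = {doc for _, doc, _ in rows}
    n_docs = len(documents)
    if n_docs < 2:
        # log(1) == 0, so the normalization is undefined for one document.
        raise ValueError("at least two documents are required")

    # Collect the frequency of every term in every document.
    counts = defaultdict(lambda: defaultdict(int))
    for term, doc, count in rows:
        counts[term][doc] += count

    weights = {}
    for term, per_doc in counts.items():
        total = sum(per_doc.values())
        # Documents that do not contain the term contribute nothing.
        plogp = sum(
            (c / total) * math.log(c / total)
            for c in per_doc.values()
            if c > 0
        )
        weights[term] = 1.0 + plogp / math.log(n_docs)
    return weights
```

Under this formula, a term spread evenly over all documents receives a weight of approximately 0, and a term confined to a single document receives a weight of 1. Building the per-term, per-document counts is also where the cost grows with the size of the input table.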
Other Data Types > Text Processing > Frequencies
Make sure to have this extension installed:
Update site for KNIME Analytics Platform 3.7:
KNIME Analytics Platform 3.7 Update Site