Here we transform the collection of documents into numerical vectors. The dataset used in this example is the KNIME Forum Dataset.
After the pre-processing phase, the Transformation component computes the relative term frequency of each term.
The input data set is partitioned into a training set and a test set.
The term frequencies from the training set are used to build a vector representation of the distinct terms identified by the bag of words (BoW) with a Document Vector node. The same Document Vector transformation is then applied to the documents in the test set, so that both sets share the same vector space.
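The steps above can be sketched outside KNIME in plain Python. This is only an illustrative analogue, not the implementation of the KNIME nodes: the function names (`relative_tf`, `fit_vocabulary`, `to_vector`) and the toy documents are hypothetical, and tokenization and pre-processing are assumed to have already happened. It shows the vocabulary being fixed from the training set and then reused unchanged for the test set.

```python
from collections import Counter


def relative_tf(tokens):
    """Relative term frequency: count of a term divided by the document length."""
    total = len(tokens)
    if total == 0:
        return {}
    return {term: n / total for term, n in Counter(tokens).items()}


def fit_vocabulary(train_docs):
    """Collect the distinct terms of the training documents in a fixed order."""
    return sorted({term for doc in train_docs for term in doc})


def to_vector(tokens, vocabulary):
    """Map one document onto the vocabulary; terms outside it are ignored."""
    tf = relative_tf(tokens)
    return [tf.get(term, 0.0) for term in vocabulary]


# Toy, already pre-processed documents (made up for illustration).
docs = [
    ["knime", "node", "error"],
    ["workflow", "node", "node", "loop"],
    ["install", "extension", "error"],
    ["python", "node", "script", "error"],
]

# Simple fixed split: first three documents for training, the last for testing.
train_docs, test_docs = docs[:3], docs[3:]

# The vector space is defined by the training set only ...
vocabulary = fit_vocabulary(train_docs)
train_vectors = [to_vector(doc, vocabulary) for doc in train_docs]

# ... and the same transformation is applied to the test set.
test_vectors = [to_vector(doc, vocabulary) for doc in test_docs]
```

In this sketch the test document contains the terms `python` and `script`, which never occur in the training set. They get no column of their own, so the test vectors have exactly the same dimensions as the training vectors.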
Workflow
02_Document Vector Creation
External resources
Used extensions & nodes
Created with KNIME Analytics Platform version 4.5.0
Legal
By using or downloading the workflow, you agree to our terms and conditions.