Here we transform a collection of documents into numerical vectors. The dataset used in this example is the KNIME Forum Dataset. After the pre-processing phase, the relative term frequency of each term is computed inside the Transformation component. The input dataset is partitioned into a training set and a test set. The term frequencies from the training set are used to build a vector representation of the distinct terms identified by the Bag of Words (BoW) with a Document Vector node. The same Document Vector transformation is then applied to the documents in the test set.
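The workflow above is built from KNIME nodes, but the vectorization step can be sketched in plain Python to show what happens conceptually: relative term frequencies are computed per document, the vocabulary of distinct terms is fixed from the training split only, and the same term-to-column mapping is reused for the test split. The tokenizer, the sample documents, and all function names below are illustrative assumptions, not part of the workflow.

```python
# Illustrative sketch (not the KNIME implementation): relative term
# frequency vectors with a vocabulary learned from the training set.
from collections import Counter

def tokenize(doc):
    # Naive whitespace tokenizer; the workflow's pre-processing
    # phase would normally handle punctuation, stop words, etc.
    return doc.lower().split()

def relative_tf(doc):
    # Relative term frequency: term count divided by document length.
    counts = Counter(tokenize(doc))
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}

def to_vector(doc, vocabulary):
    # Terms absent from the training vocabulary are dropped, just as
    # the Document Vector node only emits columns for known terms.
    tf = relative_tf(doc)
    return [tf.get(term, 0.0) for term in vocabulary]

# Hypothetical stand-ins for the forum posts in the real dataset.
train_docs = ["knime forum question about csv reader",
              "document vector node question"]
test_docs = ["new question about the csv reader node"]

# Vocabulary (distinct terms) is built from the training set only.
vocabulary = sorted({t for doc in train_docs for t in tokenize(doc)})

train_vectors = [to_vector(d, vocabulary) for d in train_docs]
test_vectors = [to_vector(d, vocabulary) for d in test_docs]

print(len(vocabulary), train_vectors[0])
```

Note that the test document's vector entries need not sum to one: terms unseen in training (here "new" and "the") contribute mass that the fixed vocabulary cannot represent, which is the expected behaviour when applying a trained transformation to held-out data.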
Created with KNIME Analytics Platform version 4.5.0