This workflow is executed entirely in a streaming fashion. Its aim is to create a vector space from the collection of documents being analyzed, using the Document Vector Hashing node. The node creates document vectors with a fixed number of dimensions using various hashing methods. The workflow starts by reading the data and converting the strings into documents, which are then preprocessed, i.e. filtered and stemmed. All the preprocessing steps take place in the Streaming Pre-processing component. Then a bag of words is created, and finally the documents are transformed into numerical/binary document vectors with the Document Vector Hashing node.
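To illustrate the idea outside KNIME, the following is a minimal Python sketch of hashing documents into fixed-size vectors one document at a time. It is not the node's implementation: the function names (`preprocess`, `bucket`, `hash_vector`, `stream_vectors`), the toy stop-word list, the crude suffix stripping used in place of a real stemmer, and the choice of MD5 as the hash function are all assumptions made for illustration.

```python
import hashlib
import re
from typing import Iterable, Iterator, List

# Toy stop-word list (assumption; a real filter would be much larger).
STOP_WORDS = {"a", "an", "the", "is", "of", "and", "in", "are"}


def preprocess(text: str) -> List[str]:
    """Lowercase, tokenize, drop stop words, and strip a trailing 's'."""
    tokens = re.findall(r"[a-z]+", text.lower())
    kept = [t for t in tokens if t not in STOP_WORDS]
    # Crude stand-in for stemming (assumption, not a real stemmer).
    return [t[:-1] if t.endswith("s") and len(t) > 3 else t for t in kept]


def bucket(term: str, dim: int) -> int:
    """Map a term to one of `dim` positions with a deterministic hash."""
    digest = hashlib.md5(term.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % dim


def hash_vector(terms: Iterable[str], dim: int, binary: bool = True) -> List[int]:
    """Build a fixed-length vector: 0/1 flags if binary, term counts otherwise."""
    vec = [0] * dim
    for term in terms:
        i = bucket(term, dim)
        vec[i] = 1 if binary else vec[i] + 1
    return vec


def stream_vectors(
    docs: Iterable[str], dim: int = 16, binary: bool = True
) -> Iterator[List[int]]:
    """Yield one vector per document without holding the whole corpus in memory."""
    for text in docs:
        yield hash_vector(preprocess(text), dim, binary)


if __name__ == "__main__":
    corpus = ["The cats are sleeping", "A dog chases the cats"]
    for vector in stream_vectors(corpus, dim=16, binary=True):
        print(vector)
```

Because the vector length is fixed in advance and each term's position depends only on its hash, no vocabulary has to be collected from the full corpus first, which is what makes this approach compatible with row-by-row streaming. Distinct terms can collide in the same position; a larger number of dimensions makes collisions less likely.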
Used extensions & nodes
Created with KNIME Analytics Platform version 4.5.0