Simple Document Classification using Word Vectors

Deeplearning Machine learning Word2vec Doc2vec Word vectors

This example shows how to transform a document into a vector using a word vector model and how to use these vectors for classification.

First, we read test and training documents, which are divided into three topics. We use the training dataset to train a Doc2Vec model with the topic as the class attribute. The Word Vector Learner creates a vector for each word and for each label. Next, we use a Vocabulary Extractor to extract the words and vectors from the model. On its second output port, the Vocabulary Extractor outputs the vectors for each label, which we can then use as a kind of 'cluster center' for classification.

The next step is to convert our test documents into vectors using the word vector model. This is done with the Word Vector Apply node, which takes in documents and replaces every word with its corresponding word vector, provided the word is present in the word vector model. We additionally configure the node to calculate the mean of all word vectors, so that each test document is represented by a single vector.

Finally, we classify the test documents with a K Nearest Neighbor node, using our previously created 'cluster centers'. In the context of word vectors, the cosine distance is often used.

Workflow Requirements
  • KNIME Analytics Platform 3.4.0
  • KNIME Deeplearning4J Integration
  • KNIME Deeplearning4J Integration Text Processing Extension
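The classification steps described above (averaging word vectors per document, then picking the nearest label vector by cosine distance) can be sketched outside KNIME in plain Python. This is only an illustrative sketch, not the workflow's implementation: the toy vectors, the topic names, and the function names `mean_vector`, `cosine_distance` and `classify` are all hypothetical, and the nearest-neighbor search is simplified to k = 1 with one 'cluster center' per label.

```python
import math

# Hypothetical toy word vectors; in the workflow these come from the trained model.
WORD_VECTORS = {
    "goal": [0.9, 0.1, 0.0],
    "match": [0.8, 0.2, 0.1],
    "election": [0.1, 0.9, 0.0],
    "vote": [0.0, 0.8, 0.2],
    "stock": [0.1, 0.0, 0.9],
    "market": [0.2, 0.1, 0.8],
}

# Hypothetical label vectors, standing in for the 'cluster centers'.
LABEL_VECTORS = {
    "sports": [1.0, 0.0, 0.0],
    "politics": [0.0, 1.0, 0.0],
    "finance": [0.0, 0.0, 1.0],
}


def mean_vector(tokens, word_vectors):
    """Average the vectors of all tokens found in the model; skip unknown words."""
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        return None  # no token of the document is in the vocabulary
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine_distance(a, b):
    """Return 1 - cosine similarity; zero vectors get the distance 1.0."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def classify(tokens, word_vectors, label_vectors):
    """Assign the label whose vector is closest to the document's mean vector."""
    doc_vector = mean_vector(tokens, word_vectors)
    if doc_vector is None:
        return None
    return min(
        label_vectors,
        key=lambda label: cosine_distance(doc_vector, label_vectors[label]),
    )


if __name__ == "__main__":
    print(classify(["goal", "match", "unknown"], WORD_VECTORS, LABEL_VECTORS))
```

Words missing from the model are simply skipped here, which mirrors the "if present in the word vector model" behavior described above; a document with no known words gets no label in this sketch.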

External resources

  • KNIME Deeplearning4J Integration

Used extensions & nodes

Created with KNIME Analytics Platform version 4.1.0
  • KNIME Core (trusted extension)
    KNIME AG, Zurich, Switzerland
    Version 4.1.0

  • KNIME Distance Matrix (trusted extension)
    KNIME AG, Zurich, Switzerland
    Version 4.1.0

  • KNIME Textprocessing - Deeplearning4J Integration (64bit only) (trusted extension)
    KNIME AG, Zurich, Switzerland
    Version 4.1.0

Legal

By using or downloading the workflow, you agree to our terms and conditions.
