03_Document_Classification

Books, From Words To Wisdom, Text Mining
vincenzo
Draft. Latest edits on Oct 20, 2014 2:00 PM
This workflow performs topic classification. Once the Documents have been converted into word vectors, the task becomes a traditional classification problem that can be solved with any supervised Machine Learning algorithm. We chose a decision tree, but any other classifier would work. The metanode "Limit # keywords" artificially caps the number of extracted keywords, and therefore the number of columns produced. The dataset used here is quite small, and a training set with too many columns relative to its number of rows risks producing a model that generalizes poorly. The Document Vector Applier node applies the word vector extracted from the training set to the test set and removes any words that appear in the test set but not in the training set. The Category To Class node extracts the content of the Document's category field and places it in a column named "class".
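The vectorization steps described above can be sketched outside KNIME. The following is a minimal illustration in plain Python, not the implementation of the KNIME nodes: the function names (`tokenize`, `fit_vocabulary`, `to_vector`), the toy documents, and the choice of document frequency for ranking keywords are all assumptions made for this example. It shows a keyword limit fixed on the training set, the same vocabulary applied to a test document so that unseen words are dropped, and the category kept in a separate class column.

```python
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace; a stand-in for real text preprocessing."""
    return text.lower().split()

def fit_vocabulary(train_texts, max_keywords):
    """Keep the max_keywords terms that occur in the most training documents.

    Ranking by document frequency is an assumption for this sketch; the
    workflow's "Limit # keywords" metanode may use a different criterion.
    """
    doc_freq = Counter()
    for text in train_texts:
        doc_freq.update(set(tokenize(text)))
    # Descending document frequency, then alphabetical order for determinism.
    ranked = sorted(doc_freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:max_keywords]]

def to_vector(text, vocabulary):
    """Binary word vector over a fixed vocabulary; words outside it are dropped."""
    present = set(tokenize(text))
    return [1 if term in present else 0 for term in vocabulary]

# Hypothetical toy data: (document text, category).
train = [
    ("the match ended with a late goal", "sports"),
    ("the team won the match", "sports"),
    ("the bank raised interest rates", "finance"),
    ("interest rates hit the market", "finance"),
]
test_text = "the goalkeeper saved the match"

# The vocabulary is built from the training set only.
vocabulary = fit_vocabulary([text for text, _ in train], max_keywords=4)

# One row per document, one column per keyword, plus a separate class column.
X_train = [to_vector(text, vocabulary) for text, _ in train]
y_train = [label for _, label in train]

# The test document reuses the training vocabulary, so "goalkeeper" and
# "saved" produce no columns.
x_test = to_vector(test_text, vocabulary)

print(vocabulary)
print(x_test)
```

The rows in `X_train` with labels in `y_train` could then be passed to any supervised learner, such as a decision tree. A real pipeline would also remove stop words like "the" before ranking keywords.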

External resources

  • www.knime.com/knimepress/from-words-to-wisdom

Used extensions & nodes

Created with KNIME Analytics Platform version 4.5.0
  • KNIME Base nodes (trusted extension), KNIME AG, Zurich, Switzerland, version 4.5.0
  • KNIME Textprocessing (trusted extension), KNIME AG, Zurich, Switzerland, version 4.5.0
