Document Classification: Model Training and Deployment

NLP Natural Language Processing Text Classification

This workflow classifies YouTube comments as spam or ham (non-spam). It starts from a data table of comments taken from the YouTube Spam Collection Data Set at the UCI ML Repository [1]; the data is available in the workflow directory. The two categories are roughly equally represented. First, the comments are converted into documents, each labeled with its class, spam or ham. The documents are then preprocessed by filtering and stemming and transformed into a bag of words, which is filtered again: only terms that occur in at least 1% of the documents (at least 3 documents) are kept as features. The documents are then transformed into document vectors, a numerical representation of the documents that is used to train a support vector machine classifier. The lower part of the workflow contains the deployment workflow.
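
The preprocessing and feature-selection steps described above can be approximated outside KNIME with a short Python sketch. This is not how the KNIME nodes are implemented. The stop-word list, the suffix-stripping stemmer, the toy comments, and all function names are hypothetical stand-ins. Combining the 1% and 3-document thresholds with max() is also an assumption. The classifier step is omitted; the resulting vectors are what a support vector machine would be trained on.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real filter would use a much larger one.
STOP_WORDS = {"a", "an", "and", "the", "is", "my", "to", "this", "of", "for", "out"}


def crude_stem(token):
    """Strip a few common suffixes (a stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lowercase, tokenize, drop stop words and very short tokens, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS and len(t) > 2]


def build_vocabulary(token_lists, min_doc_fraction=0.01, min_docs=3):
    """Keep only terms whose document frequency reaches the threshold.

    Assumption: the threshold is the larger of the relative (1%) and
    absolute (3 documents) limits mentioned in the description.
    """
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))  # count each term once per document
    threshold = max(min_docs, min_doc_fraction * len(token_lists))
    return sorted(term for term, df in doc_freq.items() if df >= threshold)


def to_document_vector(tokens, vocabulary):
    """Binary document vector: 1 if the term occurs in the document, else 0."""
    present = set(tokens)
    return [1 if term in present else 0 for term in vocabulary]


# Toy data for illustration only; not taken from the actual dataset.
comments = [
    ("Check out my channel and subscribe", "spam"),
    ("Subscribe to my channel for free gifts", "spam"),
    ("Please subscribe, check my new channel", "spam"),
    ("I love this song", "ham"),
    ("This song is amazing, love it", "ham"),
    ("Best song of the year, love the video", "ham"),
]

token_lists = [preprocess(text) for text, _ in comments]
labels = [label for _, label in comments]
vocabulary = build_vocabulary(token_lists)
vectors = [to_document_vector(tokens, vocabulary) for tokens in token_lists]

for vector, label in zip(vectors, labels):
    print(label, vector)
```

Each row printed at the end pairs a class label with a binary vector over the surviving terms; these pairs are the training input a classifier would receive.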

External resources

  • Sentiment Classification of Documents
  • YouTube Spam Collection Dataset

Used extensions & nodes

Created with KNIME Analytics Platform version 4.1.0
  • KNIME Core (trusted extension), KNIME AG, Zurich, Switzerland, version 4.1.0
  • KNIME Textprocessing (trusted extension), KNIME AG, Zurich, Switzerland, version 4.1.0

License (CC-BY-4.0)