Node / Manipulator

Clean HTML Retriever

Community Nodes MMI Labs Streamable

This node takes a URL from a column and retrieves its content (assumed to be HTML) for parsing. If the HTML content is already available in another column, the node can use that content directly instead of fetching it from the URL. The HTML is then parsed and cleaned up using HtmlCleaner and output as XHTML. The result can be configured to be output as either String or XML type.
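
The flow described above (use the HTML column if present, otherwise fetch the URL, then tidy the markup into XHTML and return it as a string or as XML) can be sketched roughly as follows. This is only an illustration in Python using the standard library: the node itself relies on the Java HtmlCleaner library, which is not used here, and the names `clean_row`, `_XhtmlBuilder`, and `VOID` are hypothetical. The cleanup is deliberately simplified: it closes dangling tags and self-closes void elements, but it does not handle every malformed input (for example, duplicate attributes or multiple root elements).

```python
from html import escape
from html.parser import HTMLParser
from urllib.request import urlopen
import xml.etree.ElementTree as ET

# Elements that never take a closing tag in HTML (hypothetical helper set).
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class _XhtmlBuilder(HTMLParser):
    """Simplified stand-in for HtmlCleaner: re-emit HTML with balanced tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.stack = []  # names of currently open elements

    def handle_starttag(self, tag, attrs):
        # Boolean attributes (value None) are written as name="name".
        rendered = "".join(
            f' {name}="{escape(value if value is not None else name)}"'
            for name, value in attrs
        )
        if tag in VOID:
            self.out.append(f"<{tag}{rendered} />")
        else:
            self.out.append(f"<{tag}{rendered}>")
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray closing tag: drop it
        # Close every element opened after the matching start tag.
        while self.stack:
            open_tag = self.stack.pop()
            self.out.append(f"</{open_tag}>")
            if open_tag == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))

    def result(self):
        self.close()  # flush any buffered text
        while self.stack:  # close whatever is still open
            self.out.append(f"</{self.stack.pop()}>")
        return "".join(self.out)


def clean_row(url=None, html=None, as_xml=False):
    """Prefer HTML already in the row; otherwise download it from the URL."""
    if html is None:
        with urlopen(url) as response:
            charset = response.headers.get_content_charset() or "utf-8"
            html = response.read().decode(charset, errors="replace")
    builder = _XhtmlBuilder()
    builder.feed(html)
    xhtml = builder.result()
    # String output by default; a parsed XML element when as_xml is set.
    return ET.fromstring(xhtml) if as_xml else xhtml


print(clean_row(html="<p>Hello<br><b>world"))
```

Passing `html=` mirrors the case where the content column is supplied, so no network request is made. Passing only `url=` mirrors the retrieval case.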

Node details

Input ports
  1. Type: Table
    URL / HTML input
    An input table that contains URL / content columns
Output ports
  1. Type: Table
    XHTML result
    An output table containing the URL and XHTML results



© 2023 KNIME AG. All rights reserved.