Node / Manipulator

Clean HTML Retriever

Community Nodes, MMI Labs, Streamable

This node takes a URL from a column and retrieves its content (assumed to be HTML) for parsing. If the HTML content is already available in another column, the node can take that content directly instead of fetching it from the URL. The HTML is then parsed and cleaned up using HtmlCleaner and output as XHTML. The result can be configured to be output as either String or XML type.

Node details

Input ports
  1. Type: Table
    URL / HTML input
    An input table that contains the URL and/or HTML content columns
Output ports
  1. Type: Table
    XHTML result
    An output table containing the URL and the XHTML results

