Node / Manipulator

Clean HTML Retriever

Community Nodes MMI Labs Streamable

This node takes a URL from a column and retrieves its content (assumed to be HTML) for parsing. If the HTML content is already available in another column, the node can use that content directly instead of fetching it from the URL. The HTML is then parsed and cleaned up using HtmlCleaner and output as XHTML. The result can be configured to be output as either String or XML type.
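
The retrieve-then-clean flow described above can be illustrated with a rough sketch. This is not the node's implementation: the node uses the Java library HtmlCleaner, whereas the sketch below substitutes Python's standard-library `html.parser` to show the same idea (use the supplied HTML if present, otherwise fetch the URL, then emit well-formed XHTML). The names `clean_to_xhtml` and `retrieve_xhtml` are hypothetical, and the cleaning is far simpler than HtmlCleaner's (it only closes unclosed tags, self-closes void elements, and escapes text).

```python
from html import escape
from html.parser import HTMLParser
from urllib.request import urlopen

# HTML elements that never have a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class _XhtmlBuilder(HTMLParser):
    """Collects a well-formed XHTML rendering of loosely written HTML."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []      # output fragments
        self.open_tags = []  # stack of currently open elements

    @staticmethod
    def _attrs(attrs):
        out, seen = [], set()
        for name, value in attrs:
            if name in seen:  # XML forbids duplicate attributes
                continue
            seen.add(name)
            value = name if value is None else value  # e.g. <input disabled>
            out.append(f' {name}="{escape(value, quote=True)}"')
        return "".join(out)

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            self.parts.append(f"<{tag}{self._attrs(attrs)} />")
        else:
            self.parts.append(f"<{tag}{self._attrs(attrs)}>")
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        self.parts.append(f"<{tag}{self._attrs(attrs)} />")

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or tag not in self.open_tags:
            return  # ignore stray closing tags
        # Close every element opened after the matching start tag.
        while self.open_tags:
            current = self.open_tags.pop()
            self.parts.append(f"</{current}>")
            if current == tag:
                break

    def handle_data(self, data):
        self.parts.append(escape(data, quote=False))


def clean_to_xhtml(html):
    """Return a tag-balanced XHTML string for the given HTML text."""
    builder = _XhtmlBuilder()
    builder.feed(html)
    builder.close()  # flush any buffered text
    while builder.open_tags:  # close whatever is still open
        builder.parts.append(f"</{builder.open_tags.pop()}>")
    return "".join(builder.parts)


def retrieve_xhtml(url, html=None, timeout=10.0):
    """Use `html` if supplied; otherwise fetch `url`. Then clean."""
    if html is None:
        with urlopen(url, timeout=timeout) as resp:
            charset = resp.headers.get_content_charset() or "utf-8"
            html = resp.read().decode(charset, errors="replace")
    return clean_to_xhtml(html)
```

Calling `retrieve_xhtml(url)` fetches the page, while `retrieve_xhtml(url, html=content)` skips the network request, mirroring the node's option to read HTML from another column. The sketch drops comments and doctype declarations and does not wrap multiple top-level elements in a single root, so its output is only a complete XML document when the input has one root element.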

Node details

Input ports
  1. Type: Table
    URL / HTML input
    An input table that contains URL / content columns
Output ports
  1. Type: Table
    XHTML result
    An output table that contains the URL and XHTML results


    © 2022 KNIME AG. All rights reserved.