Word Parser

Source

This node reads Word documents (.doc, .docx, .docm) and creates one document for each file. The text is extracted from the Word file using the Apache POI library (see http://poi.apache.org/ for details). Paragraphs are taken into account. Meta information is not read. The first sentence is used as the document title.
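
The behaviour described above (paragraph-wise text extraction, first sentence as title) can be illustrated outside KNIME with a minimal sketch. This is not the node's implementation: the node relies on Apache POI, whereas the sketch below uses only the Python standard library and therefore handles only the zip-based formats (.docx, .docm), not binary .doc files. The function names `read_paragraphs`, `first_sentence`, and `parse_word_document` are hypothetical, and the sentence splitter is a deliberately naive assumption about what counts as a "first sentence".

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used in the body XML of .docx/.docm files.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def read_paragraphs(docx_file):
    """Return the non-empty paragraphs of a .docx/.docm file as strings.

    `docx_file` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for p in root.iter(W_NS + "p"):
        # A paragraph's text is spread over one or more <w:t> runs.
        text = "".join(t.text or "" for t in p.iter(W_NS + "t")).strip()
        if text:
            paragraphs.append(text)
    return paragraphs


def first_sentence(text):
    """Naive split: cut after the first '.', '!' or '?' that is followed
    by whitespace or the end of the text (an assumption, not the node's rule)."""
    match = re.search(r"[.!?](?=\s|$)", text)
    return text[: match.end()] if match else text


def parse_word_document(docx_file):
    """Build a simple document record: title plus list of paragraphs."""
    paragraphs = read_paragraphs(docx_file)
    title = first_sentence(paragraphs[0]) if paragraphs else ""
    return {"title": title, "paragraphs": paragraphs}
```

Calling `parse_word_document("report.docx")` (a hypothetical file name) would return a dictionary with the title and the paragraph list. Document metadata such as author or creation date is deliberately ignored, mirroring the node description.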

Output Ports

  1. Type: Data
    An output table containing the parsed document data.

Extension

This node is part of the extension KNIME Textprocessing (v4.0.0).
