This node reads Word documents (.doc, .docx, .docm) and creates one document for each file. The text is extracted from the Word file using the Apache POI library (see http://poi.apache.org/ for details). Paragraphs are taken into account. Meta information is not read. The first sentence is used as the document title.
- Type: Data. An output table containing the parsed document data.
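The behaviour described above (paragraph-wise text extraction, no meta information, first sentence as title) can be illustrated with a minimal sketch. It is not the node's implementation and does not use Apache POI: it reads the XML inside a .docx/.docm archive with the Python standard library only, so legacy binary .doc files are not covered. The function names and the sentence-splitting rule are assumptions made for illustration.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def extract_paragraphs(docx_file):
    """Return the non-empty paragraphs of a .docx/.docm file as strings.

    Only word/document.xml is read, so document properties (meta
    information) are ignored, as in the node description.
    """
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for paragraph in root.iter(W_NS + "p"):
        # A paragraph's text is spread over one or more <w:t> runs.
        text = "".join(t.text or "" for t in paragraph.iter(W_NS + "t")).strip()
        if text:
            paragraphs.append(text)
    return paragraphs


def first_sentence(text):
    """Assumed title rule: everything up to the first '.', '!' or '?'."""
    match = re.search(r"[.!?]", text)
    return text[: match.end()] if match else text


def parse_document(docx_file):
    """Build a small record with a title and the list of paragraphs."""
    paragraphs = extract_paragraphs(docx_file)
    title = first_sentence(paragraphs[0]) if paragraphs else ""
    return {"title": title, "paragraphs": paragraphs}
```

Calling `parse_document("report.docx")` (a hypothetical file name) returns a dictionary with the keys `title` and `paragraphs`. The node itself emits one document per file in its output table.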
Other Data Types > Text Processing > IO
Make sure to have this extension installed:
Update site for KNIME Analytics Platform 3.7: KNIME Analytics Platform 3.7 Update Site