Document Preprocessing applies a common sequence of preprocessing steps to clean and prepare text for subsequent analysis and comparison with other text. It expects a column containing documents as input and produces the preprocessed documents as output.
The Component requires the following extensions:
- KNIME Textprocessing
https://hub.knime.com/knime/extensions/org.knime.features.ext.textprocessing/latest
- Type: Table
Documents Input Port
The input table which contains the documents to preprocess.
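
For readers who want a feel for what a preprocessing sequence of this kind does, here is a minimal Python sketch. It is not the Component's implementation: the description does not list the individual steps, so the ones shown here (lowercasing, punctuation removal, number removal, stop word filtering) are assumptions, and the names `STOP_WORDS`, `preprocess`, and `preprocess_column` are hypothetical.

```python
import re

# Hypothetical stop word list; the Component's actual list is not stated in the description.
STOP_WORDS = {"a", "an", "and", "the", "is", "of", "to", "in"}


def preprocess(text: str) -> str:
    """Apply an assumed sequence of cleaning steps to a single document."""
    text = text.lower()                    # normalize case
    text = re.sub(r"[^\w\s]", " ", text)   # replace punctuation with spaces
    text = re.sub(r"\d+", " ", text)       # drop digit sequences
    # Split on whitespace and remove stop words
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return " ".join(tokens)


def preprocess_column(documents: list[str]) -> list[str]:
    """Mimic the column-in, column-out behavior: one cleaned document per input document."""
    return [preprocess(d) for d in documents]
```

Calling `preprocess_column` on a list of raw strings returns a list of the same length, which mirrors how the Component takes a document column and returns the preprocessed documents.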