This node does the same job as the Tika Parser node: it parses any document format supported by Tika. The difference is that this node takes its file paths from a String column of the input table. The file types to parse are selected in the configuration dialog, either by file extension or by MIME type. The dialog also lets users choose which information (metadata and content) to extract from each file. Where possible, files embedded in the input files, such as e-mail attachments, can also be extracted and stored in a specified directory. Authentication settings are provided for parsing encrypted files.
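To make the difference between the two selection modes concrete, here is a minimal Python sketch that filters a list of file paths first by extension and then by MIME type. It is an illustration only, not the node's implementation: the function names `select_by_extension` and `select_by_mime_type` are hypothetical, and the MIME type is guessed from the file name alone with the standard `mimetypes` module, whereas Tika can also inspect file content.

```python
import mimetypes
from pathlib import PurePosixPath


def select_by_extension(paths, extensions):
    """Keep paths whose extension (case-insensitive) is in `extensions`.

    Hypothetical helper; extensions may be given with or without a dot.
    """
    wanted = {ext.lower().lstrip(".") for ext in extensions}
    return [
        p for p in paths
        if PurePosixPath(p).suffix.lower().lstrip(".") in wanted
    ]


def select_by_mime_type(paths, mime_types):
    """Keep paths whose guessed MIME type is in `mime_types`.

    Assumption: the type is guessed from the file name only, which is
    a simplification of real content-based detection.
    """
    wanted = set(mime_types)
    return [p for p in paths if mimetypes.guess_type(p)[0] in wanted]


if __name__ == "__main__":
    # Example values standing in for the String column of the input table.
    column = ["docs/report.pdf", "notes/readme.txt", "archive/SCAN.PDF"]
    print(select_by_extension(column, ["pdf"]))
    print(select_by_mime_type(column, ["text/plain"]))
```

Selecting by extension depends entirely on how files are named, while selecting by MIME type groups files by what they are, which is useful when the same format appears under several extensions.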
- Type: Table. Table containing the file paths: the input table containing the URLs or paths of the files that are to be parsed. The input table must contain at least one String column.