Dictionary Tagger (Multi Column)
This node recognizes named entities specified in one or more dictionary columns and assigns a specified tag value and type to them. Optionally, the recognized named entity terms can be set as unmodifiable, meaning that the terms are not modified or filtered afterwards by any following preprocessing node. However, succeeding tagging nodes can overwrite the tags of an unmodifiable term.
If the same entity is contained in different dictionaries, it will be tagged for every fitting configuration. For example, if the document contains the term "London" and "London" is also contained in three different dictionaries, it will be tagged with all three tags that have been set for those dictionaries.
The sequence of the tags depends on the order of the dictionaries within the node dialog. The order can be changed by using the up/down arrow buttons.
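As a rough illustration of the two paragraphs above, the following Python sketch collects the tags for a term from several dictionaries in dialog order. It is not KNIME code: the function name `tags_for_term`, the list of `(tag, entries)` tuples standing in for the node dialog, and the tag names CITY and CAPITAL are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical stand-in for the node dialog: one (tag, entries) pair per
# dictionary column, listed in the order shown in the dialog.
Dictionary = Tuple[str, List[str]]


def tags_for_term(term: str, dictionaries: List[Dictionary]) -> List[str]:
    """Return the tag of every dictionary that contains the term, in dialog order."""
    return [tag for tag, entries in dictionaries if term in entries]


dictionaries = [
    ("LOCATION", ["London", "Paris"]),
    ("CITY", ["London"]),
    ("CAPITAL", ["London", "Berlin"]),
]

# "London" appears in all three dictionaries, so it receives all three tags.
print(tags_for_term("London", dictionaries))  # ['LOCATION', 'CITY', 'CAPITAL']
```

Reordering the list, which mimics the up/down arrow buttons, changes the sequence of the returned tags but not their set.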
Note that if a dictionary contains a multi-word entity and a succeeding dictionary contains one word of that entity, only the single word will be tagged. For example:
- Document: "New York is beautiful."
- Dictionary 1: "New York"
- Dictionary 2: "York"
In this case, only "York" will be tagged. If a third dictionary also contains "New York", then "New York" will be tagged with the tags set for the first and the third dictionary.
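The "New York" example can be reproduced with a small toy model in Python. This is only one possible reading of the behaviour described above, not the node's actual implementation. It assumes that dictionaries are applied in order, that a later match replaces any earlier match it overlaps, and that a surviving entity receives the tags of every dictionary containing it. The names `find_spans` and `tag_document` and the tags CITY and PLACE are hypothetical, and the document is passed in already tokenized.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]               # token indices [start, end)
Dictionary = Tuple[str, List[str]]   # (tag, entries), in dialog order


def find_spans(tokens: List[str], entity: str) -> List[Span]:
    """Find every position where the entity's words occur consecutively."""
    words = entity.split()
    n = len(words)
    return [(i, i + n) for i in range(len(tokens) - n + 1) if tokens[i:i + n] == words]


def tag_document(tokens: List[str], dictionaries: List[Dictionary]) -> Dict[str, List[str]]:
    """Toy model: later matches replace overlapping earlier ones (assumption)."""
    claimed: Dict[Span, str] = {}
    for _tag, entries in dictionaries:
        for entity in entries:
            for span in find_spans(tokens, entity):
                # Drop earlier matches that overlap this one without being identical.
                overlapping = [s for s in claimed
                               if s != span and s[0] < span[1] and span[0] < s[1]]
                for old in overlapping:
                    del claimed[old]
                claimed[span] = entity
    # A surviving entity gets the tag of every dictionary that lists it.
    return {entity: [tag for tag, entries in dictionaries if entity in entries]
            for entity in claimed.values()}


tokens = ["New", "York", "is", "beautiful", "."]
two_dicts = [("LOCATION", ["New York"]), ("CITY", ["York"])]
three_dicts = two_dicts + [("PLACE", ["New York"])]

print(tag_document(tokens, two_dicts))    # {'York': ['CITY']}
print(tag_document(tokens, three_dicts))  # {'New York': ['LOCATION', 'PLACE']}
```

With two dictionaries only "York" remains tagged. Adding a third dictionary that also lists "New York" restores the multi-word entity with the tags of the first and third dictionary, matching the description above.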
The order of the entities within a dictionary is also important. As with the order of the dictionaries, the first entity in the dictionary will be tagged first.
- Type: Data. The input table containing the documents to tag.
- Type: Data. The input table containing one or multiple dictionary columns.
- Type: Data. An output table containing the tagged documents.
Other Data Types > Text Processing > Enrichment