StanfordNLP NE Learner
The StanfordNLP NE Learner creates a conditional random field (CRF) model from a set of documents and a dictionary of entities that occur in those documents. The chosen tag and the dictionary are saved internally, so that the StanfordNLP NE tagger can use them to tag new documents and validate the model. If you want to use the model externally, the model file can be found in your workflow directory:
/%KNIMEWORKSPACE%/%WORKFLOW%/StanfordNLP NE Learner(##)/port_1/object/portobject.zip
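As a rough illustration of how the path above could be resolved from a script, the following Python sketch assembles it and lists the entries of the archive. The workspace, workflow and node folder values are placeholders you would have to supply yourself (the exact folder name behind `StanfordNLP NE Learner(##)` depends on your workflow), and nothing is assumed here about what the archive contains.

```python
from pathlib import Path
from typing import List
import zipfile


def model_zip_path(workspace: str, workflow: str, node_folder: str) -> Path:
    """Assemble the path to the learner's port object archive.

    node_folder is the on-disk name of the learner node's folder, i.e. the
    "StanfordNLP NE Learner(##)" part of the documented path. It is passed in
    as-is because the concrete name depends on the workflow.
    """
    return Path(workspace) / workflow / node_folder / "port_1" / "object" / "portobject.zip"


def list_model_archive(zip_path: Path) -> List[str]:
    """Return the names of all entries stored in the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return archive.namelist()


if __name__ == "__main__":
    # All three arguments are hypothetical example values.
    path = model_zip_path("/home/me/knime-workspace", "MyWorkflow", "StanfordNLP NE Learner(##)")
    if path.exists():
        for name in list_model_archive(path):
            print(name)
    else:
        print("No archive found at", path)
```

Listing the entries first is a cautious way to see what the archive holds before passing anything on to an external tool.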
You can select the document column and the dictionary column used to train the model. The dictionary may contain multi-term entities. A separate tab in the dialog lets you specify the learner properties. Only a few options are currently available, because the full set of parameters is very large. Please contact us if there are important or frequently used parameters that we should integrate.
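To make the role of the dictionary more concrete, here is a minimal Python sketch of dictionary-based labeling: every token that is part of a dictionary entry receives the chosen tag, and all other tokens receive a background label. This only illustrates the general idea, not the node's actual implementation. The function names, the whitespace tokenization, the background label `O` and the tab-separated output layout are assumptions made for this example.

```python
def label_tokens(tokens, dictionary, tag, background="O"):
    """Pair each token with `tag` if it belongs to a dictionary entry, else `background`.

    Entries are matched as whole token sequences, longest entry first, so a
    multi-term entity such as "New York" is labelled as one contiguous span.
    """
    # Split each dictionary entry into its terms (simple whitespace split).
    entries = sorted((entry.split() for entry in dictionary), key=len, reverse=True)
    labels = [background] * len(tokens)
    i = 0
    while i < len(tokens):
        for entry in entries:
            n = len(entry)
            if n and tokens[i:i + n] == entry:
                labels[i:i + n] = [tag] * n
                i += n  # continue after the matched span
                break
        else:
            i += 1  # no entry starts at this token
    return list(zip(tokens, labels))


def to_two_column(labelled):
    """Render (token, label) pairs as one tab-separated pair per line."""
    return "\n".join(f"{token}\t{label}" for token, label in labelled)


if __name__ == "__main__":
    tokens = "She flew from Boston to New York".split()
    labelled = label_tokens(tokens, ["New York", "Boston"], tag="LOCATION")
    print(to_two_column(labelled))
```

Matching longer entries first keeps a multi-term entity from being split up when one of its terms also appears in the dictionary as a single-term entry.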
NOTE: If you are interested in the StanfordNLP toolkit, please visit http://nlp.stanford.edu/software/. Some of the following property descriptions are taken from the NERFeatureFactory class of StanfordNLP. See that class for further information.
- Type: Data The input table containing the documents to train the model with.
- Type: Data The input dictionary containing known single- and/or multi-term entities to train the model.
- Type: StanfordNERModelPortObject The StanfordNLP NE model.
Other Data Types > Text Processing > Enrichment