This workflow shows how to compute word embeddings for a set of categorical variables at a granularity that allows them to be used as inputs to predictive models.
PCA is applied directly inside the component. I kept only the portion of the embedding dimensions that captures most of the variation; by doing so you can keep model complexity under control.
Required Python packages:
pandas
gensim
numpy
nltk
Workflow
Handling sparse categorical variables with Word2Vec
Used extensions & nodes
Created with KNIME Analytics Platform version 4.5.1