Handling sparse categorical variables with Word2Vec

This workflow shows how to compute word embeddings for a set of sparse categorical variables at a granularity that allows them to be used as inputs to predictive models. A PCA is applied directly inside the component. I kept only the portion of the embedding dimensions that captures most of the variation; doing so lets you keep model complexity under control. Required Python packages: pandas, gensim, numpy, nltk.
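
The idea described above can be sketched outside the workflow in a few lines of Python. This is only an illustration under stated assumptions, not the component's actual code: the "column=value" token encoding, the helper names (`rows_to_sentences`, `train_embeddings`, `pca_reduce`, `demo`), the toy data, and the 90% variance threshold are all hypothetical. It assumes gensim 4.x (where the dimension parameter is `vector_size`) and does the PCA with a plain NumPy SVD.

```python
import numpy as np


def rows_to_sentences(df, columns):
    """Turn each row into a 'sentence' of 'column=value' tokens.

    The token encoding is an assumption made for this sketch.
    """
    values = df[columns].astype(str)
    return [
        [f"{col}={val}" for col, val in zip(columns, row)]
        for row in values.itertuples(index=False, name=None)
    ]


def train_embeddings(sentences, vector_size=16, seed=0):
    """Train a skip-gram Word2Vec model; return (tokens, embedding matrix)."""
    # Imported lazily so pca_reduce can be used without gensim installed.
    from gensim.models import Word2Vec  # gensim >= 4.0 (3.x used `size=`)

    window = max(len(s) for s in sentences)  # let every column see every other
    model = Word2Vec(
        sentences,
        vector_size=vector_size,
        window=window,
        min_count=1,   # keep rare categories, since the data is sparse
        sg=1,          # skip-gram
        seed=seed,
        workers=1,     # a single worker keeps runs closer to reproducible
    )
    # model.wv.vectors rows are aligned with model.wv.index_to_key
    return list(model.wv.index_to_key), model.wv.vectors


def pca_reduce(X, variance_to_keep=0.9):
    """Project X onto the fewest principal components reaching the target.

    Returns the reduced matrix and the explained-variance ratio of each
    retained component. Assumes X is not constant across rows.
    """
    Xc = X - X.mean(axis=0)                      # center each dimension
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s**2 / np.sum(s**2)              # variance ratio per component
    k = int(np.searchsorted(np.cumsum(explained), variance_to_keep)) + 1
    k = min(k, len(s))                           # guard against rounding
    return Xc @ Vt[:k].T, explained[:k]


def demo():
    """End-to-end run on toy data (requires pandas and gensim)."""
    import pandas as pd

    df = pd.DataFrame(
        {
            "city": ["rome", "rome", "oslo", "oslo", "lima", "rome"],
            "device": ["ios", "ios", "web", "web", "android", "web"],
        }
    )
    sentences = rows_to_sentences(df, ["city", "device"])
    tokens, vectors = train_embeddings(sentences, vector_size=8)
    reduced, explained = pca_reduce(vectors, variance_to_keep=0.9)
    # One low-dimensional vector per category level, ready to join back
    return pd.DataFrame(reduced, index=tokens), explained
```

Calling `demo()` returns one reduced vector per category level, which can then be joined back onto the original rows as numeric model inputs. Lowering `variance_to_keep` keeps fewer dimensions and therefore a simpler downstream model.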
