Lexicon Text Mining on Movie Titles with VAD

Workflow

This workflow can be run by itself, but it is complementary to a main workflow, which is available in the same directory as this one on KNIME Hub. The main workflow must be executed at least once beforehand, at least up to the node that precedes the lexicon text mining module and including the related Table Writer, so that the required .table file is saved in the working directory.

This workflow performs lexicon-based text mining on the movies' titles using the three features in the VAD dictionary: Valence, Arousal and Dominance. After the title length is added as a new variable, words are tagged with a POS tagger and with the VAD dictionary. In the pre-processing part, punctuation is erased, numbers are filtered out, everything is converted to lower case, and all titles without any tag are filtered out (for example, titles that contain only proper names). After the bag of words is created (with an added Term column that keeps each term without its tag attached), a Joiner looks up each tagged term in the VAD dictionary and adds its Valence, Arousal and Dominance values. Finally, a GroupBy node groups the terms by movie title, keeping the main data about the film together with the Valence, Arousal and Dominance values of the title.
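For readers who want to follow the logic outside KNIME, the steps above can be sketched in plain Python. This is only an illustrative sketch, not the workflow itself: the three-word lexicon and its scores are made up, the names `VAD`, `tokenize` and `score_titles` are hypothetical, POS tagging is left out, the title length is assumed to be a character count, and the per-title aggregation is assumed to be a mean, since the description does not say which aggregation the GroupBy node uses.

```python
import re
from statistics import mean

# Hypothetical mini-lexicon: term -> (valence, arousal, dominance).
# The scores are made up for illustration, not taken from the VAD Lexicon.
VAD = {
    "love": (0.9, 0.7, 0.6),
    "war": (0.1, 0.9, 0.7),
    "night": (0.4, 0.3, 0.4),
}


def tokenize(title):
    """Lower-case a title, erase punctuation, and drop purely numeric tokens."""
    text = re.sub(r"[^\w\s]", " ", title.lower())
    return [tok for tok in text.split() if not tok.isdigit()]


def score_titles(titles, lexicon):
    """Return one row per title that contains at least one lexicon term."""
    rows = []
    for title in titles:
        # "Joiner" step: keep only the terms that have an entry in the lexicon.
        hits = [lexicon[tok] for tok in tokenize(title) if tok in lexicon]
        if not hits:
            continue  # titles without any tagged term are filtered out
        # "GroupBy" step: aggregate per title (assumption: arithmetic mean).
        valence, arousal, dominance = (mean(dim) for dim in zip(*hits))
        rows.append({
            "title": title,
            "title_length": len(title),  # assumption: length in characters
            "valence": valence,
            "arousal": arousal,
            "dominance": dominance,
        })
    return rows


rows = score_titles(["Love & War", "Night 2", "Amelie"], VAD)
for row in rows:
    print(row)
```

In this sketch the title "Amelie" is dropped because none of its tokens appear in the lexicon, which mirrors how titles made up only of proper names are filtered out in the workflow.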

External resources

VAD Lexicon

Legal

By using or downloading the workflow, you agree to our terms and conditions.