This workflow performs inference on box-office revenues (adjusted for inflation using the CPI) by combining structured movie data from IMDb with posters scraped from TMDB through its API, in compliance with the related Terms of Service.
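As a minimal sketch of the inflation adjustment mentioned above, the standard CPI ratio can be applied to a nominal revenue figure. The function name and the CPI values below are illustrative placeholders, not the series or the node configuration used in the workflow.

```python
def adjust_for_inflation(nominal, cpi_release_year, cpi_base_year):
    """Express a nominal dollar amount in base-year dollars via the CPI ratio."""
    if cpi_release_year <= 0:
        raise ValueError("CPI of the release year must be positive")
    return nominal * cpi_base_year / cpi_release_year


# Placeholder CPI values for illustration only (not real index figures).
toy_cpi = {1990: 100.0, 2020: 200.0}

# A film that grossed 50 million dollars in 1990, expressed in 2020 dollars.
adjusted = adjust_for_inflation(50_000_000, toy_cpi[1990], toy_cpi[2020])
```

With these placeholder values the index doubles between the two years, so the adjusted revenue is twice the nominal one.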
Text mining on movie titles uses a lexicon-based approach. Image feature analysis is performed in two steps: face detection with a pre-trained convolutional neural network (MTCNN), followed by image feature extraction (labels and colors) through the Google Cloud Vision API.
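To illustrate what a lexicon-based approach to titles can look like, the sketch below scores a title by summing per-word polarities. The lexicon, its weights, and the function name are hypothetical; the workflow itself may use a different lexicon and scoring rule.

```python
import re

# Hypothetical toy lexicon: word -> polarity. A real lexicon would be far larger.
TOY_LEXICON = {"love": 1, "happy": 1, "dead": -1, "war": -1, "killer": -1}


def title_sentiment(title, lexicon=TOY_LEXICON):
    """Sum the polarity of every lexicon word found in the title."""
    tokens = re.findall(r"[a-z']+", title.lower())
    return sum(lexicon.get(token, 0) for token in tokens)


score = title_sentiment("Love and War")
```

Words missing from the lexicon contribute nothing, so a title made up entirely of unknown words scores zero.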
Portability is ensured through the KNIME URL protocol and Conda Environment Propagation.
To run the workflow, all the data and all the workflows in its KNIME Hub directory must be present in the working directory (please check their descriptions for their requirements and for how to configure them correctly). The posters folder is only a placeholder: every poster needs to be scraped again and will be saved there.
Additionally, the main dataset must be downloaded from the link given in the external resources section of this page and placed in the working directory. Please remove the space from the name of the downloaded .csv file before execution.
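The renaming step can also be scripted. The helper below is a sketch that strips spaces from a file name in place; the function name is hypothetical, and the actual name of the downloaded file is not stated here, so the path has to be supplied by the user.

```python
from pathlib import Path


def remove_spaces_from_name(path):
    """Rename the file so that its name contains no spaces; return the new path."""
    path = Path(path)
    target = path.with_name(path.name.replace(" ", ""))
    if target == path:
        return path  # nothing to do, the name has no spaces
    if target.exists():
        raise FileExistsError(f"{target} already exists")
    path.rename(target)
    return target
```

Only the file name is changed, so spaces in parent folder names are left untouched.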
Workflow
The Impact of Movie Posters and Data on Boxoffice Revenues
External resources
Used extensions & nodes
Created with KNIME Analytics Platform version 4.5.1
KNIME HCS Tools
Max Planck Institute of Molecular Cell Biology and Genetics (MPI-CBG), Dresden, Germany
Version 4.0.0