In this workflow, we use simple text-processing techniques to reduce the complexity of the captions used for training. This limits the set of words the network can predict, which makes the task somewhat easier. We also add special start and end tokens to the cleaned captions. As output, the workflow writes two tables to the data folder: one containing the vocabulary and one containing the processed captions.
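The workflow itself is built from KNIME nodes, but the steps described above can be sketched in plain Python for readers who want to see the logic. This is only an illustration under stated assumptions: the token strings `<start>` and `<end>`, the `min_count` frequency cutoff, and the choice to drop rare words (rather than map them to a placeholder) are all hypothetical and are not taken from the workflow.

```python
import re
from collections import Counter

# Hypothetical token names; the workflow may use different strings.
START, END = "<start>", "<end>"


def clean(caption):
    """Lowercase a caption, replace non-letter characters, and split into words."""
    text = re.sub(r"[^a-z ]+", " ", caption.lower())
    return text.split()


def preprocess(captions, min_count=2):
    """Return (vocabulary, processed_captions).

    Words seen fewer than `min_count` times are dropped (an assumed
    strategy for limiting the vocabulary), and every caption is wrapped
    in start and end tokens.
    """
    tokenized = [clean(c) for c in captions]
    counts = Counter(word for tokens in tokenized for word in tokens)
    vocab = sorted(word for word, n in counts.items() if n >= min_count)
    keep = set(vocab)
    processed = [
        " ".join([START] + [w for w in tokens if w in keep] + [END])
        for tokens in tokenized
    ]
    return [START, END] + vocab, processed
```

Calling `preprocess(["A dog runs.", "A dog sleeps!"])` would return both the vocabulary list and the cleaned captions, which correspond to the two tables the workflow writes to the data folder.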
Workflow
01_caption_preprocessing
Used extensions & nodes
Created with KNIME Analytics Platform version 4.5.2