01_caption_preprocessing

Workflow


In this workflow we use simple text processing techniques to reduce the complexity of the captions used for training. This limits the set of words the network can predict, which makes the task somewhat easier. We also add special start and end tokens to the cleaned captions. As output, the workflow writes two tables to the data folder: one containing the vocabulary and one containing the processed captions.
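For readers who prefer to see the idea in code, the following is a minimal Python sketch of this kind of caption preprocessing. It is not the workflow's actual node configuration: the token strings `<start>`, `<end>` and `<unk>`, the `min_count` threshold, the punctuation rule, and the choice to replace rare words with an unknown token are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical token strings; the workflow may use different markers.
START_TOKEN = "<start>"
END_TOKEN = "<end>"
UNKNOWN_TOKEN = "<unk>"  # assumption: rare words are mapped to this token


def clean_caption(caption):
    """Lowercase a caption, drop everything except letters, and split into words."""
    text = caption.lower()
    text = re.sub(r"[^a-z\s]", " ", text)
    return text.split()


def build_vocabulary(token_lists, min_count=2):
    """Keep only words that occur at least min_count times across all captions."""
    counts = Counter(tok for toks in token_lists for tok in toks)
    kept = sorted(word for word, count in counts.items() if count >= min_count)
    return [START_TOKEN, END_TOKEN, UNKNOWN_TOKEN] + kept


def process_captions(captions, min_count=2):
    """Return (vocabulary, processed captions wrapped in start/end tokens)."""
    token_lists = [clean_caption(c) for c in captions]
    vocabulary = build_vocabulary(token_lists, min_count)
    known = set(vocabulary)
    processed = []
    for toks in token_lists:
        mapped = [t if t in known else UNKNOWN_TOKEN for t in toks]
        processed.append(" ".join([START_TOKEN] + mapped + [END_TOKEN]))
    return vocabulary, processed


if __name__ == "__main__":
    vocab, cleaned = process_captions(["A dog runs.", "A dog sits!", "The cat sleeps"])
    print(vocab)
    print(cleaned)
```

Raising `min_count` shrinks the vocabulary further, at the cost of mapping more words to the unknown token. The two return values correspond to the vocabulary table and the processed-captions table that the workflow writes out.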

External resources

COCO Dataset Homepage
COCO 2014 Data Download
Image Captioning


Legal

By using or downloading the workflow, you agree to our terms and conditions.