This workflow belongs to a set of workflows that address a data mining scenario at the intersection of active learning, text mining, stream mining, and service-oriented knowledge discovery architectures.
In particular, this workflow trains two models: a Document Vector Model, which depends on the keywords extracted from the training set during the pre-processing step, and a Random Forest model, which predicts the document_class. The models created at this stage can be reused later in the active learning cycle (Re-label_Uncertain_Classes workflow).
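
The role of the Document Vector Model can be illustrated with a minimal Python sketch. It is not the workflow's actual implementation: the function names, the binary term-presence encoding, and the sample keywords are all assumptions made for illustration. The sketch only shows why the model is tied to the training keywords: the column layout is fixed once at training time, so documents processed later are encoded against the same columns.

```python
from typing import Dict, List


def fit_document_vector_model(training_keywords: List[List[str]]) -> Dict[str, int]:
    """Map each keyword seen in the training set to a fixed column index."""
    vocabulary = sorted({kw for doc in training_keywords for kw in doc})
    return {kw: i for i, kw in enumerate(vocabulary)}


def apply_document_vector_model(model: Dict[str, int], keywords: List[str]) -> List[int]:
    """Encode one document as a binary vector over the training vocabulary."""
    vector = [0] * len(model)
    for kw in keywords:
        index = model.get(kw)
        if index is not None:  # keywords unseen during training are ignored
            vector[index] = 1
    return vector


# Hypothetical keyword lists, standing in for the pre-processed training set.
train_docs = [["stream", "mining"], ["active", "learning"], ["text", "mining"]]
model = fit_document_vector_model(train_docs)

# A later document is encoded against the same columns as the training data.
new_vector = apply_document_vector_model(model, ["text", "stream", "cloud"])
```

Vectors produced this way would be the input features for a classifier such as the Random Forest model, with document_class as the target. Because the vocabulary is stored with the model, a later workflow can reapply it without access to the original training set.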