This workflow defines a fully automated, web-based application that labels your data using active learning and weak supervision. It was designed so that business analysts can easily go through documents and label them in any number of classes. In each iteration the user labels more documents, and the model is trained on the instances labeled so far. With every new iteration, the model proposes documents based on an exploration vs. exploitation approach. Once the overall potential has fallen below a value the user is happy with, they can exit the loop and export the model to label the remaining instances. Additionally, the workflow lets the user define rules that instantly label the portion of the dataset matching a certain condition. These rules provide weak signals for training the weak supervision model. Rules can be updated at any iteration.
This workflow is made to be deployed on KNIME WebPortal via KNIME Server.
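To make the loop described above more concrete, here is a minimal Python sketch of the two ideas: choosing the next document to label and applying user-defined rules. It is not taken from the workflow, which is built from KNIME nodes. The function names, the keyword-based rule format, the `exploration` scores, and the `explore_weight` parameter are all assumptions made for illustration, and normalized entropy stands in for whatever uncertainty measure the workflow actually uses.

```python
import math


def normalized_entropy(probs):
    """Uncertainty of one prediction in [0, 1]: 0 = certain, 1 = uniform."""
    if len(probs) < 2:
        return 0.0
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return entropy / math.log(len(probs))


def select_next(class_probs, exploration, explore_weight=0.5):
    """Return the index of the unlabeled document to propose next.

    class_probs: one list of predicted class probabilities per document.
    exploration: one score in [0, 1] per document (hypothetical; for
                 example, how dense the unlabeled region around it is).
    explore_weight: 0 = pure exploitation (most uncertain document),
                    1 = pure exploration (assumed parameter).
    """
    scores = [
        explore_weight * e + (1 - explore_weight) * normalized_entropy(p)
        for p, e in zip(class_probs, exploration)
    ]
    return max(range(len(scores)), key=scores.__getitem__)


def apply_rules(text, rules):
    """Collect weak labels from keyword rules given as (keyword, label) pairs."""
    lowered = text.lower()
    return [label for keyword, label in rules if keyword in lowered]
```

With `explore_weight=0` the sketch proposes the document whose prediction is closest to uniform, and with `explore_weight=1` it follows the exploration score alone. The list returned by `apply_rules` can be empty or contain conflicting labels, which is why such rule outputs are treated as weak signals and not as ground truth.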
Workflow
Guided Labeling for Document Classification
External resources
- KNIME Blog - Labeling with Active Learning (Uncertainty Sampling)
- KNIME WebPortal - knime.com
- Data License: Database Contents License (DbCL) v1.0
- Data Source: Kaggle - IMDB Review Dataset
- Active learning for object classification - Nicolas Cebron et al - Data Min Knowl Disc (2009)
- Burr Settles, Active Learning Literature Survey, 2010 - Chapter 3.1 Uncertainty Sampling
Used extensions & nodes
Created with KNIME Analytics Platform version 4.5.1
Legal
By using or downloading the workflow, you agree to our terms and conditions.