Learns a generative label model from the provided label source columns. This node is a key component for realizing weak supervision approaches as popularized by Snorkel. The idea behind weak supervision is that it is often possible to create a number of simple, inaccurate models (e.g. simple rules or existing models for slightly different tasks) that can label unlabeled data, and that the agreements and disagreements among these models can be analyzed to infer information about the true label. Our implementation is a TensorFlow-based adaptation of the matrix completion approach proposed in a paper by the Snorkel team. We refer to that publication for details on the strategy.
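To make the idea of label sources concrete, the following is a minimal Python sketch, not the node's implementation. The three rules, the example texts, and the helper names `label_matrix` and `agreement_rates` are all hypothetical. It shows how simple rules can vote or abstain (`None`) on each row, and how often pairs of sources agree on rows where both voted. The matrix completion approach itself is not reproduced here.

```python
from itertools import combinations

# Hypothetical label sources: each returns "spam", "ham", or None (abstain).
def has_link(text):
    return "spam" if "http" in text else None

def is_short(text):
    return "ham" if len(text.split()) <= 3 else None

def mentions_prize(text):
    return "spam" if "prize" in text.lower() else "ham"

SOURCES = [has_link, is_short, mentions_prize]

def label_matrix(texts):
    """Return one row per text and one column per label source."""
    return [[source(t) for source in SOURCES] for t in texts]

def agreement_rates(matrix):
    """Fraction of rows on which two sources agree, counting only rows
    where both sources voted (neither abstained)."""
    rates = {}
    n_sources = len(matrix[0])
    for i, j in combinations(range(n_sources), 2):
        both = [(row[i], row[j]) for row in matrix
                if row[i] is not None and row[j] is not None]
        if both:  # skip pairs that never voted on the same row
            rates[(i, j)] = sum(a == b for a, b in both) / len(both)
    return rates

texts = [
    "Win a prize now http://example.com",
    "See you soon",
    "Claim your prize today",
    "Meeting notes at http://example.com/notes",
]
matrix = label_matrix(texts)
rates = agreement_rates(matrix)
```

In this toy data the first and third sources agree on half of the rows they both label. The first and second sources never vote on the same row, so no rate is reported for that pair.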
- Type: Data. Table containing label sources. A label source is either a nominal column or a probability distribution column. Missing values in a label source are interpreted as abstentions, i.e. a missing value is assumed to indicate that the label source decided not to label the corresponding row. For nominal columns, label sources without an assigned set of possible values are ignored during the computation, and a corresponding warning is displayed on the node.
- Type: Weak Label Model. A weak label model that can be applied to data with the Weak Label Model Predictor.
- Type: Data. Each row in this table gives the conditional probabilities that the label source shown in the Label Source column takes on a specific value, given the true label shown in the Latent Label column.
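The shape of such a conditional probability table can be illustrated with a small Python sketch. It makes strong assumptions: the true labels are given by hand here, whereas the node has to infer them; the `P(...)` column names, the source names, and the `conditional_table` helper are hypothetical; and abstains are simply skipped, which may differ from how the node treats them.

```python
from collections import Counter, defaultdict

def conditional_table(votes, true_labels, classes):
    """Estimate P(source value | latent label) from counts.

    votes: dict mapping a source name to its list of votes (None = abstain).
    Returns one row (dict) per (label source, latent label) pair.
    """
    rows = []
    for source, column in votes.items():
        counts = defaultdict(Counter)
        for vote, truth in zip(column, true_labels):
            if vote is not None:  # abstains are skipped in this sketch
                counts[truth][vote] += 1
        for latent in classes:
            total = sum(counts[latent].values())
            if total == 0:
                continue  # source never voted on rows with this latent label
            row = {"Label Source": source, "Latent Label": latent}
            for value in classes:
                row[f"P({value})"] = counts[latent][value] / total
            rows.append(row)
    return rows

# Hypothetical toy data: two label sources, four rows.
votes = {
    "rule_a": ["spam", "spam", None, "ham"],
    "rule_b": ["spam", "ham", "ham", "ham"],
}
true_labels = ["spam", "spam", "ham", "ham"]
table = conditional_table(votes, true_labels, ["spam", "ham"])
```

Each resulting row sums to one across the value columns. For example, `rule_b` votes "spam" on only half of the rows whose latent label is "spam", so its row for that latent label splits the probability evenly.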
Related workflows & nodes
- This workflow shows how to use the Weak Label Model Learner and Predictor nodes to aggregate sources of weak supervisio…
- This workflow defines a fully automated web based application that will label your data using weak supervision. The wor…
- This workflow defines a fully automated web based application that will label your data using active learning and weak …