This workflow performs semi-automated data blending of two datasets. Each dataset contains columns unique to it as well as overlapping columns that appear in both. The first part of the workflow is the machine learning algorithm, which matches corresponding rows: it uses numeric and string distance metrics to calculate the distance between rows of tables 1 and 2, based on the selected columns. Using this distance, the domain expert can decide whether to trust a predicted match or to inspect the results and correct them if needed. This review step is handled in the second part of the workflow.
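The matching idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the workflow's actual KNIME implementation: the function names, the averaging of per-column distances, the `scales` normalisation, and the `review_threshold` value are all hypothetical, and `difflib.SequenceMatcher` stands in for whichever string distance the workflow uses.

```python
from difflib import SequenceMatcher


def string_distance(a, b):
    """Distance in [0, 1]: 0 for identical strings (case-insensitive)."""
    return 1.0 - SequenceMatcher(None, a.lower(), b.lower()).ratio()


def numeric_distance(x, y, scale):
    """Absolute difference divided by an assumed column scale, capped at 1."""
    if scale == 0:
        return 0.0 if x == y else 1.0
    return min(abs(x - y) / scale, 1.0)


def row_distance(row1, row2, string_cols, numeric_cols, scales):
    """Mean of the per-column distances over the selected overlapping columns."""
    parts = [string_distance(row1[c], row2[c]) for c in string_cols]
    parts += [numeric_distance(row1[c], row2[c], scales[c]) for c in numeric_cols]
    return sum(parts) / len(parts)


def match_rows(table1, table2, string_cols, numeric_cols, scales,
               review_threshold=0.2):
    """For each row of table1, pick the closest row of table2.

    Matches whose distance exceeds review_threshold (an assumed cut-off)
    are flagged so that a domain expert can inspect and correct them.
    """
    matches = []
    for i, r1 in enumerate(table1):
        dists = [row_distance(r1, r2, string_cols, numeric_cols, scales)
                 for r2 in table2]
        j = min(range(len(table2)), key=dists.__getitem__)
        matches.append({
            "left": i,
            "right": j,
            "distance": dists[j],
            "needs_review": dists[j] > review_threshold,
        })
    return matches
```

Called with two lists of dictionaries, the overlapping column names, and a scale per numeric column, `match_rows` returns one candidate match per row of the first table. The `needs_review` flag plays the role of the expert's trust decision: low-distance matches can be accepted as they are, while the rest are handed to the interactive review step.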
The workflow generalizes well: the algorithmic part serves only as an example and can be replaced with any other ML algorithm, and the interactive views can easily be adapted to different use cases.
Workflow
01_Semi_Automated_ML
Used extensions & nodes
Created with KNIME Analytics Platform version 3.7.2. Note: Not all extensions may be displayed.