01_Semi_Automated_ML

Workflow


This workflow performs semi-automated data blending of two datasets. Each dataset contains columns unique to it as well as overlapping columns that appear in both. The first part of the workflow is the machine learning algorithm, which matches corresponding rows: it uses numeric and string distance metrics to calculate the distance between rows of tables 1 and 2 over the selected columns. Based on this distance, the domain expert can decide whether to trust the prediction or to inspect the results and correct them if needed; this is handled in the second part of the workflow. The workflow generalizes well: the algorithmic part serves only as an example and can be replaced with any ML algorithm, and the interactive views can be adapted easily to different use cases.
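
The distance-based row matching described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the workflow's actual implementation: the function names, the equal weighting of columns, the choice of difflib's similarity ratio as the string metric, the sample rows, and the review threshold of 0.2 are all hypothetical.

```python
from difflib import SequenceMatcher


def string_distance(a, b):
    """Return 1 - similarity ratio, so 0.0 means identical (case-insensitive)."""
    return 1.0 - SequenceMatcher(None, a.lower(), b.lower()).ratio()


def numeric_distance(x, y, scale):
    """Absolute difference scaled into [0, 1] by the column's value range."""
    if scale == 0:
        return 0.0
    return min(abs(x - y) / scale, 1.0)


def row_distance(row1, row2, string_cols, numeric_cols, scales):
    """Average the per-column distances (equal weights are an assumption)."""
    parts = [string_distance(row1[c], row2[c]) for c in string_cols]
    parts += [numeric_distance(row1[c], row2[c], scales[c]) for c in numeric_cols]
    return sum(parts) / len(parts)


def match_rows(table1, table2, string_cols, numeric_cols, review_threshold=0.2):
    """For each row of table1, find the closest row of table2.

    Matches whose distance exceeds review_threshold (a hypothetical cutoff)
    are flagged so that a domain expert can inspect and correct them.
    """
    # Scale each numeric column by its range over both tables.
    scales = {}
    for c in numeric_cols:
        values = [r[c] for r in table1 + table2]
        scales[c] = max(values) - min(values)

    matches = []
    for i, r1 in enumerate(table1):
        dists = [
            row_distance(r1, r2, string_cols, numeric_cols, scales)
            for r2 in table2
        ]
        j = min(range(len(table2)), key=dists.__getitem__)
        matches.append(
            {
                "left": i,
                "right": j,
                "distance": dists[j],
                "needs_review": dists[j] > review_threshold,
            }
        )
    return matches


# Made-up sample data: "name" and "revenue" stand in for the overlapping columns.
table1 = [
    {"name": "Acme Corp", "revenue": 120.0},
    {"name": "Globex", "revenue": 75.0},
]
table2 = [
    {"name": "Globex Inc.", "revenue": 74.0},
    {"name": "ACME Corporation", "revenue": 121.0},
]

matches = match_rows(table1, table2, ["name"], ["revenue"])
for m in matches:
    print(m)
```

The needs_review flag stands in for the second part of the workflow: low-distance matches could be accepted automatically, while the rest would be handed to the interactive views for manual correction.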

External resources

Will They Blend - ML Algorithms Meet Domain Experts


Legal

By using or downloading the workflow, you agree to our terms and conditions.