WorkflowAddress Deduplication

Workflow preview
The workflow shows the power of the new distance measurement framework - a high prediction correctness of possible matches is achieved with a minimum number of nodes and without any preprocessing by just aggregating some distances on different attributes. The chosen data set is the "Restaurant data set" from comprising 864 restaurant records and 112 duplicates. Each record contains a name, an address, a city, a type and finally a class attribute. Records with an identical value in the class attribute point to the same real-word entity or restaurant in our case.