This node calculates a string similarity score between the values of one column in a reference table and one or more columns of the comparison table. It can also filter the rows of the comparison table depending on whether a sufficiently similar match was found.
The node computes the similarity between strings using configurable algorithms such as Levenshtein Edit Distance, Longest Common Subsequence, or Positional Matching, and applies a user-defined threshold to determine which rows are considered a match.
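As a rough illustration of how an edit-distance score and a threshold can interact, here is a minimal Python sketch. It is not the node's implementation: the function names, the normalization to a 0..1 range (one minus the distance divided by the longer string's length), and the default threshold of 0.8 are all assumptions made for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or exact match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Assumed normalization: 1.0 for identical strings, 0.0 for no overlap."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def is_match(a: str, b: str, threshold: float = 0.8) -> bool:
    """A pair counts as a match when its score reaches the threshold."""
    return similarity(a, b) >= threshold
```

With these assumptions, "Mueller" and "Muller" differ by one edit over seven characters, so they pass a threshold of 0.8, while "kitten" and "sitting" (three edits) do not.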
Each field value of a comparison row is matched against every value in the reference input. The best score among all of these individual comparisons decides whether the row passes the filter. In other words, the selected columns of the comparison table and the rows of the reference table are combined with a logical OR: a row matches if any of its selected values is similar enough to any reference value.
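The best-of-all-comparisons rule can be sketched in a few lines of Python. This is an illustration only: `score`, `best_match`, and `row_matches` are hypothetical names, and `difflib.SequenceMatcher` from the standard library stands in for the node's own algorithms, which it does not reproduce. The case-insensitive comparison is likewise an assumption.

```python
from difflib import SequenceMatcher


def score(a: str, b: str) -> float:
    """Stand-in similarity in the range 0..1 (not one of the node's algorithms)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(row_values, reference_values):
    """Return (best score, best reference value) over all value/reference pairs."""
    best_score, best_ref = 0.0, None
    for value in row_values:          # every selected comparison column
        for ref in reference_values:  # every row of the reference column
            current = score(value, ref)
            if current > best_score:
                best_score, best_ref = current, ref
    return best_score, best_ref


def row_matches(row_values, reference_values, threshold: float) -> bool:
    """Logical OR: one sufficiently similar pair is enough for the row to match."""
    return best_match(row_values, reference_values)[0] >= threshold


# Example: two comparison columns checked against two reference values.
print(best_match(["Hamburg", "Berlim"], ["Berlin", "Munich"]))
```

Taking the maximum over all pairs is what makes the combination behave like an OR: a single good pair decides the row, regardless of how poorly the other pairs score.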
The comparison algorithms are described in detail in the options section.
As output of this node, the user can choose the rows that match according to the filter value, the rows that do not match, or all rows.
In each case, additional columns can be generated that contain the computed results, namely the numeric match value, the best reference match, and a string showing the modifications needed to align the comparison string with the reference string. These extra fields can support downstream processing and decision-making beyond simple filtering.
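The three output choices can be pictured with a small Python sketch. Everything here is hypothetical: the mode names, the column names `match_score` and `best_reference`, and the representation of rows as dictionaries are assumptions for illustration. The modification string is left out because its format is not described here.

```python
def filter_rows(rows, results, threshold, mode="matching"):
    """Keep matching rows, non-matching rows, or all rows, and append result columns.

    rows:    list of dicts, one per comparison row
    results: list of (score, best_reference) tuples, aligned with rows
    mode:    "matching", "non-matching", or "all" (assumed names)
    """
    if mode not in {"matching", "non-matching", "all"}:
        raise ValueError(f"unknown mode: {mode}")
    output = []
    for row, (score, best_ref) in zip(rows, results):
        matched = score >= threshold
        keep = mode == "all" or (mode == "matching") == matched
        if keep:
            # Copy the row and add the computed columns without mutating the input.
            output.append({**row, "match_score": score, "best_reference": best_ref})
    return output


rows = [{"city": "Berlim"}, {"city": "xyz"}]
results = [(0.83, "Berlin"), (0.0, None)]
print(filter_rows(rows, results, threshold=0.8, mode="matching"))
```

Keeping the score and best reference on every emitted row, including non-matching ones, is what lets later steps apply their own rules instead of relying only on the filter decision.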
This node may only be used for private and non-commercial purposes. Commercial use requires a valid license from exorbyte GmbH. All rights reserved.
For more information, contact consulting@exorbyte.com.
- Type: Table. Reference Table: Table containing the canonical string values to match against.
- Type: Table. Comparison Table: Table containing the values to be compared with the reference.