The purpose of this flow is to identify duplicate invoices.
We prepare a set of invoice pairs labelled as duplicates or non-duplicates.
For each pair of invoices, we compute four features:
- The absolute difference between the amounts of the two invoices,
- The absolute difference between the dates of the two invoices,
- The Levenshtein distance between the invoice numbers,
- The distance between the two BERT vectors of the descriptions.
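Outside KNIME, the features listed above could be sketched roughly as follows. This is an illustration only, not the workflow's actual node configuration: the field names (`amount`, `date`, `number`, `embedding`) and the function `pair_features` are hypothetical, the description vectors are assumed to have been computed beforehand by a BERT model, and cosine distance is assumed because the description does not say which vector distance is used.

```python
from datetime import date
import math


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def cosine_distance(u, v) -> float:
    """1 - cosine similarity. Assumed metric; the source does not name one."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / (norm_u * norm_v)


def pair_features(inv_a: dict, inv_b: dict) -> list:
    """Build the four pairwise features for two invoices (hypothetical schema)."""
    return [
        abs(inv_a["amount"] - inv_b["amount"]),          # amount difference
        abs((inv_a["date"] - inv_b["date"]).days),       # date difference in days
        levenshtein(inv_a["number"], inv_b["number"]),   # invoice number distance
        cosine_distance(inv_a["embedding"], inv_b["embedding"]),  # description distance
    ]
```

Each invoice is passed as a dictionary, with `date` as a `datetime.date` and `embedding` as a list of floats. The resulting four-element list is one row of the training dataset, paired with its duplicate or non-duplicate label.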
Next, we train a neural network model on this dataset.
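As a rough illustration of this training step, the sketch below fits a very small neural network (one hidden layer, sigmoid output) to pairwise feature rows using plain stochastic gradient descent. It is not the KNIME learner used in the workflow: the architecture, the hyperparameters, and the names `train_mlp` and `predict` are all assumptions, and the features are assumed to be scaled to roughly the range 0 to 1 beforehand.

```python
import math
import random


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def forward(model, x):
    """Return hidden activations and the predicted duplicate probability."""
    W1, b1, W2, b2 = model
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    p = sigmoid(sum(w * hi for w, hi in zip(W2, h)) + b2)
    return h, p


def train_mlp(X, y, hidden=4, epochs=1000, lr=0.1, seed=0):
    """Train a one-hidden-layer network with SGD on cross-entropy loss."""
    rng = random.Random(seed)
    n_in = len(X[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            h, p = forward((W1, b1, W2, b2), x)
            delta_out = p - target  # gradient w.r.t. the output pre-activation
            for j in range(hidden):
                # Backpropagate through tanh before updating W2[j].
                delta_h = delta_out * W2[j] * (1.0 - h[j] ** 2)
                W2[j] -= lr * delta_out * h[j]
                for i in range(n_in):
                    W1[j][i] -= lr * delta_h * x[i]
                b1[j] -= lr * delta_h
            b2 -= lr * delta_out
    return W1, b1, W2, b2


def predict(model, x) -> float:
    """Probability that the invoice pair described by x is a duplicate."""
    return forward(model, x)[1]
```

Here `X` is a list of four-element feature rows and `y` the matching list of labels (1 for duplicate, 0 for non-duplicate). A pair would be flagged as a duplicate when `predict` exceeds a chosen threshold such as 0.5; that threshold is also an assumption.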
Workflow
CI_IA_AP-Duplicate_Invoice_Detection
Used extensions & nodes
Created with KNIME Analytics Platform version 4.7.2