Remove rows with duplicate values

mlauber71
Draft, latest edits on May 25, 2018 5:45 AM
I set up a workflow to demonstrate how this could be done:

  • Use group by to calculate how many duplicates there are (note: KNIME should introduce a generic COUNT(*) function; I had to use a variable).
  • If the count is larger than 1, the row is a duplicate.
  • Left join the counts back to the original data.
  • Sort the data by ID, and by other variables if you want to keep a specific one of the duplicates.
  • Use the LAG column to identify which line is a 2nd, 3rd, or later occurrence of a duplicate.
  • Make a rule to keep just a single line for each ID.
  • Alternative: just remove all duplicates.
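For readers who want to see the same logic outside KNIME, here is a rough sketch in Python with pandas. It is not the workflow itself, only an approximation of its steps. The column names `ID` and `value` and the function names `flag_duplicates`, `keep_one_per_id` and `drop_all_duplicates` are made up for illustration, and the sketch assumes the ID column has no missing values.

```python
import pandas as pd


def flag_duplicates(df, id_col="ID", sort_cols=None):
    """Add helper columns that mark duplicated IDs (hypothetical helper)."""
    # Group by the ID and count the rows per ID (stand-in for COUNT(*)).
    counts = df.groupby(id_col).size().reset_index(name="dup_count")
    # Left join the counts back to the original data.
    out = df.merge(counts, on=id_col, how="left")
    # A count larger than 1 means the ID is duplicated.
    out["is_duplicate"] = out["dup_count"] > 1
    # Sort by ID, then by any extra columns that decide which row comes first.
    order = [id_col] + list(sort_cols or [])
    out = out.sort_values(order, kind="stable").reset_index(drop=True)
    # Lag the ID by one row; a row whose ID equals the previous row's ID
    # is a 2nd, 3rd, or later occurrence.
    out["is_repeat"] = out[id_col] == out[id_col].shift(1)
    return out


def keep_one_per_id(df, id_col="ID", sort_cols=None):
    """Keep only the first row of each ID after sorting."""
    flagged = flag_duplicates(df, id_col, sort_cols)
    return flagged.loc[~flagged["is_repeat"], list(df.columns)].reset_index(drop=True)


def drop_all_duplicates(df, id_col="ID"):
    """Alternative: remove every row whose ID occurs more than once."""
    flagged = flag_duplicates(df, id_col)
    return flagged.loc[~flagged["is_duplicate"], list(df.columns)].reset_index(drop=True)


if __name__ == "__main__":
    # Made-up example data.
    data = pd.DataFrame({"ID": [2, 1, 2, 3, 1, 2], "value": [10, 20, 5, 7, 30, 8]})
    print(keep_one_per_id(data, "ID", ["value"]))
    print(drop_all_duplicates(data, "ID"))
```

In plain pandas, `DataFrame.drop_duplicates(subset=..., keep="first")` or `keep=False` would give the same two results more directly; the longer version above is only meant to mirror the count, join, sort, and lag steps one by one.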

External resources

  • New Duplicate Row Filter
  • forum entry

Used extensions & nodes

Created with KNIME Analytics Platform version 4.1.0
  • KNIME Core (Trusted extension)

    KNIME AG, Zurich, Switzerland

    Version 3.5.3

    knime
