20220718 Pikairos How to Optimize RDKit Parallelized Substructure Filtering

RDKit, Chunk Loop
pikairos
Draft. Latest edits on Jul 16, 2022 10:17 PM
This workflow shows how to combine the "serial" Chunk Loop nodes with the RDKit Substructure Filter node to balance the workload and reach a compromise between serial and parallel execution of substructure searching. I ran the workflow on a laptop with 6 cores and 128 GB of memory; Problem A, with 100 million rows, ran to successful completion in 4 hours. All of the parallelism comes from the -RDKit Substructure Filter- node. This node manages its own parallelism and does not need any further parallelism added around it. In fact, if parallelism is added with -Parallel Chunk Loop- nodes, the two parallelization schemes compete for the same resources, which is most probably a source of conflict. In other words, nesting one parallel solution inside another is not recommended, because the two compete for resources. For the same reason, running two parallelized branches of a workflow at the same time is not recommended either. This solution uses all the cores in the computer to achieve its parallelism. "Problem B" works in the same way and shows how to implement the solution when several queries are run against a very large number of molecules.
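The pattern described above can be sketched outside KNIME in plain Python. This is only an illustration under stated assumptions, not the workflow itself: the function names (`chunks`, `matches`, `filter_chunk`, `filter_serial_chunks`) are hypothetical, the SMILES strings are toy data, and `matches` is a plain substring test standing in for a real substructure match. The point it shows is the structure: chunks are processed one after another, and the only parallel step is the filter working inside each chunk.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def chunks(rows, size):
    """Yield successive lists of at most `size` rows (the serial chunk loop)."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def matches(smiles, query):
    # Placeholder only: a substring test, NOT a real substructure match.
    return query in smiles


def filter_chunk(chunk, query, pool):
    # The single parallel step: the pool's workers share one chunk.
    flags = pool.map(lambda s: matches(s, query), chunk)
    return [s for s, keep in zip(chunk, flags) if keep]


def filter_serial_chunks(rows, query, chunk_size, workers=None):
    # One pool sized to the machine; no second layer of parallelism around it.
    workers = workers or os.cpu_count() or 1
    hits = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for chunk in chunks(rows, chunk_size):  # chunks run one after another
            hits.extend(filter_chunk(chunk, query, pool))
    return hits


# Toy input, assumed for illustration only.
rows = ["CCO", "c1ccccc1O", "CCN", "CC(=O)O", "c1ccccc1"]
hits = filter_serial_chunks(rows, "c1ccccc1", chunk_size=2)
```

Wrapping the outer loop in a second pool would recreate the nested setup the description warns against, with both layers competing for the same cores. Note that Python threads do not give true CPU parallelism for pure-Python work; the thread pool here only marks where the parallel step sits.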

External resources

  • parallel chunks - use full server power

Used extensions & nodes

Created with KNIME Analytics Platform version 4.5.2

  • KNIME Base Chemistry Types & Nodes (trusted extension), KNIME AG, Zurich, Switzerland, version 4.5.0, knime
  • KNIME Base nodes (trusted extension), KNIME AG, Zurich, Switzerland, version 4.5.2, knime
  • KNIME Data Generation (trusted extension), KNIME AG, Zurich, Switzerland, version 4.5.0, knime
  • RDKit Nodes Feature (trusted extension), NIBR, version 4.5.0, manuelschwarze
