20220718 Pikairos How to Optimize RDKit Parallelized Substructure Filtering

RDKit Chunk Loop
Pikairos profile image
This workflow shows how to combine the "Serial" Chunk Loop nodes with RDKit substructure filtering to balance the workload and strike a compromise between serial and parallel execution of substructure searching. On a laptop with 6 cores and 128 GB of memory, "Problem A)" with 100 million rows ran to successful completion in 4 hours.

All of the parallelism comes from the -RDKit Substructure Filter- node. The node manages its own parallelism and needs no additional parallelism around it. In fact, if it is wrapped in -Parallel Chunk Loop- nodes, the two parallelization schemes compete for the same resources, which is most probably a source of conflict. In other words, nesting one parallel solution inside another is not recommended. For the same reason, running two parallelized branches of a workflow at the same time is not recommended either. This solution uses all of the cores in the computer.

"Problem B)" works in the same way and shows how to implement the solution when several queries are run against a huge number of molecules.
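Outside KNIME, the same layout can be sketched in a few lines of Python: chunks are processed one after another, and a single worker pool is the only source of parallelism, so no second parallel layer competes for the cores. This is a minimal illustration, not the workflow's implementation. The names `chunks`, `matches` and `serial_chunked_filter` are hypothetical, `matches` is a plain substring test standing in for a real substructure search, and keeping a row when it matches at least one query is only one possible reading of "Problem B)".

```python
import os
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def chunks(rows, size):
    """Yield successive lists of at most `size` rows (the serial chunk loop)."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def matches(smiles, query):
    # Placeholder predicate (assumption): a plain substring test on the text.
    # A real substructure search would use a cheminformatics toolkit instead.
    return query in smiles


def serial_chunked_filter(rows, queries, chunk_size, workers=None):
    """Keep rows that match at least one query, processing one chunk at a time."""
    workers = workers or os.cpu_count() or 1
    kept = []
    # One pool for the whole run: it is the only source of parallelism.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for chunk in chunks(rows, chunk_size):  # chunks are handled in series
            flags = pool.map(
                lambda s: any(matches(s, q) for q in queries), chunk
            )
            # pool.map preserves input order, so rows and flags line up.
            kept.extend(s for s, keep in zip(chunk, flags) if keep)
    return kept


if __name__ == "__main__":
    rows = ["CCO", "c1ccccc1", "CCN", "OCC"]
    print(serial_chunked_filter(rows, ["CC"], chunk_size=2))
```

The sketch only mirrors the structure of the approach. In CPython, threads do not speed up pure-Python work like this placeholder predicate, so it should not be read as a performance benchmark.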

External resources

  • parallel chunks - use full server power

Used extensions & nodes

Created with KNIME Analytics Platform version 4.5.2
  • KNIME Base Chemistry Types & Nodes (trusted extension), KNIME AG, Zurich, Switzerland, version 4.5.0, knime
  • KNIME Base nodes (trusted extension), KNIME AG, Zurich, Switzerland, version 4.5.2, knime
  • KNIME Data Generation (trusted extension), KNIME AG, Zurich, Switzerland, version 4.5.0, knime
  • RDKit Nodes Feature (trusted extension), NIBR, version 4.5.0, manuelschwarze