Simple Matched Molecular Pairs (MMP) Example

Matched Molecular Pairs, MMP, MMPA, Matched Molecular Pair Analysis, CYP-Inhibition
Draft. Latest edits on Jul 11, 2017 9:11 AM
This workflow provides a simple example of generating matched molecular pairs (MMPs) from a set of compounds and using them to predict molecules with improved properties - in this case, reduced CYP3A4 inhibition, using ChEMBL data.

The MMP Molecule Fragment node is configured to make 1 cut, using the original Hussain/Rea schema. As we have not pre-filtered the incoming molecule table, we limit by complexity to 5000 cut combinations, and also filter the fragmentations by the ratio of, and the minimum number of, unchanging atoms. We do not calculate graph distance fingerprints in this example (with only 1 cut the fingerprint is always empty), but we do calculate attachment point fingerprints in case we want to restrict the MMP by its molecular context later. We have passed through all the data columns and, for illustrative purposes here, elected to render the fragmentation so we can see what is happening. We are using the ChEMBL Parent ID as the ID (note therefore that this column appears in the output as 'ID' and not under its incoming name, even though we select it in the pass-through table).

With the fragmentation performed, MMPs are generated. We define a number of ratios (R/L) and differences (R-L; for log properties) and a few pass-through properties, including the attachment point fingerprint. NB We also decided to restrict transforms by the change in heavy atom count. The first stage of the node execution is sorting the input table by the 'Key' column, after which pair generation is parallelised.

Next, the pairs are filtered. We could apply a simplistic filter that keeps any transform with a negative value in the 'PCHEMBL_VALUE (R-L)' column (we want compounds that are less active against CYP3A4!), but that would give us transforms which sometimes improve matters but generally don't. Instead, we use a GroupBy to give the mean and standard deviation for each transform.
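
The pair-generation step described above (sort by key, then pair rows that share a key) can be sketched in a few lines of Python. This is only an illustration of the idea, not the Vernalis node's implementation: the fragment rows, the transform string format and the function name generate_pairs are all hypothetical, and the parallelisation is omitted.

```python
from itertools import groupby, permutations
from operator import itemgetter

# Hypothetical single-cut fragmentation output: (ID, key, value, pChEMBL).
# 'key' is the unchanging part of the molecule, 'value' the part that is swapped.
fragments = [
    ("CHEMBL_A", "[*:1]c1ccccc1", "[*:1]Cl", 6.5),
    ("CHEMBL_B", "[*:1]c1ccccc1", "[*:1]F", 5.9),
    ("CHEMBL_C", "[*:1]c1ccncc1", "[*:1]Cl", 7.0),
    ("CHEMBL_D", "[*:1]c1ccncc1", "[*:1]F", 6.2),
    ("CHEMBL_E", "[*:1]C1CCCCC1", "[*:1]Cl", 5.0),
]


def generate_pairs(rows):
    """Sort by key, then pair rows sharing a key but differing in value."""
    rows = sorted(rows, key=itemgetter(1))
    pairs = []
    for key, group in groupby(rows, key=itemgetter(1)):
        # permutations gives both directions (L>>R and R>>L) of each pair
        for left, right in permutations(list(group), 2):
            if left[2] == right[2]:
                continue  # identical fragments: no transform
            pairs.append({
                "transform": f"{left[2]}>>{right[2]}",
                "key": key,
                "id_l": left[0],
                "id_r": right[0],
                "delta": round(right[3] - left[3], 3),  # R-L difference
            })
    return pairs


pairs = generate_pairs(fragments)
for p in pairs:
    print(p["transform"], p["id_l"], p["id_r"], p["delta"])
```

A key with only one row (CHEMBL_E here) produces no pairs, which is why sorting and grouping by key is enough to find every matched pair for a single cut.
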
We only want transforms where there are at least 3 examples (the final column in the grouping table), and where the mean 'PCHEMBL_VALUE (R-L)' difference is at least 1 standard deviation below 0.0. Sorting the table gives the biggest changes first, and we can also look at the effect of the transform on ALOGP and PSA.

Finally, we apply the transforms with and without filtering. The Apply Transforms node pre-sorts the transforms table, and then applies each transform to the entire molecules table. Two node views show progress, either in a simple form or in a more informative format showing the currently 'active' transforms.

If we try to filter by attachment point fingerprint, we will get a warning at this point, as the group/ungroup sequence has lost the fingerprint column properties which tell the node how to generate the fingerprint. The lower part of the workflow shows the workaround for fingerprint similarity filtering: use a Joiner to attach the properties and fingerprints back to the required set of transforms. Note that in this case a low Tanimoto similarity threshold is required to get any matching transforms.

Notes
1 - The transform will be applied if any of the rows containing it pass the similarity threshold. Although the transform is the same, the environments from the molecules it was created from could (will?) be different.
2 - If there are multiple matching sites in a molecule, only those which match the environment similarity threshold will be reacted.
3 - If there are multiple matching sites, each site will be reacted in turn, and only products resulting from a single transformation are returned.
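
The transform selection rule (at least 3 examples, mean difference at least 1 standard deviation below 0.0) can be sketched as follows. This is a minimal illustration rather than the workflow's actual node configuration: the pair rows are made up, select_transforms is a hypothetical helper, and the use of the sample standard deviation is an assumption.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical (transform, 'PCHEMBL_VALUE (R-L)') rows from pair generation.
pair_rows = [
    ("[*:1]Cl>>[*:1]F", -0.6), ("[*:1]Cl>>[*:1]F", -0.8), ("[*:1]Cl>>[*:1]F", -0.7),
    ("[*:1]C>>[*:1]O", -1.5), ("[*:1]C>>[*:1]O", 0.9), ("[*:1]C>>[*:1]O", -0.3),
    ("[*:1]Br>>[*:1]I", -2.0), ("[*:1]Br>>[*:1]I", -1.8),
]


def select_transforms(rows, min_count=3):
    """Keep transforms with >= min_count examples and mean + 1 SD <= 0."""
    grouped = defaultdict(list)
    for transform, delta in rows:
        grouped[transform].append(delta)

    kept = []
    for transform, deltas in grouped.items():
        if len(deltas) < min_count:
            continue  # too few examples to trust the trend
        m = mean(deltas)
        s = stdev(deltas)  # sample standard deviation (assumption)
        if m + s <= 0.0:  # mean is at least 1 SD below zero
            kept.append((transform, m, s, len(deltas)))

    # Most negative mean first, i.e. the biggest reductions at the top
    return sorted(kept, key=lambda row: row[1])


selected = select_transforms(pair_rows)
for transform, m, s, n in selected:
    print(f"{transform}  mean={m:.2f}  sd={s:.2f}  n={n}")
```

In this made-up data the second transform has a negative mean but is rejected because its spread is too large, which is the case the simplistic "any negative value" filter would let through; the third is rejected for having only 2 examples.
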
Used extensions & nodes

Created with KNIME Analytics Platform version 4.1.0
  • ChemAxon/Infocom Marvin Extensions Feature (trusted extension), Infocom Corporation, version 4.0.0
  • KNIME Core (trusted extension), KNIME AG, Zurich, Switzerland, version 4.1.0
  • Vernalis KNIME Nodes (trusted extension), Vernalis Research Ltd, Cambridge, UK, version 1.24.4
