Fragments to MMPs

Manipulator

This node implements the Hussain and Rea algorithm for finding Matched Molecular Pairs in a dataset. The node takes an input table of fragments generated by the MMP Molecule Fragment nodes and generates an output table of matched molecular pairs (MMPs).

The node requires two SMILES input columns, representing the 'key' (unchanging atoms) and 'value', and a string column containing the ID. The node will attempt to auto-guess these column selections based on the default names for the columns output by the fragment node.
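The key/value/ID layout described above lends itself to a simple grouping step. The following is a minimal Python sketch of that idea, not the node's actual implementation: the function name `find_pairs`, the tuple layout, the example SMILES and the `left>>right` transform notation are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def find_pairs(rows):
    """Pair up fragmentations that share a key.

    rows: iterable of (mol_id, key_smiles, value_smiles) tuples.
    Returns a list of (left_id, right_id, key, transform) tuples.
    The "left>>right" transform string is an illustrative convention only.
    """
    # Collect every (ID, value) seen for each key
    by_key = defaultdict(list)
    for mol_id, key, value in rows:
        by_key[key].append((mol_id, value))

    pairs = []
    for key, members in by_key.items():
        for (id_l, val_l), (id_r, val_r) in combinations(members, 2):
            # Skip pairs from the same molecule or with no actual change
            if id_l == id_r or val_l == val_r:
                continue
            pairs.append((id_l, id_r, key, f"{val_l}>>{val_r}"))
    return pairs
```

Two rows with the same key and different values from different IDs give one pair; a row whose key is unique gives none.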

The input table can contain fragmentations from differing numbers of cuts, in which case this will be reflected in the output table.

The table will be pre-sorted by key followed by value during execution, unless the 'Incoming table is sorted by Keys and Values?' option is selected. If this option is selected and the table is not correctly sorted, then pairs may be missed (incorrect key sorting) or non-canonical in their direction (incorrect value sorting).
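To see why incorrect key sorting can lose pairs, consider a single-pass grouping that only compares adjacent rows. The sketch below uses Python's `itertools.groupby`, which has exactly this property; it illustrates the failure mode under that assumption and is not the node's code. The row layout `(id, key, value)` and the placeholder keys `K1`/`K2` are hypothetical.

```python
from itertools import combinations, groupby


def pairs_from_sorted(rows):
    """Single-pass pairing that assumes rows are already sorted by key."""
    pairs = []
    # groupby only merges *adjacent* rows with the same key
    for _key, group in groupby(rows, key=lambda r: r[1]):
        members = list(group)
        for left, right in combinations(members, 2):
            pairs.append((left[0], right[0]))
    return pairs


# Hypothetical rows: (id, key, value); the two K1 rows are not adjacent
rows = [("m1", "K1", "A"), ("m3", "K2", "C"), ("m2", "K1", "B")]

# Without sorting, the K1 rows fall into separate groups and the pair is missed
unsorted_pairs = pairs_from_sorted(rows)

# Sorting by key, then value, brings them together
sorted_rows = sorted(rows, key=lambda r: (r[1], r[2]))
sorted_pairs = pairs_from_sorted(sorted_rows)
```

Here `unsorted_pairs` is empty, while `sorted_pairs` contains the single pair `("m1", "m2")`.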

Incoming columns can be passed through unchanged (Left, Right or both). Numeric columns (Integer, Long, Double and Complex Number) can have differences (L - R or R - L) and ratios (L / R or R / L; Double only) calculated.
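As a small worked illustration of the difference and ratio columns, the sketch below computes all four quantities for one pair of property values. The function name and the dictionary keys are hypothetical, and returning NaN for a zero denominator is an assumption; the source does not say how the node handles that case.

```python
import math


def pair_properties(left, right):
    """Differences and ratios for one numeric property of a matched pair."""
    return {
        "L - R": left - right,
        "R - L": right - left,
        # Assumption: a zero denominator yields NaN rather than an error
        "L / R": left / right if right != 0 else math.nan,
        "R / L": right / left if left != 0 else math.nan,
    }
```

For example, `pair_properties(6.0, 3.0)` gives differences of 3.0 and -3.0 and ratios of 2.0 and 0.5.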

Transforms can be filtered based on the Value Attachment point graph distance calculated during fragmentation, using one of the following options:

  • None - No filtering
  • Max total graph distance change - the sum of all graph distance changes
  • Max single graph distance change - the maximum tolerated change in any single distance
  • Tanimoto - the vector Tanimoto similarity
  • Dice - the vector Dice similarity
  • Cosine - the vector Cosine similarity
  • Euclidean - the vector Euclidean distance
  • Hamming - the vector Hamming (Manhattan or City-block) distance
  • Soergel - the vector Soergel distance

Filtering can also be performed based on the change in heavy atom count during the transformation.
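For orientation, the vector measures named in the graph distance options can be sketched with their commonly used continuous-variable definitions. This is an assumption: the source does not state the exact formulas the node uses, and the function name and the example vectors are hypothetical. The sketch also assumes both vectors are non-zero and of equal length.

```python
import math


def vector_measures(a, b):
    """Common similarity/distance measures between two numeric vectors.

    a, b: equal-length sequences, e.g. attachment point graph distances
    for the left and right side of a pair (illustrative interpretation).
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")

    dot = sum(x * y for x, y in zip(a, b))
    aa = sum(x * x for x in a)
    bb = sum(y * y for y in b)
    manhattan = sum(abs(x - y) for x, y in zip(a, b))
    max_sum = sum(max(x, y) for x, y in zip(a, b))

    return {
        "tanimoto": dot / (aa + bb - dot),
        "dice": 2 * dot / (aa + bb),
        "cosine": dot / math.sqrt(aa * bb),
        "euclidean": math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b))),
        # Hamming in its Manhattan / city-block form, as the option list describes it
        "hamming": manhattan,
        "soergel": manhattan / max_sum,
    }
```

The three similarity measures equal 1 for identical vectors, and the three distance measures equal 0, so a filter threshold is a minimum for the former and a maximum for the latter.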

This node was developed by Vernalis Research. For feedback and more information, please contact knime@vernalis.com

1. J. Hussain and C. Rea, "Computationally efficient algorithm to identify matched molecular pairs (MMPs) in large datasets", J. Chem. Inf. Model., 2010, 50, 339-348 (DOI: 10.1021/ci900450m).

Input Ports

  1. Type: Data
    Fragmented molecule key-value pairs

Output Ports

  1. Type: Data
    Matched pair transformations

Extension

This node is part of the extension

Vernalis KNIME Nodes

v1.20.3
