This node pre-filters molecules by viability in the specified MMP schema. The user can specify the number of cuts to be made (1-10), and whether hydrogens should be added (for 1 cut only).
A variety of fragmentation options are included:
- "All acyclic single bonds" - Any acyclic single bond between any two atoms will be broken. This is the most exhaustive approach, but can generate a large number of pairs (rSMARTS: [*:1]!@!=!#[*:2]>>[*:1]-[*].[*:2]-[*]).
- "Only acyclic single bonds to rings" - Single acyclic bonds between any atoms will be broken, as long as at least one atom is in a ring (rSMARTS: [*;R:1]!@!=!#[*:2]>>[*:1]-[*].[*:2]-[*]).
- "Only acyclic single bonds to either rings or to double bonds exocyclic to rings" - Single acyclic bonds between any atoms will be broken, as long as one atom is either in a ring, or in a double bond exocyclic to a ring, with the other end of the double bond in the ring (rSMARTS: [*:1]!@!=!#[*;!R0,$(*=!@[*!R0]):2]>>[*:1]-[*].[*:2]-[*]).
- "Only single bonds to a heteroatom" - Single acyclic bonds between any two atoms, at least one of which is not carbon, will be broken. Included to mirror the C-X bond breaking chemistry prevalent in modern drug discovery (e.g. SNAr, reductive aminations, amide formations, etc.; see Ref. 2) (rSMARTS: [!#6:1]!@!=!#[*:2]>>[*:1]-[*].[*:2]-[*]).
- "Non-functional group single bonds" - This reproduces the fragmentation pattern used in the original Hussain/Rea paper (see footnote 24, Ref. 1), which is also used in the RDKit Python implementation (Ref. 3) (rSMARTS: [#6+0;!$(*=,#[!#6]):1]!@!=!#[*:2]>>[*:1]-[*].[*:2]-[*]).
- "User defined" - The user needs to provide their own rSMARTS fragmentation definition, following the guidelines below.
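To get a feel for how the predefined schemas differ, the sketch below counts the bonds in a molecule that match the reactant side (the part before '>>') of three of the rSMARTS quoted above. It assumes the RDKit Python package is installed. The dictionary keys and the function name count_matching_bonds are hypothetical names chosen for illustration. This is not the node's own implementation, and how the node decides viability for a given number of cuts is not described here.

```python
from rdkit import Chem

# Reactant sides (left of '>>') of three schemas quoted in the list above.
# The dictionary keys are hypothetical labels, not names used by the node.
SCHEMA_REACTANTS = {
    "all_acyclic": "[*:1]!@!=!#[*:2]",
    "to_rings": "[*;R:1]!@!=!#[*:2]",
    "to_heteroatom": "[!#6:1]!@!=!#[*:2]",
}


def count_matching_bonds(smiles, schema):
    """Count bonds in the molecule that match the schema's reactant pattern."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles!r}")
    pattern = Chem.MolFromSmarts(SCHEMA_REACTANTS[schema])
    # By default GetSubstructMatches returns one match per unique atom set,
    # so each matching bond is counted once.
    return len(mol.GetSubstructMatches(pattern))


if __name__ == "__main__":
    ethylbenzene = "CCc1ccccc1"
    for name in SCHEMA_REACTANTS:
        print(name, count_matching_bonds(ethylbenzene, name))
```

Hydrogens are implicit in the parsed molecule here, so bonds to hydrogen are not counted. A molecule with no matching bonds at all cannot be cut under that schema.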
Guidelines for Custom rSMARTS Definition
- '>>' is required to separate reactants and products
- Products require '[*]' to occur twice, for the attachment points (the node will handle the tagging of these)
- Reactants and products require exactly two atom mappings, e.g. :1] and :2] (other values could be used).
- The atom mappings must be two different values
- The same atom mappings must be used for reactants and products
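The guidelines above can be turned into a quick text-level check before a custom definition is pasted into the node. The following is a minimal sketch using only the Python standard library. The function name check_custom_rsmarts and its messages are hypothetical, and it does not reproduce the node's own validation or confirm that the string is chemically valid SMARTS.

```python
import re


def check_custom_rsmarts(rsmarts):
    """Return a list of guideline violations (empty if none were found)."""
    reactants, separator, products = rsmarts.partition(">>")
    if not separator:
        return ["'>>' is required to separate reactants and products"]

    problems = []
    # Products need two unmapped '[*]' attachment points.
    if products.count("[*]") != 2:
        problems.append("products must contain '[*]' exactly twice")

    # Atom mappings look like ':1]' at the end of a bracketed atom.
    reactant_maps = re.findall(r":(\d+)\]", reactants)
    product_maps = re.findall(r":(\d+)\]", products)
    if len(reactant_maps) != 2:
        problems.append("reactants need exactly two atom mappings")
    if len(product_maps) != 2:
        problems.append("products need exactly two atom mappings")
    if len(reactant_maps) == 2 and len(set(reactant_maps)) != 2:
        problems.append("the two atom mappings must be different values")
    if sorted(reactant_maps) != sorted(product_maps):
        problems.append("reactants and products must use the same mappings")
    return problems


if __name__ == "__main__":
    print(check_custom_rsmarts("[*:1]!@!=!#[*:2]>>[*:1]-[*].[*:2]-[*]"))
```

An empty list means none of the listed guidelines was violated. It does not guarantee that the node or RDKit will accept the definition.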
The algorithm is implemented using the RDKit toolkit.
This node was developed by Vernalis Research. For feedback and more information, please contact knime@vernalis.com
1. J. Hussain and C. Rea, "Computationally efficient algorithm to identify matched molecular pairs (MMPs) in large datasets", J. Chem. Inf. Model., 2010, 50, 339-348 (DOI: 10.1021/ci900450m).
2. S. D. Roughley and A. M. Jordan, "The Medicinal Chemist’s Toolbox: An Analysis of Reactions Used in the Pursuit of Drug Candidates", J. Med. Chem., 2011, 54, 3451-3479 (DOI: 10.1021/jm200187y).
3. G. Landrum, "An Overview of RDKit" (http://www.rdkit.org/docs/Overview.html#the-contrib-directory) (section entitled 'mmpa').