This node allows the user to select one or more of the pre-defined fragmentation types and loops through those selected. The current fragmentation type is exposed as a flow variable.
A variety of fragmentation options are included:
- "All acyclic single bonds" - Any acyclic single bonds between any two atoms will be broken. This is the most exhaustive approach, which can generate a large number of pairs (rSMARTS: [*:1]!@!=!#[*:2]>>[*:1]-[*].[*:2]-[*])
- "Only acyclic single bonds to rings" - Single acyclic bonds between any atoms will be broken, as long as at least one atom is in a ring (rSMARTS: [*;R:1]!@!=!#[*:2]>>[*:1]-[*].[*:2]-[*])
- "Only acyclic single bonds to either rings or to double bonds exocyclic to rings" - Single acyclic bonds between any atoms will be broken, as long as one atom is either in a ring, or is the exocyclic end of a double bond whose other end is in a ring (rSMARTS: [*:1]!@!=!#[*;!R0,$(*=!@[*!R0]):2]>>[*:1]-[*].[*:2]-[*])
- "Only single bonds to a heteroatom" - Single acyclic bonds between any two atoms, at least one of which is not carbon, will be broken. Included to mirror the C-X bond-forming chemistry prevalent in modern drug discovery (e.g. SNAr, reductive aminations, amide formations; see Ref. 2) (rSMARTS: [!#6:1]!@!=!#[*:2]>>[*:1]-[*].[*:2]-[*])
- "Non-functional group single bonds" - This reproduces the fragmentation pattern used in the original Hussain/Rea paper (see footnote 24, Ref. 1), which is also used in the RDKit Python implementation (Ref. 3) (rSMARTS: [#6+0;!$(*=,#[!#6]):1]!@!=!#[*:2]>>[*:1]-[*].[*:2]-[*])
- "Matsy (One atom in ring, or a non-sp2 C atom bonded to a non-C atom)" - This reproduces the fragmentation pattern used by NextMove's 'Matsy', i.e. single acyclic bonds between either a ring atom and any other atom, or a heteroatom and a non-sp2 C atom, as described in the Matched Series paper (Ref. 4) (rSMARTS: [$([#6!^2]-!@[!#6]),$([*;R]-!@[*]):1]-!@[$([!#6]-!@[#6!^2]),$([*]-!@[*;R]):2]>>[*:1]-[*].[*:2]-[*])
- "Peptide Sidechains" - Acyclic single bonds from Cα to Cβ will be broken. C-H will only be broken for Glycine, and only when explicit H are present (both CH bonds will be broken in this case) (rSMARTS: [C;$(CC(=O)[O,N]);$(CN):1]-!@[$([C]-!@C(C(=O)[N,O])N),$([#1]-!@[CH2](C(=O)[N,O])N):2]>>[*:1]-[*].[*:2]-[*])
- "Nucleic Acid Sidechains" - Acyclic single bonds in the anomeric position between the aromatic base N and sugar will be broken. The minimum requirement is N(Ar)CO(CO)CO to allow for open chain analogues (rSMARTS: [n:1]-!@[$(COC(CO)CO):2]>>[*:1]-[*].[*:2]-[*])
- "User defined" - The user needs to provide their own (r)SMARTS fragmentation definition, following the guidelines below
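The pre-defined options above can be pictured as a lookup from option name to rSMARTS, with the loop stepping through the selected names one at a time. The following is a minimal sketch in plain Python, not the node's actual implementation: the names FRAGMENTATION_RSMARTS, split_rsmarts and iter_fragmentation_types are hypothetical, and only the option names and rSMARTS strings are taken from the list above. "User defined" is left out because its rSMARTS is supplied by the user.

```python
# Pre-defined fragmentation types and their rSMARTS, copied from the list above.
# The dictionary and function names are hypothetical, chosen for illustration.
FRAGMENTATION_RSMARTS = {
    "All acyclic single bonds":
        "[*:1]!@!=!#[*:2]>>[*:1]-[*].[*:2]-[*]",
    "Only acyclic single bonds to rings":
        "[*;R:1]!@!=!#[*:2]>>[*:1]-[*].[*:2]-[*]",
    "Only acyclic single bonds to either rings or to double bonds exocyclic to rings":
        "[*:1]!@!=!#[*;!R0,$(*=!@[*!R0]):2]>>[*:1]-[*].[*:2]-[*]",
    "Only single bonds to a heteroatom":
        "[!#6:1]!@!=!#[*:2]>>[*:1]-[*].[*:2]-[*]",
    "Non-functional group single bonds":
        "[#6+0;!$(*=,#[!#6]):1]!@!=!#[*:2]>>[*:1]-[*].[*:2]-[*]",
    "Matsy (One atom in ring, or a non-sp2 C atom bonded to a non-C atom)":
        "[$([#6!^2]-!@[!#6]),$([*;R]-!@[*]):1]-!@"
        "[$([!#6]-!@[#6!^2]),$([*]-!@[*;R]):2]>>[*:1]-[*].[*:2]-[*]",
    "Peptide Sidechains":
        "[C;$(CC(=O)[O,N]);$(CN):1]-!@"
        "[$([C]-!@C(C(=O)[N,O])N),$([#1]-!@[CH2](C(=O)[N,O])N):2]"
        ">>[*:1]-[*].[*:2]-[*]",
    "Nucleic Acid Sidechains":
        "[n:1]-!@[$(COC(CO)CO):2]>>[*:1]-[*].[*:2]-[*]",
}


def split_rsmarts(rsmarts):
    """Split an rSMARTS string into its (reactant, product) sides."""
    reactant, product = rsmarts.split(">>")
    return reactant, product


def iter_fragmentation_types(selected):
    """Yield (name, rSMARTS) for each selected type, in the order given.

    Raises KeyError if a name is not one of the pre-defined types.
    """
    for name in selected:
        yield name, FRAGMENTATION_RSMARTS[name]


if __name__ == "__main__":
    # Each pass of the loop exposes one "current" fragmentation type,
    # loosely analogous to the flow variable described above.
    selected = ["Only acyclic single bonds to rings", "Peptide Sidechains"]
    for current_type, rsmarts in iter_fragmentation_types(selected):
        reactant, product = split_rsmarts(rsmarts)
        print(f"{current_type}: match {reactant}, produce {product}")
```

Every pre-defined rSMARTS shares the same product side, [*:1]-[*].[*:2]-[*]: the two mapped atoms are kept and each gains an attachment-point atom where the bond was broken. Only the reactant side, which selects the bond to break, differs between options.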
This node was developed by Vernalis Research. For feedback and more information, please contact knime@vernalis.com
1. J. Hussain and C. Rea, "Computationally efficient algorithm to identify matched molecular pairs (MMPs) in large datasets", J. Chem. Inf. Model., 2010, 50, 339-348 (DOI: 10.1021/ci900450m)
2. S. D. Roughley and A. M. Jordan, "The Medicinal Chemist's Toolbox: An Analysis of Reactions Used in the Pursuit of Drug Candidates", J. Med. Chem., 2011, 54, 3451-3479 (DOI: 10.1021/jm200187y)
3. G. Landrum, "An Overview of RDKit" (http://www.rdkit.org/docs/Overview.html#the-contrib-directory), section entitled 'mmpa'
4. N. M. O'Boyle, J. Bostrom, R. A. Sayle and A. Gill, "Using Matched Molecular Series as a Predictive Tool To Optimize Biological Activity", J. Med. Chem., 2014, 57, 2704-2713 (DOI: 10.1021/jm500022q)