SMARTCyp 2.4.2


This node runs an encapsulated SMARTCyp 2.4.2 jar. The results should match those produced by the web service and by the SMARTCyp 2.4.2 jar. To achieve this, a CDK adapter cell is generated and converted to a molfile String, which is passed to SMARTCyp; SMARTCyp then uses the older CDK 1.4.x for its calculations. Refactoring efforts to build a node running on CDK 1.5.12 resulted in changes to the output, so use this node when results based on version 2.4.2 are required.

About SMARTCyp:

Description taken from:

SMARTCyp is a method for predicting which sites in a molecule are most liable to metabolism by Cytochrome P450. It has been shown to be applicable to metabolism by the isoforms 1A2, 2A6, 2B6, 2C8, 2C19, 2E1, and 3A4 (CYP3A4), and specific models for the isoform 2C9 (CYP2C9) and isoform 2D6 (CYP2D6) are included from version 2.1. CYP3A4, CYP2D6, and CYP2C9 are three of the most important enzymes in drug metabolism, since they are involved in the metabolism of more than half of the drugs used today.

SMARTCyp is developed by the Department of Drug Design and Pharmacology at the University of Copenhagen and is funded by Lhasa Limited. More details can be found at:

Notes on output: If the energy is 999, then there is no matching energy rule. Such sites should not be considered as possible sites of metabolism.

Score, S = E - 8*A - 0.04*SASA

Energy = E
E is an approximate activation energy for the reaction of the catalytic site of a CYP with the molecule at this atom. It is decided by fragment matching of each atom against a lookup table with SMARTS rules and activation energies in kJ/mol.

Accessibility = A
The accessibility is a relative measure of the topological distance for an atom from the center of the molecule, and is always a number between 0.5 (atom at the center) and 1 (atom at the end).
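
One way to obtain a value with these properties, shown here purely as an illustrative assumption and not as SMARTCyp's confirmed implementation, is to divide the longest topological distance from an atom by the longest topological distance found anywhere in the molecule. The sketch below does this with a breadth-first search over a hypothetical adjacency-list representation of the molecular graph; the function names and the graph format are made up for this example.

```python
from collections import deque
from typing import Dict, List


def longest_distance_from(graph: Dict[int, List[int]], start: int) -> int:
    """Return the largest shortest-path length (in bonds) from `start`."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for neighbour in graph[atom]:
            if neighbour not in dist:
                dist[neighbour] = dist[atom] + 1
                queue.append(neighbour)
    return max(dist.values())


def accessibility(graph: Dict[int, List[int]]) -> Dict[int, float]:
    """Assumed formula: per-atom longest distance / longest distance in the molecule."""
    longest = {atom: longest_distance_from(graph, atom) for atom in graph}
    diameter = max(longest.values())
    if diameter == 0:
        # Single-atom graph: treated as fully accessible (an assumption).
        return {atom: 1.0 for atom in graph}
    return {atom: longest[atom] / diameter for atom in graph}


# Hypothetical linear chain of five heavy atoms: 0-1-2-3-4
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
values = accessibility(chain)
```

For the five-atom chain, the middle atom receives 0.5 and the two terminal atoms receive 1, matching the range described above.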

Solvent Accessible Surface Area = SASA
The SASA describes the local accessibility of an atom and is computed using the 2DSASA algorithm, which predicts this value from the molecular topology (for an exact value, 3D coordinates would be necessary).
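
As a minimal sketch of the score formula and the note on energies of 999, the following Python function computes S from E, A and SASA and returns nothing for atoms without a matching energy rule. The function name, the constant name and the example values are illustrative assumptions; they are not taken from the SMARTCyp code or from this node's output.

```python
from typing import Optional

# Energy value that marks "no matching energy rule" (see the notes on output).
NO_RULE_ENERGY = 999.0


def smartcyp_score(energy: float, accessibility: float, sasa: float) -> Optional[float]:
    """Return S = E - 8*A - 0.04*SASA, or None if no energy rule matched."""
    if energy == NO_RULE_ENERGY:
        # Such sites should not be considered as possible sites of metabolism.
        return None
    return energy - 8.0 * accessibility - 0.04 * sasa


# Hypothetical atom: E = 50 kJ/mol, A = 1.0 (terminal atom), SASA = 25
example_score = smartcyp_score(50.0, 1.0, 25.0)

# Hypothetical atom without a matching SMARTS rule
unmatched = smartcyp_score(999.0, 0.8, 20.0)
```

Returning `None` instead of a number keeps unmatched atoms from being ranked alongside atoms that have a real activation energy.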

Known limitations: Since the SMARTCyp method relies heavily on reactivity, some specific types of sites are often ranked too high or too low.

  • Sites with very low 3D accessibility are ranked too high.
  • Sites which are found as metabolites only due to entropy are ranked too low (for example tert-butyl groups, which have nine identical hydrogen atoms).
  • For really large compounds (more than 40 non-hydrogen atoms in CYP3A4), the reactive sites found by SMARTCyp are usually not the experimentally found metabolites. This is probably because the sites of metabolism for such large compounds depend heavily on the binding conformation.

Changes since version 1

The changes revolve around retaining the atom numbering of an input structure. Generating a CDK cell renumbers the structure's atoms, which prevents comparison of predicted sites of metabolism with experimental results. Passing in a MolValue or SdfValue retains the original atom numbering.

  1. The node no longer outputs a CDK cell in the second output table.
  2. An SdfValue or MolValue cell is passed directly to SMARTCyp. The standardized structure that SMARTCyp processed is output as a MolCell.
  3. Other formats are processed using the CDK adapter cell.

Additional information:

  • Citing SMARTCyp
  • Interpreting results:

Input Ports

  1. Port Type: Data
    A table containing a structure column compatible with CDK adapter cells

Output Ports

  1. Port Type: Data
    A summary of the SMARTCyp predicted sites for the Standard, CYP2C and CYP2D6 models.
  2. Port Type: Data
    This output table details each site (up to the selected cutoff). The ID column allows the results to be grouped by input molecule.
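
To illustrate how the ID column of the second output table can be used downstream, the sketch below groups per-site rows by molecule ID in plain Python. The column names (`ID`, `Atom`, `Score`) and the row values are hypothetical; the actual column names depend on the node's output.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical per-site rows, as they might look after export from the table.
site_rows = [
    {"ID": "mol1", "Atom": "C.5", "Score": 41.0},
    {"ID": "mol2", "Atom": "N.2", "Score": 55.3},
    {"ID": "mol1", "Atom": "C.9", "Score": 62.7},
]


def group_sites_by_id(rows: List[dict]) -> Dict[str, List[dict]]:
    """Collect all site rows that belong to the same input molecule."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["ID"]].append(row)
    return dict(grouped)


sites_per_molecule = group_sites_by_id(site_rows)
```

Each key of `sites_per_molecule` is one input molecule, and its value is the list of sites reported for that molecule in their original order.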