Fingerprint Generation and Comparison
This workflow snippet demonstrates the creation of chemical fingerprints from input molecules.
Many nodes from different (license-bound) extensions in KNIME can generate fingerprints. This workflow uses nodes from the RDKit and Vernalis extensions, both of which are freely available. The Cardinality node is part of the Vernalis extension.
Several methods are available for generating different kinds of fingerprints. The number of bits that can be set to 1 is defined by the method used and can differ between methods. Both fingerprint types used here (Morgan and RDKit) are hashed fingerprints. Links to more information on these methods are in the workflow description. The resulting Cardinality columns contain the number of occurrences of 1 in each fingerprint bit string. Note that these values differ between methods.
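To make the Cardinality value concrete, here is a minimal Python sketch of the same count done by hand. It is not the Vernalis node's implementation. The function name `cardinality` and the two 16-bit strings are made up for illustration; real fingerprints are much longer.

```python
def cardinality(bits: str) -> int:
    """Count the set bits (characters equal to '1') in a bit string."""
    if set(bits) - {"0", "1"}:
        raise ValueError("bit string may only contain '0' and '1'")
    return bits.count("1")


# Hypothetical 16-bit fingerprints of one molecule from two different methods.
fp_method_a = "0100100000100001"
fp_method_b = "1101001101100101"

print(cardinality(fp_method_a))  # 4
print(cardinality(fp_method_b))  # 9
```

The two counts differ even though the molecule is the same, which is the behavior to expect when comparing the Cardinality columns of two fingerprint methods.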
The higher the percentage of 1s in the bit string (i.e., how often 1 occurs relative to the bit length), the "darker" the fingerprint. Increasing the bit length (with the same method) decreases fingerprint darkness and therefore the probability of bit collisions. Bit collisions occur when the same bit is set by multiple patterns (i.e., substructures within the molecule).
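The relationship between bit length, darkness, and collisions can be illustrated with a toy model in Python. This is a simplified sketch, not the actual hashing used by the Morgan or RDKit fingerprint nodes. It assumes each substructure pattern is represented by an integer identifier and mapped to a bit position with a modulo operation. The identifiers in `PATTERN_IDS` and the helper names `fold` and `darkness` are invented for illustration.

```python
def fold(pattern_ids, n_bits):
    """Map each pattern ID to a bit position and return the set of bits that are on."""
    return {pid % n_bits for pid in pattern_ids}


def darkness(on_bits, n_bits):
    """Fraction of bits set to 1 (cardinality divided by bit length)."""
    return len(on_bits) / n_bits


# Hypothetical, distinct pattern identifiers for one molecule.
PATTERN_IDS = [3, 67, 131, 200, 264, 1000, 1064, 5, 2053]

for n_bits in (64, 2048):
    on_bits = fold(PATTERN_IDS, n_bits)
    # Patterns that landed on a bit already set by another pattern.
    collisions = len(PATTERN_IDS) - len(on_bits)
    print(n_bits, len(on_bits), collisions, darkness(on_bits, n_bits))
# 64 4 5 0.0625
# 2048 8 1 0.00390625
```

With 64 bits, the nine patterns share only four bits (five collisions). With 2048 bits, eight bits are set, one collision remains, and the fingerprint is far less dark.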