
Fingerprint Bayesian Learner

Chemistry, Mining

A variant of Naive Bayes for fingerprint columns, i.e. bit vectors. The learner implements a Naive Bayes-like algorithm that accounts for sparsely occupied bits and unbalanced class distributions. Details of the algorithm are described in:

Prediction of Biological Targets for Compounds Using Multiple-Category Bayesian Models Trained on Chemogenomics Databases, Nidhi, Meir Glick, John W. Davies, and Jeremy L. Jenkins, J. Chem. Inf. Model., 2006, 46 (3), pp 1124–1133
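
The scoring idea can be sketched in a few lines of Python. This is a minimal illustration, not the node's implementation: it assumes the Laplacian-corrected relative estimate commonly used with fingerprint Bayes models, (A + 1) / (N * P + 1), where N is the number of rows with the bit set, A the number of those rows in the target class, and P the overall fraction of target-class rows. The function names `bit_log_scores` and `score` and the representation of a fingerprint as a set of on-bit indices are hypothetical.

```python
import math


def bit_log_scores(fingerprints, labels, target):
    """Return {bit index: log score} for one target class.

    fingerprints: list of sets holding the indices of the on-bits
    labels:       class label of each row
    Assumed estimator: log((A + 1) / (N * P + 1)), see the lead-in.
    """
    n_rows = len(fingerprints)
    n_target = sum(1 for y in labels if y == target)
    base_rate = n_target / n_rows  # P: fraction of rows in the target class

    count_any = {}     # N: rows with the bit set
    count_target = {}  # A: target-class rows with the bit set
    for bits, y in zip(fingerprints, labels):
        for b in bits:
            count_any[b] = count_any.get(b, 0) + 1
            if y == target:
                count_target[b] = count_target.get(b, 0) + 1

    return {
        b: math.log((count_target.get(b, 0) + 1) / (n * base_rate + 1))
        for b, n in count_any.items()
    }


def score(bits, log_scores):
    """Sum the log scores of the on-bits; bits never seen in training add 0."""
    return sum(log_scores.get(b, 0.0) for b in bits)


# Tiny example: bit 0 occurs only in class "a", bit 3 only in class "b".
fps = [{0, 1}, {0, 2}, {1, 3}, {2, 3}]
classes = ["a", "a", "b", "b"]
scores_a = bit_log_scores(fps, classes, "a")
```

The "+ 1" terms keep rarely set bits from producing extreme scores, and dividing by the base rate compensates for unbalanced classes: a bit that is set equally often in all classes ends up with a log score of 0.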

Node details

Input ports
  1. Type: Table
    Input data with fingerprint column
    The data to learn from. It must contain a fingerprint column and a categorical class column.
Output ports
  1. Type: Table
    Leave-one-out Scores
    A table containing the scores of the training data, where each row is predicted using a model trained on the remaining n-1 rows (leave-one-out). The table is sorted by descending score and contains the following columns:
    1. The true class values (copied from the input data).
    2. The leave-one-out score (the sum of logs of the on-bits).
    3. The running error on the target class, i.e. the error on the training data if the current row and all preceding rows were predicted as the positive class (as they have a score greater than or equal to the current row's score).
    4. The running error on the negative class(es), i.e. the error if all rows below the current row were predicted as negative.
    The threshold that minimizes the sum of both error rates is used as the default cutoff in the predictor.
    Note that these scores could also be determined using a Cross-Validation meta node. They are provided here because they can be computed in a single scan of the training data, as opposed to an expensive cross-validation run.
    This table is well suited to visualization with a ROC Curve node.
  2. Type: Table
    Bit Scores
    A table representing each bit's importance for the different classes. The table has as many rows as there are bits in the fingerprint. For each bit position, the columns show how often the bit is set (i) in any of the rows and (ii) in rows of the respective target class. The value of the "logP" column is the logarithm of equation (6) in the article cited above. A value smaller than 0 indicates that the bit is uncharacteristic of the target class; a value larger than 0 indicates that the bit is characteristic of the class. A value close to 0 indicates no relationship, or only a weak one, between the bit and the class.
  3. Type: Fingerprint Bayes
    Naive Bayes Model
    The trained model; it is the input to the predictor node.
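
The running-error columns and the default cutoff of the Leave-one-out Scores table can be illustrated with a short Python sketch. It rests on one reading of the description above, which is an assumption: the target-class error is the fraction of target-class rows below the current row (they would be missed), and the negative-class error is the fraction of non-target rows at or above it (they would be wrongly predicted as positive). The function names `running_errors` and `default_cutoff` are hypothetical, and tied scores are not treated specially.

```python
def running_errors(rows, target):
    """Build (class, score, target error, negative error) tuples.

    rows: list of (true_class, leave_one_out_score) pairs in any order.
    The result is sorted by descending score, like the node's output table.
    """
    ordered = sorted(rows, key=lambda r: r[1], reverse=True)
    n_pos = sum(1 for cls, _ in ordered if cls == target)
    n_neg = len(ordered) - n_pos

    table = []
    seen_pos = seen_neg = 0
    for cls, s in ordered:
        if cls == target:
            seen_pos += 1
        else:
            seen_neg += 1
        # Target rows still below this row would be predicted negative.
        err_pos = (n_pos - seen_pos) / n_pos if n_pos else 0.0
        # Non-target rows at or above this row would be predicted positive.
        err_neg = seen_neg / n_neg if n_neg else 0.0
        table.append((cls, s, err_pos, err_neg))
    return table


def default_cutoff(table):
    """Score of the row that minimizes the sum of both error rates."""
    best = min(table, key=lambda r: r[2] + r[3])
    return best[1]


rows = [("b", 0.5), ("a", 2.0), ("b", -1.0), ("a", 0.2),
        ("a", 1.0), ("b", -3.0), ("b", -2.0)]
table = running_errors(rows, target="a")
cutoff = default_cutoff(table)
```

In this example the cutoff falls on the lowest-scoring row of class "a": accepting one non-target row above it costs less than missing a target-class row.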

