This component takes a column of biological sequences (DNA, RNA, or protein) and produces a one-hot encoded version of each sequence. In the component's configuration, you can select a matching alphabet and choose how characters that are not in the selected alphabet are handled. Depending on the configuration, the encoded result either replaces the chosen input column or is appended as a new column.
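For readers unfamiliar with one-hot encoding, the idea can be sketched in plain Python. This is only an illustration, not the component's actual implementation: the function name `one_hot_encode`, the `ACGT` alphabet, and the three policies for unknown characters (`"zeros"`, `"skip"`, `"error"`) are assumptions chosen for the example, and the options offered by the component may differ.

```python
from typing import List

# Assumed alphabet for DNA; the component lets the user pick the alphabet.
DNA_ALPHABET = "ACGT"

# Hypothetical policies for characters outside the alphabet.
UNKNOWN_POLICIES = ("zeros", "skip", "error")


def one_hot_encode(
    sequence: str,
    alphabet: str = DNA_ALPHABET,
    unknown: str = "zeros",
) -> List[List[int]]:
    """Return one row per character, with a single 1 at the character's index."""
    if unknown not in UNKNOWN_POLICIES:
        raise ValueError(f"unsupported policy: {unknown!r}")

    # Map each alphabet character to its position in the one-hot vector.
    index = {ch: i for i, ch in enumerate(alphabet)}
    encoded: List[List[int]] = []

    for ch in sequence.upper():
        row = [0] * len(alphabet)
        if ch in index:
            row[index[ch]] = 1
        elif unknown == "skip":
            continue  # drop the character entirely
        elif unknown == "error":
            raise ValueError(f"character {ch!r} is not in the alphabet")
        # unknown == "zeros": keep the all-zero row
        encoded.append(row)

    return encoded


if __name__ == "__main__":
    for row in one_hot_encode("ACGN"):
        print(row)
```

With the assumed `"zeros"` policy, the unknown character `N` in `ACGN` becomes an all-zero row, so the encoded sequence keeps the same length as the input. The `"skip"` policy shortens the output instead, which matters if downstream steps expect position-aligned sequences.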
A table with sequences to be one-hot encoded
A table with a one-hot encoded column
Used extensions & nodes
Created with KNIME Analytics Platform version 4.2.2