Example Workflow for One-Hot Encoder (Biological Sequences) Component

This example workflow demonstrates the use of the One-Hot Encoder (Biological Sequences) component, which is part of the KNIME Verified Components (https://www.knime.com/verified-components). We read FASTA files using another verified component created for this purpose, then pass the resulting table of cDNA sequences to the One-Hot Encoder component, which converts the sequences into one-hot encoded vectors. We use these vectors to train a deep learning network (a CNN) built with the KNIME Keras Integration. The data contains cDNA sequences, some of which represent RNAs that the ELAVL1A protein binds preferentially. The model is trained to predict whether or not a given sequence is a binding preference for this particular protein. The data used in this workflow come from the following publication: Xiaoyong Pan, Peter Rijnbeek, Junchi Yan, Hong-Bin Shen. Prediction of RNA-protein sequence and structure binding preferences using deep convolutional and recurrent neural networks. BMC Genomics, 2018, 19:511. Specifically: https://github.com/xypan1232/iDeepS/tree/master/datasets/clip
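
To make the encoding step concrete, here is a minimal Python sketch of one-hot encoding a nucleotide sequence. It is an illustration only and not the component's actual implementation: the function names, the A, C, G, T column order, and the choice to map unknown characters (such as N) to all-zero rows are assumptions made for this example.

```python
# Assumed alphabet and column order; the KNIME component may use a different one.
ALPHABET = "ACGT"


def one_hot_encode(sequence, alphabet=ALPHABET):
    """Return one row per base, with a single 1 in the column of that base.

    Characters outside the alphabet (e.g. N) become all-zero rows.
    This handling is an assumption of the sketch, not documented behavior.
    """
    index = {base: i for i, base in enumerate(alphabet)}
    encoded = []
    for base in sequence.upper():
        row = [0] * len(alphabet)
        if base in index:
            row[index[base]] = 1
        encoded.append(row)
    return encoded


def flatten(matrix):
    """Concatenate the rows into a single flat vector."""
    return [value for row in matrix for value in row]


example = one_hot_encode("ACGTN")
for base, row in zip("ACGTN", example):
    print(base, row)
```

Each sequence of length n becomes an n x 4 matrix (or a flat vector of length 4n after `flatten`), which is the kind of fixed-width numeric input a convolutional network can consume.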

Legal

By using or downloading the workflow, you agree to our terms and conditions.