Bio Sequence Reader

Source

Reads a sequence file into a table with two columns: seq_name and sequence.
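To make the two-column output concrete, here is a minimal sketch in plain Java that turns FASTA text into (seq_name, sequence) rows. It is an illustration only and is not the node's BioJava-based reader: the class `FastaTwoColumnSketch`, the `Row` type, and the choice to use the first whitespace-delimited token of the header line as seq_name are all assumptions made for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the node or of BioJava.
public class FastaTwoColumnSketch {

    /** One output row holding the two columns named in the description. */
    public static final class Row {
        public final String seqName;
        public final String sequence;

        Row(String seqName, String sequence) {
            this.seqName = seqName;
            this.sequence = sequence;
        }
    }

    /** Reads FASTA text and returns one row per record. */
    public static List<Row> read(Reader in) throws IOException {
        List<Row> rows = new ArrayList<>();
        BufferedReader reader = new BufferedReader(in);
        String name = null;                       // header of the current record
        StringBuilder residues = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty()) {
                continue;                         // skip blank lines
            }
            if (line.startsWith(">")) {
                if (name != null) {               // close the previous record
                    rows.add(new Row(name, residues.toString()));
                }
                // Assumption: seq_name is the first token after '>'.
                name = line.substring(1).trim().split("\\s+", 2)[0];
                residues.setLength(0);
            } else if (name != null) {
                residues.append(line);            // sequence may span several lines
            }
        }
        if (name != null) {                       // close the last record
            rows.add(new Row(name, residues.toString()));
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        String fasta = ">seq1 example\nACGT\nTTGA\n>seq2\nGGCC\n";
        for (Row row : read(new StringReader(fasta))) {
            System.out.println(row.seqName + "\t" + row.sequence);
        }
    }
}
```

The sketch keeps only the name and the residues. The node additionally stores annotations with each sequence object, which this example does not model.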

This node uses BioJava (version 1.7.1) to read the data.

It creates a sequence object for each entry, which can be accessed by special nodes.

The following file formats can be read: FASTA, GenBank, UniProt, EMBL, and INSDseq.

You also have to specify whether the sequences are RNA, DNA, or protein sequences.

Annotations are stored with the sequence objects.

Sample code to access annotation

Sample code to access sequence information

Large sequence files with many annotations might not fit into memory.
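For files that are too large to load at once, one common workaround outside the node is to process records one at a time instead of collecting them all. The following is a rough sketch of that idea in plain Java for FASTA input. It does not change how the node itself behaves, and the class `FastaStreamingSketch` and the method `forEachRecord` are hypothetical names introduced for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.function.BiConsumer;

// Hypothetical helper, not part of the node or of BioJava.
public class FastaStreamingSketch {

    /**
     * Passes each FASTA record (header, residues) to the handler as soon as
     * it is complete, so only one record is held in memory at a time.
     * Returns the number of records seen.
     */
    public static int forEachRecord(Reader in, BiConsumer<String, String> handler)
            throws IOException {
        BufferedReader reader = new BufferedReader(in);
        String name = null;
        StringBuilder residues = new StringBuilder();
        int count = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty()) {
                continue;
            }
            if (line.startsWith(">")) {
                if (name != null) {               // hand off the finished record
                    handler.accept(name, residues.toString());
                    count++;
                }
                name = line.substring(1).trim();
                residues.setLength(0);            // reuse the buffer
            } else if (name != null) {
                residues.append(line);
            }
        }
        if (name != null) {                       // hand off the last record
            handler.accept(name, residues.toString());
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        String fasta = ">a\nACGT\nAC\n>b\nGG\n";
        // Example handler: print each record's length without storing the record.
        forEachRecord(new StringReader(fasta),
                (name, seq) -> System.out.println(name + "\t" + seq.length()));
    }
}
```

Note that a single very long sequence is still buffered in full before it reaches the handler, so this bounds memory by the largest record rather than by the file size.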

Output ports

  1. sequence object Type: Data
    Sequence object representation of the input file