Load FASTA Files

Node / Source

Each FASTA file is loaded and parsed so that each row of the output table contains the data for one sequence in the file. The node attempts to parse the header line of each sequence according to the selected standard format, as listed below:

GenBank: >gi|{gi-number}|gb|{accession}|{locus}
EMBL Data Library: >gi|{gi-number}|emb|{accession}|{locus}
DDBJ, DNA Database of Japan: >gi|{gi-number}|dbj|{accession}|{locus}
NBRF PIR: >pir||{entry}
Protein Research Foundation: >prf||{name}
SWISS-PROT: >sp|{accession}|{name} or >tr|{accession}|{name}
PDB: >{PDB ID}:{chain}|PDBID|CHAIN|SEQUENCE
Patents: >pat|{country}|{number}
GenInfo Backbone Id: >bbs|{number}
General database identifier: >gnl|{database}|{identifier}
NCBI Reference Sequence: >ref|{accession}|{locus}
Local Sequence identifier: >lcl|{identifier}
Other: no properties extracted
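
The header formats listed above can be pictured as a lookup from source type to a pattern with named fields. The following is a minimal Python sketch of that idea only, not the node's actual implementation: the names HEADER_PATTERNS and parse_header, the field names, and the regular expressions are all assumptions made for illustration, and only a few of the formats are included.

```python
import re

# Hypothetical pattern table mirroring some of the header formats listed above.
# Names, field names and regexes are illustrative assumptions.
HEADER_PATTERNS = {
    "GenBank": re.compile(
        r"^>gi\|(?P<gi_number>[^|]*)\|gb\|(?P<accession>[^|]*)\|(?P<locus>.*)$"
    ),
    "SWISS-PROT": re.compile(
        r"^>(?:sp|tr)\|(?P<accession>[^|]*)\|(?P<name>.*)$"
    ),
    "PDB": re.compile(
        r"^>(?P<pdb_id>[^:|]+):(?P<chain>[^|]+)\|PDBID\|CHAIN\|SEQUENCE$"
    ),
    "General database identifier": re.compile(
        r"^>gnl\|(?P<database>[^|]*)\|(?P<identifier>.*)$"
    ),
    "Local Sequence identifier": re.compile(r"^>lcl\|(?P<identifier>.*)$"),
}


def parse_header(header, source):
    """Return the fields extracted from one header line, or {} if none."""
    pattern = HEADER_PATTERNS.get(source)
    if pattern is None:
        # Covers "Other" and any format not in the table: nothing is extracted.
        return {}
    match = pattern.match(header.strip())
    return match.groupdict() if match else {}


# Made-up example headers, for illustration only.
print(parse_header(">gi|12345|gb|XX000001|LOCUS1", "GenBank"))
print(parse_header(">1ABC:A|PDBID|CHAIN|SEQUENCE", "PDB"))
print(parse_header(">free text header", "Other"))
```

In this sketch a header that does not match the selected format yields an empty result rather than an error; how the node itself handles such headers is not stated here.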

This node was developed by Vernalis Research. For feedback and more information, please contact knime@vernalis.com.

Node details

Input ports

Type: Flow Variable
Flow variables
Optional flow variables containing file path(s)

Output ports

Type: Table
FASTA Sequences
Parsed content of the loaded files
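
To illustrate the one-row-per-sequence shape of this output, here is a minimal Python sketch that splits FASTA text into (header, sequence) pairs. It is an assumption-laden illustration rather than the node's code: the function name read_fasta_rows is hypothetical, and the actual columns produced by the node are not described here.

```python
def read_fasta_rows(text):
    """Split FASTA text into (header, sequence) rows, one per sequence."""
    rows = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                rows.append((header, "".join(chunks)))
            header, chunks = line, []
        elif header is not None:
            # Sequence data may span several lines; join them later.
            chunks.append(line)
    if header is not None:
        rows.append((header, "".join(chunks)))
    return rows


# Made-up two-sequence input, for illustration only.
example = ">lcl|seq1\nACGT\nTTAA\n\n>lcl|seq2\nGGCC\n"
for row in read_fasta_rows(example):
    print(row)
```

Each tuple corresponds to one table row; a file with two sequences gives two rows, regardless of how many lines each sequence spans.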
