Each FASTA file is loaded and parsed so that each row of the output table contains the data for one sequence in the file. The node attempts to parse the header line of each sequence according to the standard format option selected, as listed below:
- GenBank >gi|{gi-number}|gb|{accession}|{locus}
- EMBL Data Library >gi|{gi-number}|emb|{accession}|{locus}
- DDBJ, DNA Database of Japan >gi|{gi-number}|dbj|{accession}|{locus}
- NBRF PIR >pir||{entry}
- Protein Research Foundation >prf||{name}
- SWISS-PROT >sp|{accession}|{name} or >tr|{accession}|{name}
- PDB >{PDB ID}:{chain}|PDBID|CHAIN|SEQUENCE
- Patents >pat|{country}|{number}
- GenInfo Backbone Id >bbs|{number}
- General database identifier >gnl|{database}|{identifier}
- NCBI Reference Sequence >ref|{accession}|{locus}
- Local Sequence identifier >lcl|{identifier}
- Other (No properties extracted)
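The header formats listed above can be illustrated with a minimal Python sketch. It is not the node's implementation: the names `HEADER_PATTERNS`, `parse_header` and `read_fasta`, the output keys, and the choice to fall back to an empty property set when a header does not match the selected format are all assumptions made for illustration, and only a few of the formats are covered.

```python
import re

# Hypothetical pattern table covering a subset of the formats listed above.
# The keys and group names are illustrative, not the node's own column names.
HEADER_PATTERNS = {
    "GenBank": re.compile(r"^>gi\|(?P<gi>[^|]*)\|gb\|(?P<accession>[^|]*)\|(?P<locus>.*)$"),
    "EMBL": re.compile(r"^>gi\|(?P<gi>[^|]*)\|emb\|(?P<accession>[^|]*)\|(?P<locus>.*)$"),
    "DDBJ": re.compile(r"^>gi\|(?P<gi>[^|]*)\|dbj\|(?P<accession>[^|]*)\|(?P<locus>.*)$"),
    "SWISS-PROT": re.compile(r"^>(?:sp|tr)\|(?P<accession>[^|]*)\|(?P<name>.*)$"),
    "PDB": re.compile(r"^>(?P<pdb_id>[^:|]+):(?P<chain>[^|]+)\|PDBID\|CHAIN\|SEQUENCE$"),
    "Local": re.compile(r"^>lcl\|(?P<identifier>.*)$"),
}


def parse_header(header, source):
    """Return the properties extracted from one header line.

    An unknown source (including "Other") or a non-matching header
    yields an empty dict, i.e. no properties extracted (an assumption).
    """
    pattern = HEADER_PATTERNS.get(source)
    if pattern is None:
        return {}
    match = pattern.match(header.strip())
    return match.groupdict() if match else {}


def read_fasta(text, source="Other"):
    """Split FASTA text into one dict (row) per sequence."""
    rows = []
    header, chunks = None, []

    def flush():
        # Emit the sequence collected so far, if any.
        if header is not None:
            row = {"header": header, "sequence": "".join(chunks)}
            row.update(parse_header(header, source))
            rows.append(row)

    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            flush()
            header, chunks = line, []
        elif header is not None:
            chunks.append(line)
    flush()
    return rows
```

Calling `read_fasta(text, source="SWISS-PROT")` on the contents of a file would give one dictionary per sequence, each holding the raw header, the concatenated sequence, and any properties matched by the selected pattern.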
This node was developed by Vernalis Research. For feedback and more information, please contact knime@vernalis.com.