wrProteo (version 2.0.0.2)

.parseFastaHeader: Parse Fasta Header

Description

Parse fasta header (from UniProt) to extract different annotation fields

Usage

.parseFastaHeader(
  header,
  delim = "|",
  databaseSign = c("sp", "tr", "generic", "conta", "synt", "gi"),
  UniprSep = c("OS=", "OX=", "GN=", "PE=", "SV="),
  asList = FALSE,
  silent = FALSE,
  callFrom = NULL,
  debug = FALSE
)

Value

Depending on the argument asList, this function returns either a) a matrix with columns 'db', 'uniqueIdentifier', 'entryName', 'proteinName' and further columns depending on argument UniprSep,

or b) a list with the matrix from primary parsing (argument delim) and the matrix from further parsing (argument UniprSep)

Arguments

header

(character) fasta-header

delim

(character) delimiter (i.e. primary separator)

databaseSign

(character) characters at the beginning, right after the '>' (typically specifying the database of origin); they will be excluded from the sequence header

UniprSep

(character) separators for further separating entry-fields if tableOut=TRUE; for these fields a space is assumed in addition to each separator; see also UniProt-FASTA-headers

asList

(logical) if asList=TRUE, the function returns a list with two matrices, one for primary parsing and another for further parsing (using UniprSep); otherwise everything is combined in a single matrix

silent

(logical) suppress messages

callFrom

(character) allows easier tracking of messages produced

debug

(logical) supplemental messages for debugging
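
The two parsing stages described by delim and UniprSep can be illustrated with a minimal base-R sketch. This is not the wrProteo implementation: the helper name sketchParseHeader is hypothetical, it assumes a header with exactly three delim-separated parts, and it ignores databaseSign, asList and the messaging arguments. It returns a named character vector rather than a matrix.

```r
# Hypothetical helper (NOT part of wrProteo): sketch of two-stage header parsing
sketchParseHeader <- function(header, delim = "|",
                              UniprSep = c("OS=", "OX=", "GN=", "PE=", "SV=")) {
  # Drop the leading '>' if present
  hdr <- sub("^>", "", header)
  # Primary parsing: split on the literal delimiter
  # (fixed = TRUE because "|" is a regex metacharacter)
  parts <- strsplit(hdr, delim, fixed = TRUE)[[1]]
  rest <- parts[3]   # assumption: db | identifier | everything else
  # Secondary parsing: position of each separator, preceded by a space (-1 if absent)
  pos <- vapply(UniprSep,
                function(s) as.integer(regexpr(paste0(" ", s), rest, fixed = TRUE)),
                integer(1))
  found <- sort(pos[pos > 0])   # separators present, in order of appearance
  # Text before the first separator holds entry name and protein name
  lead <- if (length(found) > 0) substr(rest, 1, found[1] - 1) else rest
  entryName <- sub(" .*$", "", lead)
  proteinName <- sub("^[^ ]+ ?", "", lead)
  # Each field value runs from after its separator to just before the next one
  vals <- character(0)
  if (length(found) > 0) {
    starts <- found + nchar(names(found)) + 1
    ends <- c(found[-1] - 1, nchar(rest))
    vals <- substring(rest, starts, ends)
    names(vals) <- sub("=$", "", names(found))
  }
  c(db = parts[1], uniqueIdentifier = parts[2],
    entryName = unname(entryName), proteinName = unname(proteinName), vals)
}

res <- sketchParseHeader(
  ">sp|P00760|TRY1_BOVIN Serine protease 1 OS=Bos taurus OX=9913 GN=PRSS1 PE=1")
print(res)
```

Separators that do not occur in the header (here SV=) are simply skipped in this sketch; how the package itself represents missing fields is not shown here.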

See Also

This function is used by readFasta2; writeFasta2 for writing as fasta; for reading, readLines or read.fasta from the package seqinr

Examples

.parseFastaHeader(">sp|P00760|TRY1_BOVIN Serine protease 1 OS=Bos taurus OX=9913 GN=PRSS1 PE=1")