translate: Translate nucleic acid sequences

Description

This function translates nucleic acid sequences into the corresponding peptide sequence. It can translate in any of the three forward or three reverse sense frames. For the reverse sense, the reverse complement of the sequence is translated. Both the standard (universal) genetic code and non-standard codes are supported.
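To make the notions of frame and sense concrete, here is a minimal base-R sketch of how a sequence can be split into codons for a given frame and sense. The helper names `revcomp` and `codons_in_frame` are hypothetical and are not part of seqinr; the sketch also assumes that the frame offset is applied after the reverse complement is taken, which is not stated on this page.

```r
# Illustration only: hypothetical helpers, not seqinr's implementation.
# Input is a vector of lowercase single characters (a, c, g, t).
revcomp <- function(x) {
  comp <- c(a = "t", c = "g", g = "c", t = "a")
  rev(unname(comp[x]))              # complement each base, then reverse
}

codons_in_frame <- function(x, frame = 0, sens = "F") {
  if (sens == "R") x <- revcomp(x)  # assumption: offset applies after revcomp
  if (frame > 0) x <- x[-seq_len(frame)]   # skip the first `frame` bases
  n <- length(x) %/% 3              # number of complete codons
  if (n == 0) return(character(0))
  m <- matrix(x[seq_len(3 * n)], nrow = 3) # one codon per column
  apply(m, 2, paste, collapse = "")
}

dna <- strsplit("tctgagcaaataaatcgg", "")[[1]]
codons_in_frame(dna, frame = 0)             # "tct" "gag" "caa" "ata" "aat" "cgg"
codons_in_frame(dna, frame = 1)             # "ctg" "agc" "aaa" "taa" "atc"
codons_in_frame(dna, frame = 0, sens = "R") # "ccg" "att" "tat" "ttg" "ctc" "aga"
```

In this sketch, trailing bases that do not form a complete codon are dropped.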

Usage

translate(seq, frame = 0, sens = "F", numcode = 1)

Arguments

seq

an object of class seq.

frame

Frame(s) (0, 1 or 2) to translate. By default, frame 0 is used.

sens

Sense to translate: "F" for forward sense and "R" for reverse sense.

numcode

The NCBI genetic code number for translation. By default, the standard genetic code is used.

Value

translate returns a character vector containing the peptide sequence in the standard one-letter IUPAC code. Termination (STOP) codons are translated as the character '*'.
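Because the result is a character vector with one amino acid per element, it can be handled with ordinary base-R tools. The sketch below uses a hand-written vector `pep` as a stand-in for a translate result (the values are made up for illustration) and shows how to collapse it into a string and locate stop codons.

```r
# `pep` is a hypothetical stand-in for the value returned by translate().
pep <- c("M", "S", "K", "*", "I")

pep_string <- paste(pep, collapse = "")   # single string: "MSK*I"
stops <- which(pep == "*")                # positions of stop codons: 4

# Keep only the residues before the first stop, if there is one.
before_stop <- if (length(stops) > 0) pep[seq_len(stops[1] - 1)] else pep
before_stop                               # "M" "S" "K"
```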

Details

The following genetic codes are described here. The number preceding each code corresponds to numcode.

1 standard
2 vertebrate.mitochondrial
3 yeast.mitochondrial
4 protozoan.mitochondrial+mycoplasma
5 invertebrate.mitochondrial
6 ciliate+dasycladaceal
9 echinoderm+flatworm.mitochondrial
10 euplotid
11 bacterial+plantplastid
12 alternativeyeast
13 ascidian.mitochondrial
14 alternativeflatworm.mitochondrial
15 blepharism
16 chlorophycean.mitochondrial
21 trematode.mitochondrial
22 scenedesmus.mitochondrial
23 thraustochytrium.mitochondria
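As a short illustration of how numcode changes the result, the sketch below translates the same three codons with the standard code (1) and the vertebrate mitochondrial code (2). The sequence is made up for this purpose. The expected values in the comments follow from the NCBI tables, in which TGA, AGA and ATA are read as stop, Arg and Ile in code 1, and as Trp, stop and Met in code 2. The code is guarded so that it only runs when seqinr is installed.

```r
# Made-up sequence containing three codons that differ between codes 1 and 2.
if (requireNamespace("seqinr", quietly = TRUE)) {
  codons <- seqinr::s2c("tgaagaata")                # tga aga ata
  std  <- seqinr::translate(codons, numcode = 1)    # "*" "R" "I"
  mito <- seqinr::translate(codons, numcode = 2)    # "W" "*" "M"
  print(std)
  print(mito)
}
```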

References

The genetic codes have been taken from the NCBI taxonomy database: http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi?mode=c. Last update October 05, 2000. The IUPAC one-letter code for amino acids is described at: http://www.chem.qmul.ac.uk/iupac/AminoAcid/

citation("seqinr")

Examples


##
## Toy cds example invented by Leonor Palmeira:
##
toycds <- s2c("tctgagcaaataaatcgg")
translate(seq = toycds) # should be c("S", "E", "Q", "I", "N", "R")
##
## Real cds example:
##
realcds <- read.fasta(system.file("sequences/malM.fasta", package = "seqinr"))[[1]]
translate(seq = realcds)
# Biologically correct, only one stop codon at the end
translate(seq = realcds, frame = 2, sens = "R", numcode = 6)
# Biologically meaningless, note the in-frame stop codons

## Need internet connection.
## Translation of the following genbank entry:
##
## AE003734.PE35        Location/Qualifiers    (length=1833 bp)
##      CDS             join(complement(162997..163210),
##                      complement(162780..162919),complement(161238..162090),
##                      146568..146732,146806..147266)
choosebank("genbank")
query("trans", "N=AE003734.PE35")
getTrans(trans$req[[1]])
## Complex transsplicing operations, the correct frame and the correct 
## genetic code are automatically used for translation into protein.
