seqinr (version 1.0-6)

translate: Translate nucleic acid sequences

Description

This function translates a nucleic acid sequence into the corresponding peptide sequence. It can translate in any of the three forward or three reverse sense frames. For the reverse sense, the reverse complement of the sequence is translated. Translation can use the standard (universal) genetic code or a non-standard code.

Usage

translate(seq, frame = 0, sens = "F", numcode = 1)

Arguments

seq
an object of class seq.
frame
Frame(s) to translate (0, 1 or 2). By default, frame 0 is used.
sens
Sense to translate: "F" for forward sense and "R" for reverse sense. By default, the forward sense is used.
numcode
The NCBI genetic code number to use for translation. By default, the standard genetic code (numcode = 1) is used.

Value

  • translate returns a character vector containing the peptide sequence in the standard one-letter IUPAC code. Termination (STOP) codons are translated as the character '*'.
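
To make the roles of frame and sens concrete, here is a minimal base-R sketch. It is not seqinr's implementation: the helper name sketch_translate and the table code1 are hypothetical, and the sketch assumes that for the reverse sense the frame offset is applied after taking the reverse complement, as the Description suggests. The amino-acid string is the standard code (NCBI table 1) with bases ordered t, c, a, g.

```r
# Illustrative sketch only; NOT the seqinr implementation.
# Standard genetic code (NCBI table 1); codons enumerated with bases in t, c, a, g order.
bases  <- c("t", "c", "a", "g")
codons <- paste0(rep(bases, each = 16), rep(rep(bases, each = 4), 4), rep(bases, 16))
aa     <- strsplit("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG", "")[[1]]
code1  <- setNames(aa, codons)

# Hypothetical helper: `seq` is a vector of single characters (as produced by s2c).
sketch_translate <- function(seq, frame = 0, sens = "F", code = code1) {
  seq <- tolower(seq)
  if (sens == "R") {
    # Reverse complement: reverse the vector, then swap a<->t and c<->g.
    seq <- unname(c(a = "t", t = "a", c = "g", g = "c")[rev(seq)])
  }
  # Assumption: the frame offset simply drops the first `frame` bases.
  if (frame > 0) seq <- seq[-seq_len(frame)]
  n <- length(seq) %/% 3
  if (n == 0) return(character(0))
  # Group the bases into complete codons; trailing bases are ignored.
  triplets <- apply(matrix(seq[seq_len(3 * n)], nrow = 3), 2, paste, collapse = "")
  unname(code[triplets])
}

toy <- strsplit("tctgagcaaataaatcgg", "")[[1]]
sketch_translate(toy)              # "S" "E" "Q" "I" "N" "R"
sketch_translate(toy, frame = 1)   # "L" "S" "K" "*" "I"
sketch_translate(toy, sens = "R")  # "P" "I" "Y" "L" "L" "R"
```

Shifting the frame by one base exposes an in-frame stop codon ('*'), which is the usual sign that the wrong frame was chosen for a coding sequence.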

Details

The following genetic codes are described here. The number preceding each code corresponds to numcode.
  • 1: standard
  • 2: vertebrate.mitochondrial
  • 3: yeast.mitochondrial
  • 4: protozoan.mitochondrial+mycoplasma
  • 5: invertebrate.mitochondrial
  • 6: ciliate+dasycladaceal
  • 9: echinoderm+flatworm.mitochondrial
  • 10: euplotid
  • 11: bacterial+plantplastid
  • 12: alternativeyeast
  • 13: ascidian.mitochondrial
  • 14: alternativeflatworm.mitochondrial
  • 15: blepharism
  • 16: chlorophycean.mitochondrial
  • 21: trematode.mitochondrial
  • 22: scenedesmus.mitochondrial
  • 23: thraustochytrium.mitochondria
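
As an illustration of why numcode matters, the following base-R sketch lists the codons whose meaning differs between code 1 (standard) and code 2 (vertebrate mitochondrial). It does not use seqinr. The two amino-acid strings are transcribed from the NCBI genetic code tables (bases ordered t, c, a, g), and the names to_table, standard and vert_mito are hypothetical.

```r
# Illustrative sketch only; the tables are built by hand, not read from seqinr.
bases  <- c("t", "c", "a", "g")
codons <- paste0(rep(bases, each = 16), rep(rep(bases, each = 4), 4), rep(bases, 16))

# Hypothetical helper: turn a 64-letter amino-acid string into a codon lookup table.
to_table <- function(aa_string) setNames(strsplit(aa_string, "")[[1]], codons)

standard  <- to_table("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
vert_mito <- to_table("FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG")

# Codons that are read differently under the two codes.
differs <- codons[standard != vert_mito]
data.frame(codon     = differs,
           standard  = unname(standard[differs]),
           vert_mito = unname(vert_mito[differs]))
# codon "tga": "*" vs "W"; "ata": "I" vs "M"; "aga" and "agg": "R" vs "*"
```

A vertebrate mitochondrial gene translated with the standard code would therefore show spurious internal stops at tga and miss the real stops at aga and agg.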

References

The genetic codes were taken from the NCBI taxonomy database: http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi?mode=c. Last update October 05, 2000. The IUPAC one-letter code for amino acids is described at: http://www.chem.qmul.ac.uk/iupac/AminoAcid/

citation("seqinr")

See Also

For coding sequences obtained from an ACNUC server with query, it is better to use the function getTrans, so that the relevant genetic code and frame are used automatically. The genetic codes are stored in the object SEQINR.UTIL; a more human-readable form is given by the function tablecode.

Examples

##
## Toy cds example invented by Leonor Palmeira:
##
toycds <- s2c("tctgagcaaataaatcgg")
translate(seq = toycds) # should be c("S", "E", "Q", "I", "N", "R")
##
## Real cds example:
##
realcds <- read.fasta(File = system.file("sequences/malM.fasta", package = "seqinr"))[[1]]
translate(seq = realcds)
# Biologically correct, only one stop codon at the end
translate(seq = realcds, frame = 3, sens = "R", numcode = 6)
# Biologically meaningless, note the in-frame stop codons

## Requires an internet connection.
## Translation of the following genbank entry:
##
## AE003734.PE35        Location/Qualifiers    (length=1833 bp)
##      CDS             join(complement(162997..163210),
##                      complement(162780..162919),complement(161238..162090),
##                      146568..146732,146806..147266)
choosebank("genbank")
query("trans", "N=AE003734.PE35")
getTrans(trans$req[[1]])
## Complex trans-splicing operations: the correct frame and the correct
## genetic code are automatically used for translation into protein.