seqinr (version 1.0-4)

uco: Codon usage indices

Description

uco calculates some codon usage indices: the codon counts eff, the relative frequencies freq or the Relative Synonymous Codon Usage rscu.

Usage

uco(seq, frame = 0, index = c("eff", "freq", "rscu"), as.data.frame = FALSE)

Arguments

seq
a coding sequence as a vector of chars
frame
an integer (0, 1, 2) giving the frame of the coding sequence
index
codon usage index choice; partial matching is allowed. eff for codon counts, freq for codon relative frequencies, and rscu for the RSCU index
as.data.frame
logical. If TRUE, all indices are returned in a data frame.
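
The effect of the frame argument can be pictured with a small base R sketch. The helper split_codons below is hypothetical and is not part of seqinr; it only assumes that frame means "skip this many bases before reading non-overlapping triplets", which may not match how uco does it internally.

```r
# Hypothetical helper (not a seqinr function): cut a vector of single
# characters into non-overlapping codons after skipping `frame` bases.
split_codons <- function(bases, frame = 0) {
  stopifnot(frame %in% 0:2)
  # Not enough bases left to form a single codon.
  if (length(bases) < frame + 3) return(character(0))
  shifted <- bases[(frame + 1):length(bases)]
  n <- length(shifted) %/% 3                      # number of complete codons
  m <- matrix(shifted[seq_len(3 * n)], nrow = 3)  # one codon per column
  apply(m, 2, paste, collapse = "")
}

bases <- strsplit("atggcttaa", "")[[1]]
frame0 <- split_codons(bases, frame = 0)  # "atg" "gct" "taa"
frame1 <- split_codons(bases, frame = 1)  # "tgg" "ctt"
print(frame0)
print(frame1)
```

Trailing bases that do not fill a complete codon are dropped in this sketch.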

Value

  • If as.data.frame is TRUE, uco returns a data frame with five columns:
      • aa: a vector containing the name of the amino acid
      • codon: a vector containing the corresponding codon
      • eff: a numeric vector of codon counts
      • freq: a numeric vector of codon relative frequencies
      • rscu: a numeric vector of RSCU values
  • If as.data.frame is FALSE, uco returns one of these indices:
      • eff: a table of codon counts
      • freq: a table of codon relative frequencies
      • rscu: a vector of relative synonymous codon usage values

Details

Codons with ambiguous bases are ignored.

RSCU is a simple measure of non-uniform usage of synonymous codons in a coding sequence (Sharp et al. 1986). The RSCU value of a codon is the number of times it is observed, relative to the number of times it would be observed under uniform synonymous codon usage (i.e. all the codons for a given amino acid have the same probability). In the absence of any codon usage bias, all RSCU values are 1.00 (this is the case for the sequence cds in the examples below). A codon that is used less frequently than expected has an RSCU value below 1.00, and a codon that is used more frequently than expected has a value above 1.00.

Do not use correspondence analysis on RSCU tables, as this is a source of artifacts (Perriere and Thioulouse 2002). Within-amino-acid correspondence analysis is a simple way to study synonymous codon usage (Charif et al. 2005).
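
The RSCU definition above can be worked through by hand in base R. This is a minimal sketch on made-up counts for two amino acids only; the objects counts, aa, expected and rscu are illustrative names, and the code does not reproduce how uco computes the index internally.

```r
# Toy codon counts (made-up numbers, for illustration only).
counts <- c(ttt = 6, ttc = 2, aaa = 5, aag = 5)

# Amino acid encoded by each codon in the standard genetic code.
aa <- c(ttt = "Phe", ttc = "Phe", aaa = "Lys", aag = "Lys")

# Expected count under uniform synonymous usage: the mean count
# over all codons of the same amino acid.
expected <- ave(counts, aa, FUN = mean)

# RSCU = observed count / expected count.
rscu <- counts / expected
print(rscu)
```

Here ttt is used more often than expected (RSCU 1.5), ttc less often (0.5), and the two Lys codons show no bias (1.0 each).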

References

citation("seqinr")

Sharp, P.M., Tuohy, T.M.F., Mosurski, K.R. (1986) Codon usage in yeast: cluster analysis clearly differentiates highly and lowly expressed genes. Nucl. Acids. Res., 14:5125-5143.

Perriere, G., Thioulouse, J. (2002) Use and misuse of correspondence analysis in codon usage studies. Nucl. Acids. Res., 30:4548-4555.

Charif, D., Thioulouse, J., Lobry, J.R., Perriere, G. (2005) Online Synonymous Codon Usage Analyses with the ade4 and seqinR packages. Bioinformatics, 21:545-547. http://pbil.univ-lyon1.fr/members/lobry/repro/bioinfo04/.

Examples

## Show all possible codons:
words()
## Make a coding sequence from this:
(cds <- s2c(paste(words(), collapse = "")))
## Get codon counts:
uco(cds, index = "eff")
## Get codon relative frequencies:
uco(cds, index = "freq")
## Get RSCU values:
uco(cds, index = "rscu")
## Show what happens with ambiguous bases:
uco(s2c("aaannnttt"))
## Use a real coding sequence:
rcds <- read.fasta(File = system.file("sequences/malM.fasta", package = "seqinr"))[[1]]
uco(rcds, index = "freq")
uco(rcds, index = "eff")
uco(rcds, index = "rscu")
uco(rcds, as.data.frame = TRUE)
