get_cai: Calculate Codon Adaptation Index (CAI)

Description

get_cai calculates the Codon Adaptation Index (CAI) for each input coding sequence. CAI measures how similar the codon usage of a gene is to that of highly expressed genes, serving as an indicator of translational efficiency. Higher CAI values suggest better adaptation to the translational machinery.

Usage

get_cai(cf, rscu, level = "subfam")

Value

A named numeric vector of CAI values ranging from 0 to 1. Names correspond to sequence identifiers from the input matrix. Values closer to 1 indicate higher similarity to highly expressed genes.

Arguments

cf: A matrix of codon frequencies as calculated by count_codons(). Rows represent sequences and columns represent codons.
rscu: An RSCU table containing CAI weights for each codon. This table should be generated using est_rscu() based on highly expressed genes, or prepared manually with appropriate weight values.
level: Character string specifying the analysis level: "subfam" (default, analyzes codon subfamilies) or "amino_acid" (analyzes at amino acid level).

References

Sharp PM, Li WH. 1987. The codon Adaptation Index--a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res 15:1281-1295.

Examples

Run this code

# Calculate CAI for yeast genes based on RSCU of highly expressed genes
heg <- head(yeast_exp[order(-yeast_exp$fpkm), ], n = 500)
cf_all <- count_codons(yeast_cds)
cf_heg <- cf_all[heg$gene_id, ]
rscu_heg <- est_rscu(cf_heg)
cai <- get_cai(cf_all, rscu_heg)
head(cai)
hist(cai, main = "Distribution of CAI values")

Run the code above in your browser using DataLab