AnaCoDa v0.1.1

Analysis of Codon Data under Stationarity using a Bayesian Framework

AnaCoDa is a collection of models for analyzing genome-scale codon data using a Bayesian framework. It provides visualization routines and checkpointing for model fitting. The currently published models, which analyze gene data for selection on codon usage based on Ribosome Overhead Cost (ROC), are ROC (Gilchrist et al. (2015) <doi:10.1093/gbe/evv087>) and ROC with phi (Wallace & Drummond (2013) <doi:10.1093/molbev/mst051>). In addition, 'AnaCoDa' contains three currently unpublished models. The FONSE (First order approximation On NonSense Error) model analyzes gene data for selection on codon usage against nonsense error rates. The PA (PAusing time) and PANSE (PAusing time + NonSense Error) models use ribosome footprinting data to estimate ribosome pausing times without and with nonsense error rates, respectively.

Readme

AnaCoDa

  • AnaCoDa is a collection of codon models.
  • the release version can be obtained from ...

# Examples: Running models

## Example 1: Using codon data in the form of CDS in fasta format with one mixture (ROC)

The following example illustrates how to estimate parameters under the ROC model for a given set of protein-coding genes, assuming the same mutation and selection regime for all genes.

```{r, echo = FALSE}
# Load the coding sequences (CDS) from a FASTA file
genome <- initializeGenomeObject(file = "genome.fasta")
# One mixture: every gene is assigned to mixture 1
parameter <- initializeParameterObject(genome = genome, sphi = 1, num.mixtures = 1, gene.assignment = rep(1, length(genome)))
mcmc <- initializeMCMCObject(samples = 5000, thinning = 10, adaptive.width=50)
model <- initializeModelObject(parameter = parameter, model = "ROC")
# Fit the model
runMCMC(mcmc = mcmc, genome = genome, model = model)
```
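The package description mentions checkpointing, and the function list includes `writeParameterObject`, `writeMCMCObject`, `loadParameterObject`, and `loadMCMCObject`. The sketch below shows how a finished fit might be saved and reloaded with them. It assumes each writer takes the object followed by a file path and each loader takes a file path. The helpers `fitFiles`, `saveFit`, and `loadFit` and the `.Rda` file names are hypothetical and not part of AnaCoDa.

```r
# Hypothetical helpers (not part of AnaCoDa) wrapping the package's
# write/load functions. Assumed argument order: object first, then file path.

# Build the two file names used for one fit
fitFiles <- function(prefix = "roc_run") {
  c(parameter = paste0(prefix, "_parameter.Rda"),
    mcmc = paste0(prefix, "_mcmc.Rda"))
}

# Write the parameter and MCMC objects to disk
saveFit <- function(parameter, mcmc, prefix = "roc_run") {
  files <- fitFiles(prefix)
  AnaCoDa::writeParameterObject(parameter, files[["parameter"]])
  AnaCoDa::writeMCMCObject(mcmc, files[["mcmc"]])
  invisible(files)
}

# Read both objects back into a list
loadFit <- function(prefix = "roc_run") {
  files <- fitFiles(prefix)
  list(parameter = AnaCoDa::loadParameterObject(files[["parameter"]]),
       mcmc = AnaCoDa::loadMCMCObject(files[["mcmc"]]))
}

# Usage, after runMCMC() has finished:
# saveFit(parameter, mcmc)
# fit <- loadFit()
```

Keeping the file names in one helper means the save and load steps cannot drift apart when the prefix changes.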


## Example 2: Using codon data in the form of CDS in fasta format with one mixture (FONSE)
The following example illustrates how to estimate parameters under the FONSE model for a given set of protein-coding genes, assuming the same mutation and selection regime for all genes.

```{r, echo = FALSE}
# Load the coding sequences (CDS) from a FASTA file
genome <- initializeGenomeObject(file = "genome.fasta")
# One mixture: every gene is assigned to mixture 1
parameter <- initializeParameterObject(genome = genome, sphi = 1, num.mixtures = 1, gene.assignment = rep(1, length(genome)))
mcmc <- initializeMCMCObject(samples = 5000, thinning = 10, adaptive.width=50)
model <- initializeModelObject(parameter = parameter, model = "FONSE")
# Fit the model
runMCMC(mcmc = mcmc, genome = genome, model = model)
```
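
All examples in this README share the same MCMC settings. As a rough guide to run length, the arithmetic below assumes that `samples` is the number of retained draws and that one draw is kept every `thinning` iterations. This reading of the arguments is an assumption, not something the README states.

```r
# Settings used in the examples
samples <- 5000
thinning <- 10

# Assumption: one sample is retained every `thinning` iterations,
# so the chain runs for samples * thinning iterations in total.
total.iterations <- samples * thinning
total.iterations
```

Under that assumption, doubling `thinning` doubles the run time while the number of stored samples stays the same.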

## Example 3: Using codon data in the form of Ribosome footprints with one mixture (PA)

The following example illustrates how to estimate parameters under the PA model for a given set of protein-coding genes, assuming the same mutation and selection regime for all genes.

```{r, echo = FALSE}
genome <- initializeGenomeObject(file = "rfpcounts.tsv", fasta = FALSE)
parameter <- initializeParameterObject(genome = genome, sphi = 1, num.mixtures = 1, gene.assignment = rep(1, length(genome)))
mcmc <- initializeMCMCObject(samples = 5000, thinning = 10, adaptive.width=50)
model <- initializeModelObject(parameter = parameter, model = "PA")
runMCMC(mcmc = mcmc, genome = genome, model = model)
```


# Examples: Advanced examples
* The examples above illustrate the commonalities in how all models are called. The following examples use the default ROC model for illustration.
## Example 4
* This example uses multiple mixture distributions, with each gene initially assigned to a mixture distribution at random. The mixture assignment of each gene is then estimated. As the example below shows, only the arguments passed to the parameter object have to be adjusted to reflect a change in the assumed number of mixture distributions.

```{r, echo = FALSE}
genome <- initializeGenomeObject(file = "genome.fasta")
# Three mixtures: one sphi value per mixture, genes assigned at random to start
parameter <- initializeParameterObject(genome = genome, sphi = c(1,2,3), num.mixtures = 3, gene.assignment = sample(1:3, length(genome), replace=TRUE))
mcmc <- initializeMCMCObject(samples = 5000, thinning = 10, adaptive.width=50)
model <- initializeModelObject(parameter = parameter, model = "ROC")
runMCMC(mcmc = mcmc, genome = genome, model = model)
```
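
The initial assignment in the example above is drawn with `sample()`, so it differs between runs. The base R sketch below shows what such a vector looks like and how a seed makes it reproducible. The gene count `n.genes` is a hypothetical stand-in for `length(genome)`, and the variable names are illustrative.

```r
# Hypothetical gene count standing in for length(genome)
n.genes <- 12
num.mixtures <- 3

set.seed(1)  # fix the seed so the initial assignment is reproducible
initial.assignment <- sample(1:num.mixtures, n.genes, replace = TRUE)

# One mixture index per gene, each between 1 and num.mixtures
initial.assignment

# Number of genes that start in each mixture
table(factor(initial.assignment, levels = 1:num.mixtures))
```

Tabulating against all levels shows immediately whether a mixture happens to start empty, which can occur by chance when there are few genes.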

## Example 5

  • This example builds on the previous one, but instead of estimating the assignment of each gene to one of the three mixture distributions, we fix the mixture assignment to the initial assignment.

```{r, echo = FALSE}
genome <- initializeGenomeObject(file = "genome.fasta")
parameter <- initializeParameterObject(genome = genome, sphi = c(1,2,3), num.mixtures = 3, gene.assignment = sample(1:3, length(genome), replace=TRUE))
# est.mix = FALSE keeps each gene in its initial mixture
mcmc <- initializeMCMCObject(samples = 5000, thinning = 10, adaptive.width=50, est.mix = FALSE)
model <- initializeModelObject(parameter = parameter, model = "ROC")
runMCMC(mcmc = mcmc, genome = genome, model = model)
```
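
When the assignment is held fixed, the grouping may come from prior knowledge rather than a random draw. The base R sketch below converts a hypothetical vector of per-gene category labels into integer mixture indices of the kind used for the initial assignment. The labels and variable names are illustrative assumptions, not part of AnaCoDa.

```r
# Hypothetical per-gene labels, in the same order as the genes in the genome
gene.category <- c("nuclear", "mitochondrial", "nuclear", "plastid", "nuclear")

# factor() orders the levels alphabetically; as.integer() maps them to 1..K
category <- factor(gene.category)
fixed.assignment <- as.integer(category)
num.mixtures <- nlevels(category)

# Which mixture index corresponds to which label
data.frame(mixture = seq_len(num.mixtures), category = levels(category))
```

The resulting `fixed.assignment` and `num.mixtures` would then take the place of the random draw and the hard-coded `3` when the parameter object is created.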

Functions in AnaCoDa

| Name | Description |
| --- | --- |
| `acfMCMC` | Autocorrelation function for the likelihood or posterior trace |
| `getCodonCountsForAA` | Get Codon Counts For Each Amino Acid |
| `getCAI` | Calculate the Codon Adaptation Index |
| `loadMCMCObject` | Load MCMC Object |
| `loadParameterObject` | Load Parameter Object |
| `getCAIweights` | Calculate the CAI codon weights for a reference genome |
| `plot.Rcpp_MCMCAlgorithm` | Plot MCMC algorithm |
| `codons` | Codons |
| `plot.Rcpp_ROCModel` | Plot Model Object |
| `getExpressionEstimates` | Returns the estimated phi posterior for a gene |
| `getMixtureAssignmentEstimate` | Returns mixture assignment estimates for each gene |
| `convergence.test` | Convergence Test |
| `getNames` | Gene Names of Genome |
| `getNc` | Calculate the Effective Number of Codons |
| `initializeParameterObject` | Initialize Parameter |
| `plot.Rcpp_FONSEModel` | Plot Model Object |
| `plot.Rcpp_FONSEParameter` | Plot Parameter |
| `aminoAcids` | Amino acids |
| `setRestartSettings` | Set Restart Settings |
| `length.Rcpp_Genome` | Length of Genome |
| `plotCodonSpecificParameters` | Plot Codon Specific Parameter |
| `summary.Rcpp_Genome` | Summary of Genome |
| `codonToAA` | Translates codon to amino acid |
| `getSelectionCoefficients` | Calculate Selection coefficients |
| `AAToCodon` | Amino Acid to codon set |
| `writeParameterObject` | Write Parameter Object to a File |
| `initializeCovarianceMatrices` | Initialize Covariance Matrices |
| `initializeGenomeObject` | Genome Initialization |
| `writeMCMCObject` | Write MCMC Object |
| `getTrace` | Extracts an object of traces from a parameter object |
| `runMCMC` | Run MCMC |
| `acfCSP` | Plots ACF for codon specific parameter traces |
| `getNcAA` | Calculate the Effective Number of Codons for each Amino Acid |
| `getObservedSynthesisRateSet` | Get gene observed synthesis rates |
| `initializeMCMCObject` | Initialize MCMC |
| `plot.Rcpp_Trace` | Plot Trace Object |
| `plot.Rcpp_ROCParameter` | Plot Parameter |
| `initializeModelObject` | Model Initialization |
| `addObservedSynthesisRateSet` | Add gene observed synthesis rates |
| `getCSPEstimates` | Return Codon Specific Parameter estimates as data.frame (or write to csv) |

Details

| Field | Value |
| --- | --- |
| Type | Package |
| Date | 2018-02-12 |
| URL | https://github.com/clandere/AnaCoDa |
| NeedsCompilation | yes |
| RcppModules | Test_mod, Trace_mod, CovarianceMatrix_mod, MCMCAlgorithm_mod, Model_mod, Parameter_mod, Genome_mod, Gene_mod, SequenceSummary_mod |
| License | GPL (>= 2) |
| LinkingTo | Rcpp |
| LazyLoad | yes |
| LazyData | yes |
| RoxygenNote | 6.0.1 |
| Packaged | 2018-02-12 19:33:29 UTC; clandere |
| Repository | CRAN |
| Date/Publication | 2018-02-12 23:14:54 UTC |
