AnaCoDa v0.1.0

Analysis of Codon Data under Stationarity using a Bayesian Framework

AnaCoDa is a collection of models for analyzing genome-scale codon data using a Bayesian framework. It provides visualization routines and checkpointing for model fitting. The currently published models for analyzing gene data for selection on codon usage based on Ribosome Overhead Cost (ROC) are ROC (Gilchrist et al. (2015) <doi:10.1093/gbe/evv087>) and ROC with phi (Wallace & Drummond (2013) <doi:10.1093/molbev/mst051>). In addition, 'AnaCoDa' contains three currently unpublished models. The FONSE (First order approximation On NonSense Error) model analyzes gene data for selection on codon usage against nonsense error rates. The PA (PAusing time) and PANSE (PAusing time + NonSense Error) models use ribosome footprinting data to estimate ribosome pausing times without and with nonsense error rates, respectively.

Readme

AnaCoDa

  • AnaCoDa is a collection of codon models.
  • the release version can be obtained from ...

Examples: Running models

Example 1: Using codon data in the form of CDS in fasta format with one mixture (ROC)

The following example illustrates how to estimate parameters under the ROC model for a given set of protein-coding genes, assuming the same mutation and selection regime for all genes.

```{r, echo = FALSE}
# Read the protein-coding sequences (CDS) from a FASTA file
genome <- initializeGenomeObject(file = "genome.fasta")
# One mixture: every gene is assigned to mixture 1
# (argument spelled gene.assignment, as in Example 3)
parameter <- initializeParameterObject(genome = genome, sphi = 1, num.mixtures = 1, gene.assignment = rep(1, length(genome)))
mcmc <- initializeMCMCObject(samples = 5000, thinning = 10, adaptive.width = 50)
model <- initializeModelObject(parameter = parameter, model = "ROC")
runMCMC(mcmc = mcmc, genome = genome, model = model)
```
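
The chain length implied by the `initializeMCMCObject` call can be worked out by hand. The sketch below assumes that `thinning = 10` means one sample is stored every ten iterations, so that the total number of iterations is `samples * thinning`. That reading, the burn-in fraction, and the variable names are assumptions made for illustration and should be checked against the package documentation.

```r
# Assumption: one sample is stored every `thinning` iterations.
samples  <- 5000
thinning <- 10

# Total number of MCMC iterations under that assumption
total_iterations <- samples * thinning

# Hypothetical post-processing choice (not prescribed by the package):
# discard the first half of the stored samples as burn-in.
burn_in <- samples / 2
kept    <- samples - burn_in

cat("iterations:", total_iterations, " samples kept after burn-in:", kept, "\n")
```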


Example 2: Using codon data in the form of CDS in fasta format with one mixture (FONSE)

The following example illustrates how to estimate parameters under the FONSE model for a given set of protein-coding genes, assuming the same mutation and selection regime for all genes.

```{r, echo = FALSE}
# Same workflow as Example 1; only the model name changes
genome <- initializeGenomeObject(file = "genome.fasta")
# (argument spelled gene.assignment, as in Example 3)
parameter <- initializeParameterObject(genome = genome, sphi = 1, num.mixtures = 1, gene.assignment = rep(1, length(genome)))
mcmc <- initializeMCMCObject(samples = 5000, thinning = 10, adaptive.width = 50)
model <- initializeModelObject(parameter = parameter, model = "FONSE")
runMCMC(mcmc = mcmc, genome = genome, model = model)
```

Example 3: Using codon data in the form of Ribosome footprints with one mixture (PA)

The following example illustrates how to estimate parameters under the PA model for a given set of protein-coding genes, assuming the same mutation and selection regime for all genes.

```{r, echo = FALSE}
# Read ribosome footprint counts instead of FASTA sequences
genome <- initializeGenomeObject(file = "rfpcounts.tsv", fasta = FALSE)
parameter <- initializeParameterObject(genome = genome, sphi = 1, num.mixtures = 1, gene.assignment = rep(1, length(genome)))
mcmc <- initializeMCMCObject(samples = 5000, thinning = 10, adaptive.width = 50)
model <- initializeModelObject(parameter = parameter, model = "PA")
runMCMC(mcmc = mcmc, genome = genome, model = model)
```


Examples: Advanced examples

  • As the examples above illustrate, all models are called in the same way. The following examples use the default ROC model for illustration.

Example 4

  • This example uses multiple mixture distributions, with genes initially assigned to a mixture distribution at random. The mixture assignment of each gene will be estimated. As the example below shows, only the arguments passed to the parameter object have to be adjusted to reflect a change in the number of assumed mixture distributions.

```{r, echo = FALSE}
genome <- initializeGenomeObject(file = "genome.fasta")
# Three mixtures: one sphi value per mixture and a random initial assignment for each gene
# (argument spelled gene.assignment, as in Example 3)
parameter <- initializeParameterObject(genome = genome, sphi = c(1,2,3), num.mixtures = 3, gene.assignment = sample(1:3, length(genome), replace = TRUE))
mcmc <- initializeMCMCObject(samples = 5000, thinning = 10, adaptive.width = 50)
model <- initializeModelObject(parameter = parameter, model = "ROC")
runMCMC(mcmc = mcmc, genome = genome, model = model)
```
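
Because `sample()` draws a new random assignment on every call, the starting point of the fit changes from run to run. A minimal base-R sketch of a reproducible initial assignment follows; `n_genes` is a hypothetical stand-in for `length(genome)`, and the seed value is arbitrary.

```r
set.seed(42)          # arbitrary seed, chosen only for reproducibility
n_genes      <- 100   # hypothetical; use length(genome) in a real analysis
num_mixtures <- 3

# One mixture index per gene, drawn uniformly with replacement
gene_assignment <- sample(1:num_mixtures, n_genes, replace = TRUE)

# Every gene must have exactly one valid mixture index
stopifnot(length(gene_assignment) == n_genes,
          all(gene_assignment %in% 1:num_mixtures))

# Inspect how many genes start in each mixture
print(table(gene_assignment))
```

The resulting vector can then be passed as the gene assignment argument of `initializeParameterObject`, in place of the inline `sample()` call.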

Example 5

  • This example is based on the previous one, but instead of estimating the assignment of each gene to one of the three mixture distributions, we fix the mixture assignment to the initial assignment.

```{r, echo = FALSE}
genome <- initializeGenomeObject(file = "genome.fasta")
# (argument spelled gene.assignment, as in Example 3)
parameter <- initializeParameterObject(genome = genome, sphi = c(1,2,3), num.mixtures = 3, gene.assignment = sample(1:3, length(genome), replace = TRUE))
# est.mix = FALSE keeps each gene in its initial mixture
mcmc <- initializeMCMCObject(samples = 5000, thinning = 10, adaptive.width = 50, est.mix = FALSE)
model <- initializeModelObject(parameter = parameter, model = "ROC")
runMCMC(mcmc = mcmc, genome = genome, model = model)
```
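
Fixing the assignment is most useful when the grouping of genes is known in advance. The sketch below uses made-up gene names and category labels (none of them come from the package) to show one way of turning such labels into the integer mixture indices used in the examples.

```r
# Hypothetical annotation: one category label per gene, in genome order
gene_category <- c(geneA = "nuclear", geneB = "mitochondrial",
                   geneC = "nuclear", geneD = "plastid")

# Fix the label order explicitly so that indices 1, 2, 3 always mean the same thing
categories <- c("nuclear", "mitochondrial", "plastid")

# as.integer() drops names, so they are restored afterwards
gene_assignment <- as.integer(factor(gene_category, levels = categories))
names(gene_assignment) <- names(gene_category)

num_mixtures <- length(categories)
print(gene_assignment)
```

Setting the factor levels by hand avoids the default alphabetical ordering, which would otherwise decide which category becomes mixture 1.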

Functions in AnaCoDa

| Name | Description |
| --- | --- |
| getCSPEstimates | Return codon-specific parameter estimates as a data.frame (or write them to csv) |
| getCodonCountsForAA | Get Codon Counts For Each Amino Acid |
| aminoAcids | Amino acids |
| codonToAA | Translates codon to amino acid |
| acfMCMC | Autocorrelation function for the likelihood or posterior trace |
| addObservedSynthesisRateSet | Add gene observed synthesis rates |
| AAToCodon | Amino Acid to codon set |
| acfCSP | Plots ACF for codon specific parameter traces |
| codons | Codons |
| convergence.test | Convergence Test |
| getExpressionEstimates | Returns the estimated phi posterior for a gene |
| getMixtureAssignmentEstimate | Returns mixture assignment estimates for each gene |
| length.Rcpp_Genome | Length of Genome |
| loadMCMCObject | Load MCMC Object |
| runMCMC | Run MCMC |
| initializeModelObject | Model Initialization |
| initializeParameterObject | Initialize Parameter |
| loadParameterObject | Load Parameter Object |
| plot.Rcpp_FONSEModel | Plot Model Object |
| setRestartSettings | Set Restart Settings |
| initializeCovarianceMatrices | Initialize Covariance Matrices |
| writeParameterObject | Write Parameter Object to a File |
| initializeGenomeObject | Genome Initialization |
| initializeMCMCObject | Initialize MCMC |
| plot.Rcpp_ROCModel | Plot Model Object |
| plot.Rcpp_ROCParameter | Plot Parameter |
| summary.Rcpp_Genome | Summary of Genome |
| writeMCMCObject | Write MCMC Object |
| getTrace | Extracts an object of traces from a parameter object |
| plot.Rcpp_FONSEParameter | Plot Parameter |
| plot.Rcpp_MCMCAlgorithm | Plot MCMC algorithm |
| getNames | Gene Names of Genome |
| getObservedSynthesisRateSet | Get gene observed synthesis rates |
| plot.Rcpp_Trace | Plot Trace Object |
| plotCodonSpecificParameters | Plot Codon Specific Parameter |
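
The write and load functions listed above suggest a simple way to keep a fitted model between sessions. The following is a sketch only: the helper names `save_fit` and `load_fit` and the file names are hypothetical, the arguments are passed positionally because the exact argument names are not given in this document, and the calls should be checked against the package help pages before use.

```r
# Hypothetical helpers wrapping functions named in the table above.
# Nothing from AnaCoDa is evaluated until the helpers are actually called.
save_fit <- function(parameter, mcmc, prefix = "anacoda_fit") {
  if (!requireNamespace("AnaCoDa", quietly = TRUE)) {
    stop("The AnaCoDa package is required")
  }
  files <- c(parameter = paste0(prefix, "_parameter.Rda"),
             mcmc      = paste0(prefix, "_mcmc.Rda"))
  # Assumed call shape: object first, file name second
  AnaCoDa::writeParameterObject(parameter, files[["parameter"]])
  AnaCoDa::writeMCMCObject(mcmc, files[["mcmc"]])
  invisible(files)
}

load_fit <- function(files) {
  if (!requireNamespace("AnaCoDa", quietly = TRUE)) {
    stop("The AnaCoDa package is required")
  }
  # Assumed call shape: a single file name per loader
  list(parameter = AnaCoDa::loadParameterObject(files[["parameter"]]),
       mcmc      = AnaCoDa::loadMCMCObject(files[["mcmc"]]))
}
```

A typical use, under the same assumptions, would be `files <- save_fit(parameter, mcmc)` after `runMCMC` has finished and `fit <- load_fit(files)` in a later session.
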

Details

Type Package
Date 2018-01-10
URL https://github.com/clandere/RibModelFramework
NeedsCompilation yes
RcppModules Test_mod, Trace_mod, CovarianceMatrix_mod, MCMCAlgorithm_mod, Model_mod, Parameter_mod, Genome_mod, Gene_mod, SequenceSummary_mod
License GPL (>= 2)
LinkingTo Rcpp
LazyLoad yes
LazyData yes
RoxygenNote 6.0.1
Packaged 2018-01-10 18:27:01 UTC; clandere
Repository CRAN
Date/Publication 2018-01-11 12:01:10 UTC
