HMP (version 2.0.1)

Dirichlet.multinomial: Generation of Dirichlet-Multinomial Random Samples

Description

Random generation of Dirichlet-Multinomial samples.

Usage

Dirichlet.multinomial(Nrs, shape)

Arguments

Nrs

A vector specifying the number of reads (sequence depth) for each sample.

shape

A vector of Dirichlet parameters, one for each taxon.

Value

A data matrix of taxa counts in which the rows are samples and the columns are taxa.

Details

The Dirichlet-Multinomial distribution is given by (Mosimann, J. E. (1962); Tvedebrink, T. (2010)): $$\textbf{P}\left(\textbf{X}_{i}=\textbf{x}_{i};\left\{ \pi_j \right\},\theta\right)=\frac{N_{i}!}{x_{i1}! \cdots x_{iK}!}\,\frac{\prod_{j=1}^K \prod_{r=1}^{x_{ij}} \left\{ \pi_j \left( 1-\theta \right)+\left( r-1 \right)\theta\right\}}{\prod_{r=1}^{N_i}\left\{ \left( 1-\theta\right)+\left( r-1 \right) \theta \right\}}$$ where \(\textbf{x}_{i}= \left[ x_{i1}, \ldots, x_{iK} \right]\) is the random vector of counts for the K taxa (features), also called the RAD vector, \(N_{i}= \sum_{j=1}^K x_{ij}\) is the total number of reads (sequence depth), \( \left\{ \pi_j \right\}\) are the mean taxa proportions (RAD-probability mean), and \(\theta\) is the overdispersion parameter.
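
To make the density above more concrete, the following base-R sketch draws Dirichlet-Multinomial counts by compounding: a probability vector is first drawn from a Dirichlet distribution (by normalising independent Gamma draws), and counts are then drawn from a multinomial with that probability vector. The function names `rdm_sketch` and `shape_to_pi_theta` are hypothetical and are not part of HMP; this is an illustration of the distribution, not necessarily how `Dirichlet.multinomial` is implemented internally. The conversion assumes the standard correspondence between the Dirichlet parameters \(\gamma_j\) and the parameterisation above, \(\pi_j = \gamma_j / \sum_k \gamma_k\) and \(\theta = 1 / (1 + \sum_k \gamma_k)\).

```r
# Hypothetical helper (not part of HMP): draw Dirichlet-Multinomial counts
# by compounding a Dirichlet draw with a multinomial draw.
rdm_sketch <- function(nrs, shape) {
  stopifnot(all(nrs >= 0), all(shape > 0))
  K <- length(shape)
  counts <- matrix(0, nrow = length(nrs), ncol = K)  # rows = samples, cols = taxa
  for (i in seq_along(nrs)) {
    g <- rgamma(K, shape = shape)   # independent Gamma(shape_j, 1) draws
    p <- g / sum(g)                 # normalised draws follow Dirichlet(shape)
    counts[i, ] <- rmultinom(1, size = nrs[i], prob = p)  # counts given p
  }
  counts
}

# Hypothetical helper: convert Dirichlet parameters to (pi, theta),
# assuming pi_j = gamma_j / sum(gamma) and theta = 1 / (1 + sum(gamma)).
shape_to_pi_theta <- function(shape) {
  list(pi = shape / sum(shape), theta = 1 / (1 + sum(shape)))
}

set.seed(1)
shape <- c(2, 5, 3)      # illustrative Dirichlet parameters for 3 taxa
nrs <- rep(1000, 4)      # 4 samples with 1000 reads each
x <- rdm_sketch(nrs, shape)
dim(x)                   # 4 samples by 3 taxa
rowSums(x)               # each row sums to its entry in nrs
shape_to_pi_theta(shape)
```

Larger values of `sum(shape)` correspond to a smaller \(\theta\), so the samples behave more like plain multinomial draws with probabilities \(\pi_j\).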

Note: Though the test statistic supports an unequal number of reads across samples, the performance has not yet been fully tested.

References

Mosimann, J. E. (1962). On the compound multinomial distribution, the multivariate \(\beta\)-distribution, and correlations among proportions. Biometrika 49, 65-82.

Tvedebrink, T. (2010). Overdispersion in allelic counts and theta-correction in forensic genetics. Theor Popul Biol 78, 200-210.

Examples

# NOT RUN {
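	library(HMP)      # provides Dirichlet.multinomial() and the saliva data
	library(dirmult)  # provides dirmult(); attached explicitly in case HMP does not attach it
	data(saliva)
	
	### Generate the number of reads per sample
	### In rep(), the first number is the number of reads and the second is the number of subjects
	nrs <- rep(15000, 20)
	
	### Fit the Dirichlet-multinomial to the saliva data and extract gamma (the shape vector)
	shape <- dirmult(saliva)$gamma
	
	### Generate the samples and show the first 5 samples (rows) and 5 taxa (columns)
	dmData <- Dirichlet.multinomial(nrs, shape)
	dmData[1:5, 1:5]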
# }
