rgdirmn: The Generalized Dirichlet Multinomial Distribution

Description

rgdirmn generates random observations from the generalized Dirichlet multinomial distribution. dgdirmn computes the log of the generalized Dirichlet multinomial probability mass function.

Usage

rgdirmn(n, size, alpha, beta)
dgdirmn(Y, alpha, beta)

Arguments

n

the number of random vectors to generate. When size is a scalar and alpha is a vector, n must be specified. When size is a vector and alpha is a matrix, n is optional and defaults to the length of size; if given, n should equal the length of size.

size

a number or vector specifying the total number of objects that are put into d categories in the generalized Dirichlet multinomial distribution.

alpha

the parameter of the generalized Dirichlet multinomial distribution. alpha is a numerical positive vector or matrix.

For dgdirmn, alpha should be compatible with Y: either a vector of length $d-1$ or a matrix with $n$ rows and $d-1$ columns. If alpha is a vector, it will be replicated $n$ times to match the number of rows of Y.

For rgdirmn, if alpha is a vector, size must be a scalar, and all the random vectors will be drawn from the same alpha and size. If alpha is a matrix, the number of rows should match the length of size, and each random vector will be drawn from the corresponding row of alpha and the corresponding element of size.

beta

the parameter of the generalized Dirichlet multinomial distribution. beta should have the same dimension as alpha.

For rgdirmn, if beta is a vector, size must be a scalar, and all the random vectors will be drawn from the same beta and size. If beta is a matrix, the number of rows should match the length of size, and each random vector will be drawn from the corresponding row of beta and the corresponding element of size.

Y

the multivariate count matrix with dimensions $n \times d$, where $n = 1,2, \ldots$ is the number of observations and $d=3,4,\ldots$ is the number of categories.

Value

dgdirmn returns the value of $\log(P(y|\alpha, \beta))$. When Y is a matrix of $n$ rows, the function dgdirmn returns a vector of length $n$.

rgdirmn returns an $n\times d$ matrix of the generated random observations.
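To make the shape of the returned matrix concrete, the following base-R sketch draws generalized Dirichlet multinomial counts as a chain of beta-binomial draws: for each category $j < d$, a probability is drawn from a Beta($\alpha_j$, $\beta_j$) distribution and used to split off a binomial share of the objects not yet assigned. This is one standard way to simulate the distribution and is not necessarily how rgdirmn is implemented; the name rgdm_sketch is hypothetical, and the sketch only covers the case of scalar size with vector alpha and beta.

```r
# Hypothetical helper: simulate n count vectors with a common size, alpha, beta.
rgdm_sketch <- function(n, size, alpha, beta) {
  d <- length(alpha) + 1                 # alpha and beta each have d - 1 entries
  stopifnot(length(beta) == d - 1)
  Y <- matrix(0L, nrow = n, ncol = d)
  remaining <- rep(as.integer(size), n)  # objects not yet assigned to a category
  for (j in seq_len(d - 1)) {
    v <- rbeta(n, alpha[j], beta[j])     # share of the remainder going to category j
    Y[, j] <- rbinom(n, remaining, v)
    remaining <- remaining - Y[, j]
  }
  Y[, d] <- remaining                    # the last category receives whatever is left
  Y
}

set.seed(1)
Y_sim <- rgdm_sketch(10, 20, alpha = c(0.2, 0.5), beta = c(0.7, 0.4))
dim(Y_sim)       # one row per draw, one column per category
rowSums(Y_sim)   # every row sums to size
```

Each row is one observation and each column one category, matching the $n \times d$ layout described above.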

Details

Let $y=(y_1, \ldots, y_d)$ be the vector of counts over the $d$ categories (one row of Y). Given the parameter vectors $\alpha = (\alpha_1, \ldots, \alpha_{d-1}), \alpha_j>0$, and $\beta=(\beta_1, \ldots, \beta_{d-1}), \beta_j>0$, the generalized Dirichlet multinomial probability mass function is

$$ P(y|\alpha,\beta) =C_{y_1, \ldots, y_d}^{m} \prod_{j=1}^{d-1} \frac{\Gamma(\alpha_j+y_j)}{\Gamma(\alpha_j)} \frac{\Gamma(\beta_j+z_{j+1})}{\Gamma(\beta_j)} \frac{\Gamma(\alpha_j+\beta_j)}{\Gamma(\alpha_j+\beta_j+z_j)} , $$

where $z_j = \sum_{k=j}^d y_k$ and $m = \sum_{j=1}^d y_j$. Here, $C_{y_1, \ldots, y_d}^{m} = m!/(y_1! \cdots y_d!)$ is the multinomial coefficient, the number of ways to divide $m$ objects into groups of sizes $y_1, \ldots, y_d$. It generalizes the binomial coefficient $C_k^n$, often read as "$n$ choose $k$", which is the number of $k$-combinations from a set of $n$ elements.
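As a cross-check on the formula, the log probability can be written out term by term in base R. This is a minimal sketch under stated assumptions, not the package's implementation: the name gdm_logpmf is hypothetical, it handles a single count vector rather than a matrix Y, and it does no input validation beyond checking lengths.

```r
# Hypothetical helper: log P(y | alpha, beta) for one count vector y of length d.
gdm_logpmf <- function(y, alpha, beta) {
  d <- length(y)
  stopifnot(length(alpha) == d - 1, length(beta) == d - 1)
  m <- sum(y)
  z <- rev(cumsum(rev(y)))                       # z[j] = y[j] + ... + y[d]
  log_coef <- lgamma(m + 1) - sum(lgamma(y + 1)) # log of m! / (y_1! ... y_d!)
  j <- seq_len(d - 1)
  log_coef + sum(
    lgamma(alpha + y[j])    - lgamma(alpha) +
    lgamma(beta + z[j + 1]) - lgamma(beta) +
    lgamma(alpha + beta)    - lgamma(alpha + beta + z[j])
  )
}

# Sanity check: with d = 3 and m = 4, sum the probabilities of all count vectors.
m <- 4
total <- 0
for (y1 in 0:m) {
  for (y2 in 0:(m - y1)) {
    y <- c(y1, y2, m - y1 - y2)
    total <- total + exp(gdm_logpmf(y, c(0.2, 0.5), c(0.7, 0.4)))
  }
}
all.equal(total, 1)
```

Working on the log scale with lgamma avoids overflow in the gamma functions when counts or parameters are large, which is also why a log probability is the natural return value.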

The $\alpha$ and $\beta$ parameters can be vectors, like the results from the distribution fitting function, or they can be matrices with $n$ rows, like $\exp(X\alpha)$ and $\exp(X\beta)$, which are obtained by multiplying the covariate matrix $X$ by the coefficient estimates from the regression function and exponentiating the result.
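The matrix form can be illustrated with simulated inputs. In this sketch every name (X, A, B, alpha_mat, beta_mat) is hypothetical, and the coefficient matrices are random placeholders standing in for regression estimates; the point is only that exponentiating the linear predictor gives positive matrices with one row per observation and $d-1$ columns.

```r
set.seed(1)
n <- 5   # number of observations
p <- 2   # number of covariates, including the intercept
d <- 4   # number of categories

X <- cbind(1, rnorm(n))                     # n x p covariate matrix
A <- matrix(rnorm(p * (d - 1)), p, d - 1)   # placeholder coefficients for alpha
B <- matrix(rnorm(p * (d - 1)), p, d - 1)   # placeholder coefficients for beta

alpha_mat <- exp(X %*% A)                   # n x (d - 1), strictly positive
beta_mat  <- exp(X %*% B)                   # n x (d - 1), strictly positive

dim(alpha_mat)
```

Matrices built this way have the shape that the alpha and beta arguments expect when each observation has its own parameters.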

Examples

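# Load the package that provides rgdirmn and dgdirmn
library(MGLM)

# Example 1: one (alpha, beta) pair shared by every draw (d = 3 categories)
m <- 20
alpha <- c(0.2, 0.5)
beta <- c(0.7, 0.4)
Y <- rgdirmn(10, m, alpha, beta)   # 10 x 3 count matrix; each row sums to m
dgdirmn(Y, alpha, beta)            # log probabilities, one per row of Y

# Example 2: one row of alpha and beta per draw (d = 5 categories)
set.seed(100)
alpha <- matrix(abs(rnorm(40)), 10, 4)
beta <- matrix(abs(rnorm(40)), 10, 4)
size <- rbinom(10, 10, 0.5)
# size is a vector and alpha is a matrix, so n defaults to length(size)
GDM.rdm <- rgdirmn(size=size, alpha=alpha, beta=beta)
# size is a scalar and alpha is a vector, so n must be given
GDM.rdm1 <- rgdirmn(n=20, size=10, alpha=abs(rnorm(4)), beta=abs(rnorm(4)))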
