VGAM (version 1.1-6)

dirmultinomial: Fitting a Dirichlet-Multinomial Distribution


Fits a Dirichlet-multinomial distribution to a matrix response.


dirmultinomial(lphi = "logitlink", iphi = 0.10, parallel = FALSE, zero = "M")



Link function applied to the \(\phi\) parameter, which lies in the open unit interval \((0,1)\). See Links for more choices.


Numeric. Initial value for \(\phi\). Must be in the open unit interval \((0,1)\). If a failure to converge occurs then try assigning this argument a different value.


A logical (formula not allowed here) indicating whether the probabilities \(\pi_1,\ldots,\pi_{M-1}\) are to be equal via equal coefficients. Note \(\pi_M\) will generally be different from the other probabilities. Setting parallel = TRUE will only work if you also set zero = NULL because of interference between these arguments (with respect to the intercept term).


An integer-valued vector specifying which linear/additive predictors are modelled as intercepts only. The values must be from the set \(\{1,2,\ldots,M\}\). If the character "M" then this means the numerical value \(M\), which corresponds to linear/additive predictor associated with \(\phi\). Setting zero = NULL means none of the values from the set \(\{1,2,\ldots,M\}\).


An object of class "vglmff" (see vglmff-class). The object is used by modelling functions such as vglm, rrvglm and vgam.

If the model is an intercept-only model then @misc (which is a list) has a component called shape which is a vector with the \(M\) values \(\pi_j(1/\phi-1)\).


This VGAM family function is prone to numerical problems, especially when there are covariates.


The Dirichlet-multinomial distribution arises from a multinomial distribution where the probability parameters are not constant but are generated from a multivariate distribution called the Dirichlet distribution. The Dirichlet-multinomial distribution has probability function $$P(Y_1=y_1,\ldots,Y_M=y_M) = {N_{*} \choose {y_1,\ldots,y_M}} \frac{ \prod_{j=1}^{M} \prod_{r=1}^{y_{j}} (\pi_j (1-\phi) + (r-1)\phi)}{ \prod_{r=1}^{N_{*}} (1-\phi + (r-1)\phi)}$$ where \(\phi\) is the over-dispersion parameter and \(N_{*} = y_1+\cdots+y_M\). Here, \(a \choose b\) means ``\(a\) choose \(b\)'' and refers to combinations (see choose). The above formula applies to each row of the matrix response. In this VGAM family function the first \(M-1\) linear/additive predictors correspond to the first \(M-1\) probabilities via $$\eta_j = \log(P[Y=j]/ P[Y=M]) = \log(\pi_j/\pi_M)$$ where \(\eta_j\) is the \(j\)th linear/additive predictor (\(\eta_M=0\) by definition for \(P[Y=M]\) but not for \(\phi\)) and \(j=1,\ldots,M-1\). The \(M\)th linear/additive predictor corresponds to lphi applied to \(\phi\).

Note that \(E(Y_j) = N_* \pi_j\) but the probabilities (returned as the fitted values) \(\pi_j\) are bundled together as a \(M\)-column matrix. The quantities \(N_*\) are returned as the prior weights.

The beta-binomial distribution is a special case of the Dirichlet-multinomial distribution when \(M=2\); see betabinomial. It is easy to show that the first shape parameter of the beta distribution is \(shape1=\pi(1/\phi-1)\) and the second shape parameter is \(shape2=(1-\pi)(1/\phi-1)\). Also, \(\phi=1/(1+shape1+shape2)\), which is known as the intra-cluster correlation coefficient.


Paul, S. R., Balasooriya, U. and Banerjee, T. (2005). Fisher information matrix of the Dirichlet-multinomial distribution. Biometrical Journal, 47, 230--236.

Tvedebrink, T. (2010). Overdispersion in allelic counts and \(\theta\)-correction in forensic genetics. Theoretical Population Biology, 78, 200--210.

Yu, P. and Shaw, C. A. (2014). An Efficient Algorithm for Accurate Computation of the Dirichlet-Multinomial Log-Likelihood Function. Bioinformatics, 30, 1547--54.

See Also

dirmul.old, betabinomial, betabinomialff, dirichlet, multinomial.


Run this code
nn <- 5; M <- 4; set.seed(1)
ydata <- data.frame(round(matrix(runif(nn * M, max = 100), nn, M)))  # Integer counts
colnames(ydata) <- paste("y", 1:M, sep = "")

fit <- vglm(cbind(y1, y2, y3, y4) ~ 1, dirmultinomial,
            data = ydata, trace = TRUE)
depvar(fit)  # Sample proportions
weights(fit, type = "prior", matrix = FALSE)  # Total counts per row

# }
ydata <- transform(ydata, x2 = runif(nn))
fit <- vglm(cbind(y1, y2, y3, y4) ~ x2, dirmultinomial,
            data = ydata, trace = TRUE)
coef(fit, matrix = TRUE)
(sfit <- summary(fit))
# }

Run the code above in your browser using DataCamp Workspace