# dirichlet

##### Fitting a Dirichlet Distribution

Fits a Dirichlet distribution to a matrix of compositions.

- Keywords
- models, regression

##### Usage

`dirichlet(link = "loge", parallel = FALSE, zero = NULL, imethod = 1)`

##### Arguments

- link
Link function applied to each of the \(M\) (positive) shape parameters \(\alpha_j\). See `Links` for more choices. The default gives \(\eta_j=\log(\alpha_j)\).
- parallel, zero, imethod
See `CommonVGAMffArguments` for more information.

##### Details

In this help file the response is assumed to be an \(M\)-column matrix of positive values whose rows each sum to unity. Such data can be thought of as compositional data. There are \(M\) linear/additive predictors \(\eta_j\).

The Dirichlet distribution is commonly used to model compositional data, including applications in genetics. Suppose \((Y_1,\ldots,Y_{M})^T\) is the response. Then it has a Dirichlet distribution if \((Y_1,\ldots,Y_{M-1})^T\) has density $$\frac{\Gamma(\alpha_{+})} {\prod_{j=1}^{M} \Gamma(\alpha_{j})} \prod_{j=1}^{M} y_j^{\alpha_{j} -1}$$ where \(\alpha_+=\alpha_1+\cdots+\alpha_M\), \(\alpha_j > 0\), and the density is defined on the unit simplex $$\Delta_{M} = \left\{ (y_1,\ldots,y_{M})^T : y_1 > 0, \ldots, y_{M} > 0, \sum_{j=1}^{M} y_j = 1 \right\}. $$ One has \(E(Y_j) = \alpha_j / \alpha_{+}\), which are returned as the fitted values. For this distribution Fisher scoring corresponds to Newton-Raphson.
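The density and mean formulas above can be evaluated directly in base R. The following is a minimal sketch: the helper names `dirichlet_logdens` and `dirichlet_mean` are hypothetical and are not part of VGAM, and the point `y` is an arbitrary illustration.

```r
# Log-density of the Dirichlet distribution at a single composition y.
# Hypothetical helper written for illustration; it is not a VGAM function.
dirichlet_logdens <- function(y, alpha) {
  stopifnot(length(y) == length(alpha), all(y > 0), all(alpha > 0),
            isTRUE(all.equal(sum(y), 1)))
  # log Gamma(alpha_+) - sum_j log Gamma(alpha_j) + sum_j (alpha_j - 1) log y_j
  lgamma(sum(alpha)) - sum(lgamma(alpha)) + sum((alpha - 1) * log(y))
}

# Mean vector E(Y_j) = alpha_j / alpha_+
dirichlet_mean <- function(alpha) alpha / sum(alpha)

alpha <- exp(c(-1, 1, 0))   # illustrative shape parameters
y <- c(0.2, 0.5, 0.3)       # an arbitrary point on the unit simplex
dens <- exp(dirichlet_logdens(y, alpha))
mu <- dirichlet_mean(alpha)

# Sanity check: for M = 2 the Dirichlet density reduces to the beta density
beta_check <- all.equal(exp(dirichlet_logdens(c(0.3, 0.7), c(2, 5))),
                        dbeta(0.3, 2, 5))
```

Working on the log scale with `lgamma` avoids overflow in the gamma functions when the shape parameters are large.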

The Dirichlet distribution can be motivated by considering independent random variables \(G_1,\ldots,G_{M}\), where \(G_j\) has a gamma distribution with shape parameter \(\alpha_j\) and unit scale, that is, density \(f(g_j)=g_j^{\alpha_j - 1} e^{-g_j} / \Gamma(\alpha_j)\). (The \(G_j\) are identically distributed only when all the \(\alpha_j\) are equal.) Then the Dirichlet distribution arises when \(Y_j=G_j / (G_1 + \cdots + G_M)\).
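The gamma construction can be checked by simulation using only base R. This is a sketch under assumed values: the shape parameters, sample size and seed below are arbitrary choices, not taken from the package.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
alpha <- c(0.5, 2, 1)            # illustrative shape parameters
n <- 5000                        # number of simulated compositions
M <- length(alpha)

# Column j holds n independent Gamma(shape = alpha_j, rate = 1) draws
g <- matrix(rgamma(n * M, shape = rep(alpha, each = n)), nrow = n, ncol = M)

# Dividing each row by its total gives a draw from the Dirichlet distribution
y <- g / rowSums(g)

range(rowSums(y))                # every row sums to 1, up to rounding error
rbind(sample = colMeans(y),      # sample means should be close to ...
      theory = alpha / sum(alpha))  # ... the theoretical means alpha_j / alpha_+
```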

##### Value

An object of class `"vglmff"` (see `vglmff-class`). The object is used by modelling functions such as `vglm`, `rrvglm` and `vgam`.

When fitted, the `fitted.values` slot of the object contains the \(M\)-column matrix of means.

##### Note

The response should be a matrix of positive values whose rows each sum to unity. Count data are similar, but for them a multinomial logit model (`multinomial`) is probably more appropriate. Another distribution similar to the Dirichlet is the Dirichlet-multinomial (see `dirmultinomial`).
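Before fitting, it can be worth confirming that the response meets these requirements. The sketch below uses a hypothetical helper, `is_composition` (not part of VGAM), with an assumed tolerance; the `raw` matrix stands for made-up positive measurements, such as component masses, that are rescaled to proportions.

```r
# Hypothetical helper (not part of VGAM): TRUE if every entry of `y` is
# finite and positive and every row sums to 1 within tolerance `tol`.
is_composition <- function(y, tol = 1e-8) {
  y <- as.matrix(y)
  all(is.finite(y)) && all(y > 0) && all(abs(rowSums(y) - 1) < tol)
}

# Made-up positive measurements; the rows do not yet sum to 1
raw <- matrix(c(2, 3, 5,
                1, 1, 2), nrow = 2, byrow = TRUE)
is_composition(raw)          # FALSE

# Rescale each row by its total to obtain proportions
comp <- raw / rowSums(raw)
is_composition(comp)         # TRUE
```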

##### References

Lange, K. (2002) *Mathematical and Statistical Methods for Genetic Analysis*, 2nd ed. New York: Springer-Verlag.

Forbes, C., Evans, M., Hastings, N. and Peacock, B. (2011) *Statistical Distributions*, 4th ed. Hoboken, NJ: John Wiley and Sons.

##### Examples

```
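library(VGAM)  # provides rdiric(), vglm() and dirichlet()

set.seed(123)  # arbitrary seed so that the simulation is reproducible
# Simulate 1000 compositions with shape parameters exp(-1), exp(1), exp(0)
ddata <- data.frame(rdiric(n = 1000,
                           shape = exp(c(y1 = -1, y2 = 1, y3 = 0))))
# Intercept-only fit; convergence is judged on the coefficients
fit <- vglm(cbind(y1, y2, y3) ~ 1, dirichlet,
            data = ddata, trace = TRUE, crit = "coef")
Coef(fit)                 # estimates of the shape parameters
coef(fit, matrix = TRUE)  # coefficients on the link (eta) scale
head(fitted(fit))         # fitted means alpha_j / alpha_+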
```

*Documentation reproduced from package VGAM, version 1.0-4, License: GPL-3*