The density function of a \(G\)-component finite mixture model can be written as
$$ g({\bold{y}}|\bold{\Psi})=\sum_{g=1}^{G} \omega_{g} f_{\bold{Y}}({\bold{y}}, \bold{\Theta}_g), $$
where \(\bold{\Psi} = \bigl(\bold{\Theta}_{1},\cdots, \bold{\Theta}_{G}\bigr)^{\top}\) with \(\bold{\Theta}_g=\bigl({\omega}_g, {\bold{\mu}}_g, {{\Sigma}}_g, {\bold{\lambda}}_g\bigr)^{\top}\). Here, \(f_{\bold{Y}}(\bold{y}, \bold{\Theta}_g)\) is the density function of the random vector \(\bold{Y}\) within the \(g\)-th component.

In the restricted case, a random vector \(\bold{Y}\) with density \(f_{\bold{Y}}(\bold{y}, \bold{\Theta}_g)\) admits the stochastic representation
$$ {\bold{Y}} \mathop=\limits^d {\bold{\mu}}_{g}+\sqrt{W}{\bold{\lambda}}_{g}\vert{Z}_0\vert + \sqrt{W}{\Sigma}_{g}^{\frac{1}{2}} {\bold{Z}}_1, $$
where \( {\bold{\mu}}_{g} \in {R}^{d} \) is the location vector, \( {\bold{\lambda}}_{g} \in {R}^{d} \) is the skewness vector, and \(\Sigma_{g}\) is a symmetric positive definite dispersion matrix, for \(g=1,\cdots,G\). Further, \(W\) is a positive random variable with mixing density function \(f_W(w| \bold{\theta}_{g})\), \( {Z}_0\sim N(0, 1) \), and \( {\bold{Z}}_1\sim N_{d}\bigl( {\bold{0}}, \Sigma_{g}\bigr) \). The variables \(W\), \(Z_0\), and \( {\bold{Z}}_1\) are mutually independent.

In the canonical or unrestricted case, the representation becomes
$$ {\bold{Y}} \mathop=\limits^d {\bold{\mu}}_{g}+\sqrt{W}{\bold{\Lambda}}_{g} \vert\bold{Z}_0\vert + \sqrt{W}{\Sigma}_{g}^{\frac{1}{2}} {\bold{Z}}_1, $$
where \(\bold{\Lambda}_{g}\) is the skewness matrix and \(\bold{Z}_0\) is a zero-mean normal random vector, truncated to the positive orthant, whose independent marginals have unit variance. In the unrestricted case, \(\bold{\Lambda}_{g}\) is a \(d \times d\) diagonal matrix and \(\bold{Z}_0\) is truncated to the positive orthant of \(R^{d}\). In the canonical case, \(\bold{\Lambda}_{g}\) is a \(d\times q\) matrix, so \(\bold{Z}_0\) is truncated to the positive orthant of \(R^{q}\).
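The restricted representation can be made concrete by simulating from a single component. The following is a minimal base-R sketch, not code from the package: the helper name `r_restricted` and its `rW` argument are hypothetical, \(W\) is fixed at 1 (taken here as the reading of `family = "constant"`), and the dispersion is assumed to enter once, through a Cholesky square root of \(\Sigma_g\) applied to a standard normal vector.

```r
# Sketch: draw n points from ONE component of the restricted representation
#   Y = mu + sqrt(W) * lambda * |Z0| + sqrt(W) * Sigma^(1/2) * Z1.
# Assumptions (not taken from the package): W defaults to the constant 1, and
# Z1 is standard normal so that Sigma^(1/2) Z1 has covariance Sigma.
r_restricted <- function(n, mu, sigma, lambda, rW = function(n) rep(1, n)) {
  d  <- length(mu)
  w  <- rW(n)                                  # mixing variable W > 0
  z0 <- abs(rnorm(n))                          # |Z0|, half-normal
  z1 <- matrix(rnorm(n * d), nrow = n, ncol = d)
  root  <- chol(sigma)                         # t(root) %*% root equals sigma
  noise <- z1 %*% root                         # each row has covariance sigma
  loc   <- matrix(mu, nrow = n, ncol = d, byrow = TRUE)
  skew  <- (sqrt(w) * z0) %o% lambda           # n x d skewing term
  loc + skew + sqrt(w) * noise                 # sqrt(w) scales row i by sqrt(w[i])
}

set.seed(1)
sigma1 <- matrix(c(0.4, -0.2, -0.2, 0.5), nrow = 2, ncol = 2)
x_skew <- r_restricted(1000, mu = c(-5, -5), sigma = sigma1, lambda = c(5, -5))
x_sym  <- r_restricted(1000, mu = c(-5, -5), sigma = sigma1, lambda = c(0, 0))
```

With a zero skewness vector the draws are centred at the location vector; a non-zero \(\bold{\lambda}_g\) shifts them along its direction because \(|Z_0|\) is non-negative.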
dmix(Y, G, weight, model = "restricted", mu, sigma, lambda, family = "constant",
skewness = "FALSE", param = NULL, theta = NULL, tick = NULL, N = 3000, log = "FALSE")
Returns Monte Carlo approximated values of the mixture model density function.
Y: an \(n\times d\) matrix of observations.
G: number of components.
weight: a vector of weight parameters (or mixing proportions).
model: it must be "canonical", "restricted", or "unrestricted". By default model = "restricted".
mu: a list of location vectors of the G components.
sigma: a list of dispersion matrices of the G components.
lambda: a list of skewness vectors of the G components. If model is either "canonical" or "unrestricted", then each skewness vector must be given in matrix form of appropriate size.
family: name of the mixing distribution. By default family = "constant", which corresponds to the finite mixture of multivariate normal (or skew normal) distributions. Other candidates for the family name are: "bs" (for Birnbaum-Saunders), "burriii" (for Burr type iii), "chisq" (for chi-square), "exp" (for exponential), "f" (for Fisher), "gamma" (for gamma), "gig" (for generalized inverse-Gaussian), "igamma" (for inverse-gamma), "igaussian" (for inverse-Gaussian), "lindley" (for Lindley), "loglog" (for log-logistic), "lognorm" (for log-normal), "lomax" (for Lomax), "pstable" (for positive \(\alpha\)-stable), "ptstable" (for polynomially tilted \(\alpha\)-stable), "rayleigh" (for Rayleigh), and "weibull" (for Weibull).
skewness: a logical statement. By default skewness = "FALSE", which means that a symmetric model is fitted to each component (cluster). If skewness = "TRUE", then a skewed model is fitted to each component.
param: names of the elements of \(\bold{\theta}\), the parameter vector of the mixing distribution with density function \(f_W(w| \bold{\theta})\). By default it is NULL.
theta: a list of maximum likelihood estimators for \(\bold{\theta}\) (the parameter vector of the mixing distribution with density function \(f_W(w| \bold{\theta})\)), one for each of the G components. By default it is NULL.
tick: a binary vector whose length depends on the type of family. The elements of tick are either 0 or 1. If an element of tick is 0, then the corresponding element of \(\bold{\theta}\) is not considered in the formula of \(f_W(w|{\bold{\theta}})\) for computing the required posterior expectations. If an element of tick is 1, then the corresponding element of \(\bold{\theta}\) is considered in the formula of \(f_W(w|{\bold{\theta}})\). For instance, if family = "gamma" and either its shape or rate parameter is one, then tick = c(1). If instead family = "gamma" and both the shape and rate parameters are in the formula of \(f_W(w|{\bold{\theta}})\), then tick = c(1, 1). By default tick = NULL.
N: an integer giving the number of Monte Carlo samples used to approximate \( g({\bold{y}}|\bold{\Psi}) \). By default \(N = 3000\).
log: if log = "TRUE", then the log of the density function is returned. By default log = "FALSE".
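To illustrate what a Monte Carlo approximation with N draws can look like for the restricted model, here is a minimal base-R sketch. It is not the package's implementation: the helpers `dmvn`, `mc_component_density`, `mc_mixture_density`, and the `rW` argument are hypothetical names, \(W\) is fixed at 1, and \(\bold{Z}_1\) is assumed standard normal, so that conditionally on \(W = w\) and \(|Z_0| = h\) the component is \(N_d(\bold{\mu}_g + \sqrt{w}\,h\,\bold{\lambda}_g,\; w\Sigma_g)\).

```r
# Density of N_d(mean, cov) at a single point y, using base R only.
dmvn <- function(y, mean, cov) {
  d    <- length(y)
  root <- chol(cov)                                  # t(root) %*% root equals cov
  z    <- backsolve(root, y - mean, transpose = TRUE) # solves t(root) z = y - mean
  exp(-0.5 * sum(z^2) - sum(log(diag(root))) - 0.5 * d * log(2 * pi))
}

# Average the conditional normal density over N draws of (W, |Z0|).
# Assumption: rW draws the mixing variable; the default keeps W equal to 1.
mc_component_density <- function(y, mu, sigma, lambda, N = 3000,
                                 rW = function(n) rep(1, n)) {
  w <- rW(N)
  h <- abs(rnorm(N))                                 # half-normal draws of |Z0|
  vals <- vapply(seq_len(N), function(i) {
    dmvn(y, mu + sqrt(w[i]) * h[i] * lambda, w[i] * sigma)
  }, numeric(1))
  mean(vals)
}

# Weighted sum of the component approximations, as in g(y | Psi).
mc_mixture_density <- function(y, weight, mu, sigma, lambda, N = 3000,
                               rW = function(n) rep(1, n)) {
  comp <- mapply(function(m, s, l) mc_component_density(y, m, s, l, N, rW),
                 mu, sigma, lambda)
  sum(weight * comp)
}

set.seed(1)
mu     <- list(c(-5, -5), c(5, 5))
sigma  <- list(matrix(c(0.4, -0.2, -0.2, 0.5), 2, 2),
               matrix(c(0.5,  0.2,  0.2, 0.4), 2, 2))
lambda <- list(c(5, -5), c(-5, 5))
dens   <- mc_mixture_density(c(1, 2), weight = c(0.5, 0.5), mu, sigma, lambda)
```

Increasing N reduces the Monte Carlo error of the average at the cost of more density evaluations. When every skewness vector is zero and \(W\) is constant, all draws give the same value and the sketch reduces to an ordinary normal mixture density.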
Mahdi Teimouri
# Evaluate the density of a two-component restricted skew-normal mixture
# at the point y = (1, 2).
Y <- c(1, 2)
G <- 2
weight <- rep(0.5, 2)
# Location vectors of the two components
mu1 <- rep(-5, 2)
mu2 <- rep(5, 2)
# Dispersion matrices of the two components
sigma1 <- matrix(c(0.4, -0.20, -0.20, 0.5), nrow = 2, ncol = 2)
sigma2 <- matrix(c(0.5, 0.20, 0.20, 0.4), nrow = 2, ncol = 2)
# Skewness vectors of the two components
lambda1 <- c(5, -5)
lambda2 <- c(-5, 5)
mu <- list(mu1, mu2)
sigma <- list(sigma1, sigma2)
lambda <- list(lambda1, lambda2)
out <- dmix(Y, G, weight, model = "restricted", mu, sigma, lambda,
            family = "constant", skewness = "TRUE", param = NULL,
            theta = NULL, tick = NULL, N = 3000)
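The argument descriptions imply several shape constraints: one list entry per component, skewness vectors of length \(d\) in the restricted model, a \(d \times d\) diagonal skewness matrix in the unrestricted model, and a \(d \times q\) matrix in the canonical model. The sketch below checks these before a call. The function `check_dmix_inputs` is hypothetical and not part of the package; requiring the weights to sum to one is an assumption based on their description as mixing proportions.

```r
# Sketch of a pre-call sanity check for the list-valued arguments.
# check_dmix_inputs is a hypothetical helper, not a package function.
check_dmix_inputs <- function(G, weight, mu, sigma, lambda, model = "restricted") {
  d <- length(mu[[1]])
  stopifnot(
    model %in% c("canonical", "restricted", "unrestricted"),
    length(weight) == G, all(weight >= 0), abs(sum(weight) - 1) < 1e-8,
    length(mu) == G, length(sigma) == G, length(lambda) == G
  )
  for (g in seq_len(G)) {
    s <- sigma[[g]]
    l <- lambda[[g]]
    # Dispersion matrix: d x d, symmetric, positive definite
    stopifnot(length(mu[[g]]) == d, is.matrix(s), all(dim(s) == c(d, d)),
              isSymmetric(s),
              all(eigen(s, symmetric = TRUE, only.values = TRUE)$values > 0))
    if (model == "restricted") {
      stopifnot(length(l) == d)                     # skewness vector in R^d
    } else {
      stopifnot(is.matrix(l), nrow(l) == d)         # d x q skewness matrix
      if (model == "unrestricted") {
        stopifnot(ncol(l) == d, all(l[row(l) != col(l)] == 0))  # d x d diagonal
      }
    }
  }
  invisible(TRUE)
}

mu     <- list(rep(-5, 2), rep(5, 2))
sigma  <- list(matrix(c(0.4, -0.2, -0.2, 0.5), 2, 2),
               matrix(c(0.5,  0.2,  0.2, 0.4), 2, 2))
lambda <- list(c(5, -5), c(-5, 5))

ok_restricted   <- check_dmix_inputs(2, c(0.5, 0.5), mu, sigma, lambda)
# For the unrestricted model the same skewness values go on a diagonal matrix.
ok_unrestricted <- check_dmix_inputs(2, c(0.5, 0.5), mu, sigma,
                                     lapply(lambda, diag), model = "unrestricted")
# For the canonical model a d x q matrix is expected (here q = 1).
ok_canonical    <- check_dmix_inputs(2, c(0.5, 0.5), mu, sigma,
                                     lapply(lambda, function(l) matrix(l, ncol = 1)),
                                     model = "canonical")
```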