Package: Rmixmod
Type: Package
Version: 2.0.3
Date: 2014-11-07
License: GPL-3 + file LICENSE
LazyLoad: yes

The general purpose of the package is to discover, or
explain, group structures in multivariate data sets with
unknown class (cluster analysis or clustering) or known
class (discriminant analysis or classification). It is an
exploratory data analysis tool for solving clustering and
classification problems. But it can also be regarded as a
semi-parametric tool to estimate densities with Gaussian
mixture distributions and multinomial distributions.
Mathematically, a mixture probability density function
(pdf) $f$ is a weighted sum of $K$ component
densities:
$$f({\bf x}_i|\theta) = \sum_{k=1}^{K} p_k h({\bf x}_i|\lambda_k)$$
where $h(\cdot|\lambda_k)$ denotes a
$d$-dimensional distribution parametrized by
$\lambda_k$. The parameters are the mixing
proportions $p_k$ and the component distribution
parameters $\lambda_k$.
In the Gaussian case, $h$ is the density of a
Gaussian distribution with mean $\mu_k$ and variance
matrix $\Sigma_k$, and thus $\lambda_k =
(\mu_k,\Sigma_k)$.
In the qualitative case, $h$ is a multinomial
distribution and $\lambda_k=(a_k,\epsilon_k)$ is the
parameter of the distribution.
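As a concrete reading of this parametrization (following the
latent class formulation of Celeux and Govaert, in which
$a_k$ is the vector of modal categories and $\epsilon_k$ the
scatter around it), the probability of one variable with $m$
categories can be sketched as below; the function name and
values are illustrative, not part of the package's API:

  ## Probability of category x when a is the modal category,
  ## eps the probability of deviating from it, spread evenly
  ## over the m - 1 other categories (illustrative values).
  component_prob <- function(x, a, eps, m) {
    ifelse(x == a, 1 - eps, eps / (m - 1))
  }
  component_prob(x = 2, a = 2, eps = 0.1, m = 4)  # modal category: 0.9
  component_prob(x = 3, a = 2, eps = 0.1, m = 4)  # any other: 0.1/3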
Estimation of the mixture parameters is performed either
through maximum likelihood via the EM (Expectation
Maximization, Dempster et al. 1977) or SEM
(Stochastic EM, Celeux and Diebolt 1985) algorithms,
or through classification maximum likelihood via the CEM
algorithm (Clustering EM, Celeux and Govaert
1992). These three algorithms can be chained to build
custom fitting strategies (e.g. CEM followed by EM started
from the CEM solution) that combine the advantages of each
in the estimation process. As mixture likelihoods usually
have multiple local maxima, the program can produce
different results depending on the initial estimates
supplied by the user. If the user does not supply
initial estimates, several initialization procedures are
available (random centers, for instance).
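Such a chained strategy can be specified with the package's
mixmodStrategy() function and passed to mixmodCluster(); a
minimal sketch, where the data set, number of clusters, and
settings are arbitrary choices for illustration:

  library(Rmixmod)
  ## Chain CEM then EM, with several random initializations.
  strat <- mixmodStrategy(algo = c("CEM", "EM"),
                          initMethod = "random",
                          nbTry = 5)
  ## Cluster a numeric data set (here the classic iris
  ## measurements) into 3 groups using that strategy.
  out <- mixmodCluster(iris[, 1:4], nbCluster = 3, strategy = strat)
  summary(out)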
It is possible to constrain some of the model parameters;
for example, dispersions can be forced to be equal across classes.
In the Gaussian case, fourteen models are implemented.
They are based on the eigenvalue decomposition of the
variance matrices and are the ones most commonly used. They
differ in the constraints imposed on the variance matrices
(e.g. the same variance matrix for all clusters, or
spherical variance matrices) and are suitable for data
sets of any dimension.
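These model families can be selected through the package's
mixmodGaussianModel() function; a short sketch, where the
family and model names chosen are just examples:

  library(Rmixmod)
  ## All fourteen Gaussian models
  all_models <- mixmodGaussianModel()
  ## Only the spherical family, or an explicit list of models
  ## (names follow the eigenvalue-decomposition nomenclature)
  sph    <- mixmodGaussianModel(family = "spherical")
  chosen <- mixmodGaussianModel(listModels = c("Gaussian_pk_L_I",
                                               "Gaussian_pk_Lk_Ck"))
  mixmodCluster(iris[, 1:4], nbCluster = 2:4, models = sph)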
In the qualitative case, five multinomial models are
available. They are based on a reparametrization of the
multinomial probabilities.
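Analogously, mixmodMultinomialModel() selects among these
models; a brief sketch assuming the qualitative birds data
set shipped with the package, with an arbitrary model list:

  library(Rmixmod)
  data(birds)  # qualitative data set shipped with the package
  ## All five multinomial parametrizations, or an explicit subset
  all_mn <- mixmodMultinomialModel()
  sub_mn <- mixmodMultinomialModel(listModels = c("Binary_pk_E",
                                                  "Binary_pk_Ekjh"))
  mixmodCluster(birds, nbCluster = 2, models = sub_mn,
                dataType = "qualitative")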
In both cases, the models and the number of clusters can
be chosen by different criteria: BIC (Bayesian
Information Criterion), ICL (Integrated Completed
Likelihood, a classification version of BIC), NEC
(Normalized Entropy Criterion), or Cross-Validation (CV).
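Several criteria can be requested at once through the
criterion argument of mixmodCluster(); a minimal sketch,
with the data set and cluster range chosen arbitrarily:

  library(Rmixmod)
  ## Compare 2 to 5 clusters under three criteria; the best
  ## model according to the first criterion is retained.
  out <- mixmodCluster(iris[, 1:4], nbCluster = 2:5,
                       criterion = c("BIC", "ICL", "NEC"))
  summary(out)
  out["bestResult"]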