MclustDR: Dimension reduction for model-based clustering and classification

Description

A dimension reduction method for visualizing the clustering or classification structure obtained from a finite mixture of Gaussian densities.

Usage

MclustDR(object, normalized = TRUE, Sigma, tol = sqrt(.Machine$double.eps))

Arguments

object

An object of class Mclust or MclustDA resulting from a call to, respectively, Mclust or MclustDA.

normalized

Logical. If TRUE, directions are normalized to unit norm.

Sigma

Marginal covariance matrix of the data. If not provided, it is estimated by the MLE from the observed data.

tol

A tolerance value.

Value

An object of class "MclustDR" with the following components:

call

The matched call.

type

A character string specifying the type of model for which the dimension reduction is computed. Currently, possible values are "Mclust" for clustering, and "MclustDA" or "EDDA" for classification.

x

The data matrix.

Sigma

The covariance matrix of the data.

mixcomp

A numeric vector specifying the mixture component of each data observation.

class

A factor specifying the classification of each data observation. For model-based clustering this is equivalent to the corresponding mixture component. For model-based classification this is the known classification.

G

The number of mixture components.

modelName

The name of the parameterization of the estimated mixture model(s). See mclustModelNames.

mu

A matrix of means for each mixture component.

sigma

An array of covariance matrices for each mixture component.

pro

The estimated prior for each mixture component.

M

The kernel matrix.

raw.evectors

The raw eigenvectors from the generalized eigen-decomposition of the kernel matrix, ordered according to the eigenvalues.

evalues

The eigenvalues from the generalized eigen-decomposition of the kernel matrix.

basis

The basis of the estimated dimension reduction subspace.

std.basis

The basis of the estimated dimension reduction subspace standardized to variables having unit standard deviation.

numdir

The dimension of the projection subspace.

dir

The estimated directions, i.e., the data projected onto the estimated dimension reduction subspace.

Details

The method aims to reduce dimensionality by identifying a set of linear combinations of the original features that capture most of the clustering or classification structure contained in the data. These linear combinations are ordered by importance, as quantified by the associated eigenvalues.

Information on the dimension reduction subspace is obtained from the variation in group means and, depending on the estimated mixture model, from the variation in group covariances (see Scrucca, 2010).

Observations may then be projected onto such a reduced subspace, thus providing summary plots which help to visualize the underlying structure.
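
The projection step described above can be illustrated with a short sketch that uses only the components listed under Value. It assumes that dir is obtained by multiplying the column-centered data by basis; that centering step is an assumption made here, not something this page states, so the code prints the discrepancy rather than relying on it.

```r
library(mclust)

# Fit a Gaussian mixture to the four iris measurements, then reduce.
x   <- as.matrix(iris[, 1:4])
mod <- Mclust(x)
dr  <- MclustDR(mod)

# Eigenvalues quantify the importance of each direction.
print(dr$evalues)

# With normalized = TRUE (the default), each basis column has unit norm.
basis_norms <- sqrt(colSums(dr$basis^2))
print(basis_norms)

# Assumption: the projection uses column-centered data.
proj <- scale(x, center = TRUE, scale = FALSE) %*% dr$basis

# Compare with the stored projection; a value near zero supports the assumption.
print(max(abs(proj - dr$dir)))
```

If the printed difference is not close to zero, the stored directions are computed differently from this sketch, and dr$dir should be used directly.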

References

Scrucca, L. (2010) Dimension reduction for model-based clustering. Statistics and Computing, 20(4), pp. 471-484.

Examples

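library(mclust)

# Clustering: dimension reduction for a fitted Mclust model
mod = Mclust(iris[,1:4])
dr = MclustDR(mod)
summary(dr)

# Classification: banknote data shipped with mclust
data(banknote)

# EDDA: a single Gaussian component per class
da = MclustDA(banknote[,2:7], banknote$Status, modelType = "EDDA")
dr = MclustDR(da)
summary(dr)

# MclustDA (default model type): a Gaussian mixture within each class
da = MclustDA(banknote[,2:7], banknote$Status)
dr = MclustDR(da)
summary(dr)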