HDclassif (version 2.2.0)

hdmda: Mixture Discriminant Analysis with HD Gaussians

Description

HD-MDA implements mixture discriminant analysis (MDA, Hastie & Tibshirani, 1996) with HD Gaussians instead of full Gaussians. Each class is assumed to be made of several class-specific groups in which the data live in low-dimensional subspaces. From a technical point of view, a clustering is performed within each class using hddc.
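A minimal sketch of this per-class clustering, using the package's bundled wine data (show = TRUE prints information about the within-class clustering):

# Fit HD-MDA: each class is clustered by hddc, here with up to 3 groups per class
data(wine)
out = hdmda(scale(wine[,-1]), wine[,1], K = 1:3, show = TRUE)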

Usage

hdmda(X,cls,K=1:10,model='AkjBkQkDk',show=FALSE,...)

Value

hdmda returns an 'hdmda' object which is a list containing:

alpha

Estimated prior probabilities for the classes.

prms

Estimated mixture parameters for each class.

kname

The name (level) of each class.
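A brief sketch of inspecting these components (the exact internal structure of prms is an assumption; it is documented only as the per-class mixture parameters):

# Fit on the bundled wine data and inspect the returned list
data(wine); cls = wine[,1]; X = scale(wine[,-1])
out = hdmda(X, cls, K = 1:3)
out$alpha          # estimated class prior probabilities
out$kname          # name (level) of each class
length(out$prms)   # one set of mixture parameters per class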

Arguments

X

A matrix or a data frame of observations, with observations in rows and variables in columns. Note that NAs are not allowed.

cls

A vector giving the class of each observation; it can be numeric or character.

K

A vector of integers specifying the numbers of clusters for which the BIC and the parameters are to be calculated; the parameters maximizing the BIC are kept. Note that the length of the vector K cannot be larger than 20. Default is 1:10.

model

A character string vector, or an integer vector, indicating the models to be used. The available models are: "AkjBkQkDk" (default), "AkBkQkDk", "ABkQkDk", "AkjBQkDk", "AkBQkDk", "ABQkDk", "AkjBkQkD", "AkBkQkD", "ABkQkD", "AkjBQkD", "AkBQkD", "ABQkD", "AjBQD", "ABQD". It is not case sensitive, and integers can be used instead of names; see Details for more information. If several models are given, only the result of the one maximizing the BIC criterion is kept. To run all models, use model="ALL".
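For instance, a sketch of model selection over several candidates (X and cls built from the bundled wine data):

# Compare three candidate models; only the BIC-best fit is kept
data(wine); cls = wine[,1]; X = scale(wine[,-1])
out = hdmda(X, cls, model = c("AkjBkQkDk", "AkBkQkDk", "ABkQkDk"))
# Or try all fourteen models at once
out_all = hdmda(X, cls, model = "ALL")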

show

Use show = TRUE to display some information related to the clustering.

...

Any argument that can be used by the function hddc.
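For example, hddc options can be forwarded through the dots; a sketch passing hddc's threshold (for the Cattell scree test) and d_select arguments, which exist in hddc but whose usefulness here is an assumption:

# Forward hddc options through '...': scree-test threshold and dimension selection
data(wine); cls = wine[,1]; X = scale(wine[,-1])
out = hdmda(X, cls, K = 1:3, threshold = 0.1, d_select = "Cattell")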

Author

Laurent Berge, Charles Bouveyron and Stephane Girard

Details

Some information on the meaning of the model names:

Akj are the parameters of the class subspaces:

  • if Akj: each class has its own parameters, with one parameter per dimension

  • if Ak: the classes have different parameters, but only one per class

  • if Aj: all the classes have the same parameter for each dimension (a particular case with a common orientation matrix)

  • if A: all classes share a single common parameter

Bk are the noise parameters of the class subspaces:

  • if Bk: each class has its own noise

  • if B: all classes have the same noise

Qk is the orientation matrix of each class:

  • if Qk: each class has its own orientation matrix

  • if Q: all classes have the same orientation matrix

Dk is the intrinsic dimension of each class:

  • if Dk: the dimensions are free and specific to each class

  • if D: the dimension is common to all classes

The model “ALL” will compute all the models, give their BIC and keep the model with the highest BIC value. Instead of writing the model names, they can also be specified using an integer: 1 represents the most general model (“AkjBkQkDk”) while 14 is the most constrained (“ABQD”); the full name/number correspondence is given below:

AkjBkQkDk    1     AkjBkQkD    7
AkBkQkDk     2     AkBkQkD     8
ABkQkDk      3     ABkQkD      9
AkjBQkDk     4     AkjBQkD    10
AkBQkDk      5     AkBQkD     11
ABQkDk       6     ABQkD      12
AjBQD       13     ABQD       14
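As an illustration, integer codes and names are interchangeable (and names are case-insensitive); a small sketch:

# model = 1 is the most general model, "AkjBkQkDk"
data(wine); cls = wine[,1]; X = scale(wine[,-1])
out1 = hdmda(X, cls, model = 1)
out2 = hdmda(X, cls, model = "akjbkqkdk")   # same model, case-insensitive name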

References

Bouveyron, C. and Brunet, C. (2014), “Model-based clustering of high-dimensional data: A review”, Computational Statistics and Data Analysis, vol. 71, pp. 52-78.

Bouveyron, C., Girard, S. and Schmid, C. (2007), “High Dimensional Discriminant Analysis”, Communications in Statistics: Theory and Methods, vol. 36 (14), pp. 2607-2623.

Bouveyron, C., Celeux, G. and Girard, S. (2011), “Intrinsic dimension estimation by maximum likelihood in probabilistic PCA”, Pattern Recognition Letters, vol. 32 (14), pp. 1706-1713.

Berge, L., Bouveyron, C. and Girard, S. (2012), “HDclassif: An R Package for Model-Based Clustering and Discriminant Analysis of High-Dimensional Data”, Journal of Statistical Software, 46(6), pp. 1-29, url: http://www.jstatsoft.org/v46/i06/.

Hastie, T. and Tibshirani, R. (1996), “Discriminant analysis by Gaussian mixtures”, Journal of the Royal Statistical Society, Series B (Methodological), pp. 155-176.

See Also

hdda, hddc

Examples

# Load the Wine data set
data(wine)
cls = wine[,1]; X = scale(wine[,-1])

# A simple use...
out = hdmda(X[1:100,],cls[1:100])
res = predict(out,X[101:nrow(X),])

# Comparison between hdmda and lda in a CV setup
library(MASS)   # provides lda
set.seed(123); nb = 10; Err = matrix(NA,2,nb)
for (i in 1:nb){
  cat('.')
  test = sample(nrow(X),50)
  out0 = lda(X[-test,],cls[-test])
  res0 = predict(out0,X[test,])
  Err[1,i] = sum(res0$class != cls[test]) / length(test)
  out = hdmda(X[-test,],cls[-test],K=1:3,model="AKJBQKDK")
  res = predict(out,X[test,])
  Err[2,i] = sum(res$class != cls[test]) / length(test)
}
cat('\n')
boxplot(t(Err),names=c('LDA','HD-MDA'),col=2:3,ylab="CV classification error",
  main='CV classification error on Wine data')
