Robust Mixture Model
RobMM(X, nclust=2:5, model="Gaussian", ninit=10,
nitermax=50, niterEM=50, niterMC=50, df=3,
epsvp=10^(-4), mc_sample_size=1000, LogLike=-Inf,
init='genie', epsPi=10^-4, epsout=-20, scale='none',
alpha=0.75, c=ncol(X), w=2, epsilon=10^(-8),
criterion='BIC', methodMC="RobbinsMC", par=TRUE,
methodMCM="Weiszfeld")
A list with:
A list giving all the results for the best clustering (chosen with respect to the selected criterion).
A list containing all the results.
The ICL criterion for each candidate number of classes.
The BIC criterion for each candidate number of classes.
The initial data.
A vector of positive integers giving the possible number of clusters.
The number of clusters chosen by the selected criterion.
For the lists bestresult and allresults[[k]]:
A matrix whose rows are the centers of the classes.
A matrix containing the variances of all the classes.
The final LogLikelihood.
A matrix giving the probability that each observation belongs to each class.
The number of iterations of the EM algorithm.
A vector giving the initialized clustering if init='Mclust' or init='genie'.
A vector giving the proportion of each class.
A vector giving the detected outliers.
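As a minimal sketch of how the membership-probability matrix described above can be turned into hard cluster labels, the base-R snippet below uses a small made-up matrix `P` (hypothetical, not produced by RobMM). Taking the column means of `P` as class proportions is the usual EM convention and is an assumption here, not a statement about how RobMM computes its proportions.

```r
# Hypothetical membership probabilities: 4 observations (rows), 2 classes (columns)
P <- matrix(c(0.90, 0.10,
              0.20, 0.80,
              0.60, 0.40,
              0.05, 0.95), ncol = 2, byrow = TRUE)

# Hard assignment: index of the most probable class in each row
labels <- max.col(P, ties.method = "first")   # 1 2 1 2

# Soft class proportions under the usual EM convention (assumption)
prop <- colMeans(P)
```

Each row of `P` sums to one, so the proportions obtained this way also sum to one.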
A matrix giving the data.
A vector of positive integers giving the possible number of clusters.
The mixture model. Can be 'Gaussian' (by default), 'Student' and 'Laplace'.
The number of random initializations. Default is 10.
The number of iterations for the Weiszfeld algorithm if methodMCM='Weiszfeld'.
The number of iterations for the EM algorithm.
The number of iterations for estimating robustly the variance of each class if methodMC='FixMC' or methodMC='GradMC'.
The degrees of freedom for the Student law if model='Student'.
Run the algorithm on scaled data if scale='robust'.
The minimum value that the estimated eigenvalues of the Median Covariation Matrix can take. Default is 10^-4.
The number of samples generated for the Monte-Carlo method for estimating robustly the variance.
The initial loglikelihood to "beat". Default is -Inf.
Can be F if no non-random initialization is performed, 'genie' if the algorithm is initialized with the function 'genie' of the package genieclust, or 'Mclust' if it is initialized with the function hclass of the package mclust.
A scalar ensuring that the estimated probabilities of belonging to a class are uniformly lower bounded by a positive constant.
If the probability that an observation belongs to a class is smaller than exp(epsout), this probability is replaced by exp(epsout) when calculating the logLikelihood. If the probability is too small for every class, the observation is considered an outlier. Default is -20.
A scalar between 1/2 and 1 used in the step sequence for the Robbins-Monro method if methodMC='RobbinsMC'.
The constant in the step sequence if methodMC='RobbinsMC' or methodMC='GradMC'.
The power for the weighted averaged Robbins-Monro algorithm if methodMC='RobbinsMC'.
Stopping condition for the Weiszfeld algorithm.
The criterion for selecting the number of clusters. Can be 'ICL' or 'BIC'; the usage above sets 'BIC' as the default.
The method chosen to estimate robustly the variance. Can be 'RobbinsMC', 'GradMC' or 'FixMC'.
Set to T to allow the parallelization of the algorithm.
The method chosen for estimating the Median Covariation Matrix. Can be 'Gmedian' or 'Weiszfeld'.
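To make the roles of nitermax and epsilon more concrete, here is a minimal base-R sketch of a Weiszfeld iteration for the geometric median of a point cloud. It is a textbook version written for illustration: the function name `weiszfeld_median`, the choice of the mean as starting point and the exact stopping rule are assumptions, and RobMM applies its own implementation to the Median Covariation Matrix rather than to raw points.

```r
# Illustrative Weiszfeld iteration for the geometric median (hypothetical helper)
weiszfeld_median <- function(X, nitermax = 50, epsilon = 1e-8) {
  X <- as.matrix(X)
  m <- colMeans(X)                               # start from the ordinary mean
  for (iter in seq_len(nitermax)) {
    d <- sqrt(rowSums(sweep(X, 2, m)^2))         # distance of each row to m
    d <- pmax(d, .Machine$double.eps)            # avoid division by zero
    w <- 1 / d                                   # inverse-distance weights
    m_new <- colSums(X * w) / sum(w)             # weighted mean of the rows
    converged <- sqrt(sum((m_new - m)^2)) < epsilon
    m <- m_new
    if (converged) break                         # assumed stopping rule
  }
  m
}

# 100 standard Gaussian points plus one gross outlier
set.seed(1)
X <- rbind(matrix(rnorm(200), ncol = 2), c(100, 100))
med <- weiszfeld_median(X)
avg <- colMeans(X)
```

On this data the mean `avg` is pulled toward the outlier while `med` stays near the origin, which is the robustness property that motivates median-based estimates.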
Cardot, H., Cenac, P. and Zitt, P-A. (2013). Efficient and fast estimation of the geometric median in Hilbert spaces with an averaged stochastic gradient algorithm. Bernoulli, 19, 18-43.
Cardot, H. and Godichon-Baggioni, A. (2017). Fast Estimation of the Median Covariation Matrix with Application to Online Robust Principal Components Analysis. Test, 26(3), 461-480.
Vardi, Y. and Zhang, C.-H. (2000). The multivariate L1-median and associated data depth. Proc. Natl. Acad. Sci. USA, 97(4), 1423-1426.
See also Gen_MM, RMMplot and RobVar.
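if (FALSE) {
# Simulate data from a mixture with three centers in dimension 3
ech <- Gen_MM(mu = matrix(c(rep(-2, 3), rep(2, 3), rep(0, 3)), byrow = TRUE, nrow = 3))
X <- ech$X
# Fit the robust mixture model with the number of clusters fixed at 3
res <- RobMM(X, nclust = 3)
# Draw the 'Two_Dim' graph of the fitted model
RMMplot(res, graph = c('Two_Dim'))
}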
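The epsout rule described in the arguments can also be sketched in base R. The snippet below is an illustration under stated assumptions, not the package's code: it reads "probability" as a per-class density value, and the matrix `dens` and the proportions `prop` are made-up inputs.

```r
epsout <- -20

# Hypothetical per-class density values: 3 observations (rows), 2 classes (columns)
dens <- matrix(c(1e-2,  1e-12,
                 1e-15, 1e-11,
                 3e-1,  2e-3), ncol = 2, byrow = TRUE)

floor_val <- exp(epsout)                         # lower bound, about 2.06e-9

# An observation is flagged when its value is below the bound for every class
is_outlier <- apply(dens < floor_val, 1, all)

# Values below the bound are replaced by the bound before taking logs
dens_floored <- pmax(dens, floor_val)

prop <- c(0.5, 0.5)                              # hypothetical class proportions
loglik <- sum(log(dens_floored %*% prop))        # stays finite thanks to the floor
```

Here only the second observation is flagged, and flooring keeps the log-likelihood finite even though some raw values are extremely small.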