Carries out model-based clustering or classification using some or all of the 14 parsimonious Skew-t clustering models (STPCM).
stpcm(data=NULL, G=1:3, mnames=NULL,
start=2, label=NULL,
veo=FALSE, da=c(1.0),
nmax=1000, atol=1e-8, mtol=1e-8, mmax=10, burn=5,
pprogress=FALSE, pwarning=TRUE,
stochastic = FALSE, latent_method="standard", seed=123)
An object of class vgpcm is a list with components:
A vector of integers indicating the maximum a posteriori classifications for the best model.
A list of all estimated models with parameters returned from the C++ call.
An object of class vgpcm_best containing the number of groups for the best model, the covariance structure, and the Bayesian Information Criterion (BIC) value.
The log-likelihood values from fitting the best model.
A matrix giving the raw values upon which map is based.
A G by mnames by 3 dimensional array with values pertaining to BIC calculations. (legacy)
A list object for each cluster pertaining to parameters. (legacy)
The type of object inputted into start.
If there were NAs in the original dataset, a vector of indices referencing the row of the imputed vectors is given.
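The relationship between the raw-value matrix and the maximum a posteriori (MAP) classification vector described above can be sketched in base R. This is only an illustration: it assumes the raw values are posterior group-membership probabilities (one row per observation, one column per group), and the names `z` and `map` and the numbers in the matrix are hypothetical, not output from the package.

```r
# Hypothetical posterior probability matrix (an assumption for illustration):
# 4 observations in rows, 3 groups in columns, each row summing to 1.
z <- matrix(c(0.70, 0.20, 0.10,
              0.10, 0.80, 0.10,
              0.30, 0.30, 0.40,
              0.05, 0.05, 0.90),
            ncol = 3, byrow = TRUE)

# MAP classification: for each observation, take the index of the group
# with the largest posterior probability.
map <- apply(z, 1, which.max)
map
```

Ties, if any, are resolved by `which.max` in favour of the first group; the package may handle them differently.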
An object of class stpcm_best is a list with components:
A string containing summarized information about the type of model estimated (covariance structure and number of groups).
An internal list containing all parameters returned from the C++ call.
Bayesian Information Criterion (positive scale, bigger is better).
Log-likelihood from the estimated model.
Number of parameters in the model.
The type of object inputted into start.
An integer representing the number of groups.
A string representing the type of covariance matrix (see 14 models).
Convergence status of the EM algorithm according to Aitken's acceleration.
A vector of integers indicating the maximum a posteriori classifications for the best model.
If there were NAs in the original dataset, a vector of indices referencing the row of the imputed vectors is given.
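As a rough illustration of the "positive scale, bigger is better" BIC convention in the list above, the sketch below computes BIC as twice the log-likelihood minus a penalty on the number of parameters. This is the usual form of that convention and is assumed here; it is not taken from the package source. The helper `bic_positive`, the data frame `fits`, and all numeric values are hypothetical.

```r
# Hypothetical helper (not part of the package): BIC on the positive scale,
# where larger values indicate a better trade-off between fit and complexity.
bic_positive <- function(loglik, npar, n) {
  2 * loglik - npar * log(n)
}

# Made-up log-likelihoods and parameter counts for two candidate models
# fitted to n = 2000 observations.
fits <- data.frame(model  = c("VVV", "EVE"),
                   loglik = c(-1498.7, -1520.4),
                   npar   = c(20, 11),
                   stringsAsFactors = FALSE)
fits$bic <- bic_positive(fits$loglik, fits$npar, n = 2000)

# On this scale the preferred model is the one with the largest BIC.
best <- fits$model[which.max(fits$bic)]
fits
best
```

With this sign convention, model selection picks the maximum rather than the minimum, which is why the more heavily parameterized model can lose even with a higher log-likelihood.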
All classes contain an internal list called model_obj or model_objs with the following components:
A posteriori matrix.
An integer representing the number of groups.
A vector of covariance matrices for each group.
A vector of location vectors for each group.
A vector containing skewness vectors for each group.
A vector containing estimated gamma parameters for each group.
The data x are either clustered or classified using skew-t mixture models with some or all of the 14 parsimonious covariance structures described in Celeux & Govaert (1995). The algorithms given by Celeux & Govaert (1995) are used for 12 of the 14 models; the "EVE" and "VVE" models use the algorithms given in Browne & McNicholas (2014). Starting values are very important to the successful operation of these algorithms, so care must be taken in the interpretation of results.
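The 14 covariance-structure names can be enumerated with a short base R sketch. It assumes the usual three-letter convention for this model family, in which the letters describe volume, shape, and orientation in that order, with "E" for equal across groups, "V" for variable, and "I" for identity. The object names `grid`, `valid`, and `mnames_all` are illustrative and not part of the package.

```r
# All combinations of volume (E/V), shape (I/E/V), and orientation (I/E/V).
grid <- expand.grid(volume      = c("E", "V"),
                    shape       = c("I", "E", "V"),
                    orientation = c("I", "E", "V"),
                    stringsAsFactors = FALSE)

# A spherical structure (identity shape) has no orientation to estimate,
# so shape "I" is only meaningful together with orientation "I".
valid <- !(grid$shape == "I" & grid$orientation != "I")

# Paste the three letters together to form the model names.
mnames_all <- with(grid[valid, ], paste0(volume, shape, orientation))
length(mnames_all)
sort(mnames_all)
```

A subset of these names, such as `c("VVV", "EVE")`, is what the `mnames` argument takes in the examples.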
McNicholas, P.D. (2016). Mixture Model-Based Classification. Boca Raton: Chapman & Hall/CRC Press.
Browne, R.P. and McNicholas, P.D. (2014). Estimating common principal components in high dimensions. Advances in Data Analysis and Classification 8(2), 217-226.
Wei, Y., Tang, Y. and McNicholas, P.D. (2019). Mixtures of generalized hyperbolic distributions and mixtures of skew-t distributions for model-based clustering with incomplete data. Computational Statistics and Data Analysis 130, 18-41.
Celeux, G. and Govaert, G. (1995). Gaussian parsimonious clustering models. Pattern Recognition 28(5), 781-793.
data("sx3")
if (FALSE) {
### estimate "VVV" "EVE"
ax = stpcm(sx3, G=1:3, mnames=c("VVV","EVE"), start=0)
summary(ax)
ax
### estimate all 14 covariance structures
ax = stpcm(sx3, G=1:3, mnames=NULL, start=0)
summary(ax)
ax
### model-based classification using known labels
sx3.label = c(rep(1,1000),rep(2,1000))
plot(sx3, col=sx3.label)
axl = stpcm(sx3, G=2, mnames=c("VVV", "EVE"), label=sx3.label)
summary(axl)
}