Fits a mixture of multiple scaled Student-t distributions to the given data.
mst(X,k,ini="km",sz=NULL,df.min=1,dfU="num",frm="dir",m="BFGS",stop=c(10^-5,200),VB=FALSE)
Data used for clustering.
The number of observations in the data.
The number of features in the data.
Value corresponding to the number of components.
Vector of group membership as determined by the model.
Indicator of whether each point is bad with respect to each principal component, given the cluster membership.
The number of parameters.
Either a vector of length d, representing the mean value, or a matrix whose rows represent different mean vectors; if it is a matrix, its dimensions must match those of x.
Orthogonal matrix whose columns are the normalized eigenvectors of Sigma.
Diagonal matrix of the eigenvalues of Sigma.
A symmetric positive-definite matrix representing the scale matrix of the distribution.
Vector containing the degrees of freedom for each component.
The component membership of each observation.
Indicator of whether an observation is good or bad with respect to each dimension; 1 means good and 0 means bad.
The matrix of expected values of the characteristic weights; corresponds to the value of v+(1-v)/eta.
The number of iterations until convergence for the model.
The log-likelihood corresponding to the model.
Akaike's Information Criterion of the model.
The Bayesian Information Criterion of the model.
The Integrated Completed Likelihood of the model.
The Kullback Information Criterion of the model.
The bias-corrected Kullback Information Criterion of the model.
The Approximate Weight of Evidence of the model.
Another version of Akaike's Information Criterion of the model.
The Consistent Akaike's Information Criterion of the model.
The AIC version which is used when sample size n is small relative to d.
The Classification Likelihood Criterion of the model.
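As a rough illustration of how criteria such as AIC and BIC relate to the log-likelihood, the number of parameters, and the number of observations listed above, here is a minimal base R sketch. The numbers and the variable names (loglik, npar, n) are made up for illustration, and the textbook "smaller is better" sign convention is an assumption; the fitted model may report these criteria with a different sign or scaling.

```r
# Hypothetical values for illustration only; not output from a fitted model
loglik <- -512.3   # maximized log-likelihood
npar   <- 17       # number of free parameters
n      <- 200      # number of observations

# Textbook definitions (assumed convention: smaller values are better)
aic <- -2 * loglik + 2 * npar
bic <- -2 * loglik + npar * log(n)

# Collect both criteria in a named vector for comparison across models
c(AIC = aic, BIC = bic)
```

Both criteria penalize the log-likelihood by the number of parameters; BIC's penalty also grows with the sample size, so it tends to favor smaller models than AIC when n is large.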
A matrix or data frame such that rows correspond to observations and columns correspond to variables.
The number of clusters.
Initialization method: k-means by default, "pam" for partitioning around medoids, "mclust" for Gaussian mixture models, "random.soft" or "random.hard" for random initialization, or "manual"; if "manual", a partition (sz) must be provided.
If initialization is manual, this matrix contains the starting value for z.
Minimum proportion of good points in each group for the contaminated normal distribution.
Criterion to update the degrees of freedom.
Technique used to compute the density function: direct (default) or indirect.
Method for the optimization of the eigenvector matrix; see optim() for other options.
Vector of length 2 containing the tolerance for the Aitken stopping criterion and the maximum number of iterations.
If TRUE, tracing information on the progress of the optimization is produced (see optim() for details) and the log-likelihood is plotted against the iterations.
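The Aitken stopping rule mentioned in the arguments above can be sketched in base R as follows. This is one common variant of the Aitken acceleration criterion, not necessarily the exact rule implemented by mst; the function name aitken_converged and its arguments are hypothetical.

```r
# Aitken-based convergence check on the last three log-likelihood values.
# ll:  numeric vector of log-likelihoods, one value per iteration
# tol: tolerance (plays the role of the first element of the stopping vector)
# NOTE: illustrative helper, not part of the package.
aitken_converged <- function(ll, tol = 1e-5) {
  k <- length(ll)
  if (k < 3) return(FALSE)            # three values are needed
  num <- ll[k] - ll[k - 1]            # latest increment
  den <- ll[k - 1] - ll[k - 2]        # previous increment
  if (den == 0) return(TRUE)          # log-likelihood has stopped changing
  a <- num / den                      # Aitken acceleration factor
  if (a == 1) return(FALSE)           # asymptotic estimate undefined
  l_inf <- ll[k - 1] + num / (1 - a)  # estimated limit of the log-likelihood
  abs(l_inf - ll[k]) < tol
}

# Made-up log-likelihood sequence converging geometrically to -100
ll <- -100 - 10 * 0.5^(0:25)
aitken_converged(ll[1:3])   # early iterations
aitken_converged(ll)        # after many iterations
```

In an iterative fit, a check of this kind would typically be combined with a cap on the number of iterations, which is the role of the second element of the stopping vector.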
Cristina Tortora and Antonio Punzo
Forbes, F. & Wraith, D. (2014). A new family of multivariate heavy-tailed distributions with variable marginal amounts of tailweight: application to robust clustering. Statistics and Computing, 24(6), 971-984.
## Not run:
if (FALSE) {
  data(sim)
  result <- mst(X = sim, k = 2)
  plot(result)
}
## End(Not run)