Compute and predict the distances to class centroids
This function computes the class centroids and covariance matrices for a training set. These are used to determine the Mahalanobis distance of each sample to each class centroid.
## S3 method for class 'default':
classDist(x, y, groups = 5, pca = FALSE, keep = NULL, ...)
## S3 method for class 'classDist':
predict(object, newdata, trans = log, ...)
For factor outcomes, the data are split into groups for each class
and the mean and covariance matrix are calculated. These are then
used to compute Mahalanobis distances to the class centers (using
predict.classDist). The function checks that the covariance matrices are non-singular.
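The per-class calculation described above can be sketched in base R. This is an illustration only, not the internals of classDist: the names class_stats and dists are hypothetical, and the column prefix is added by hand. Note that stats::mahalanobis returns squared distances.

```r
# Base-R sketch of per-class centroids and Mahalanobis distances.
# Hypothetical illustration; not the internals of classDist.
x <- as.matrix(iris[, 1:4])
y <- iris$Species

# One centroid and one inverse covariance matrix per class
class_stats <- lapply(split(as.data.frame(x), y), function(d) {
  d <- as.matrix(d)
  # solve() raises an error if the covariance matrix is singular
  list(means = colMeans(d), inv_cov = solve(cov(d)))
})

# Squared Mahalanobis distance of every sample to every class centroid;
# inverted = TRUE tells mahalanobis() that cov is already inverted
dists <- sapply(class_stats, function(s) {
  mahalanobis(x, center = s$means, cov = s$inv_cov, inverted = TRUE)
})
colnames(dists) <- paste0("dist.", names(class_stats))
head(dists)
```

Storing the inverse once per class avoids repeating the inversion for each new batch of samples.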
For numeric outcomes, the data are split into roughly equal-sized
bins, with the number of bins set by groups. Percentiles are used to split the data.
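Percentile-based binning of a numeric outcome can be sketched as follows. This is a minimal base-R illustration, assuming evenly spaced percentiles as cut points. The exact rule classDist applies may differ, and the variable names here are hypothetical.

```r
# Sketch: percentile-based binning of a numeric outcome.
# Assumes evenly spaced percentiles; not necessarily classDist's exact rule.
y <- iris$Sepal.Length
groups <- 5

# groups + 1 cut points, from the minimum (0%) to the maximum (100%)
probs <- seq(0, 1, length.out = groups + 1)
breaks <- quantile(y, probs = probs)

# include.lowest = TRUE keeps the minimum value in the first bin
bins <- cut(y, breaks = breaks, include.lowest = TRUE)
table(bins)
```

Ties in the data mean the bins are only roughly equal in size, and cut() raises an error if two percentiles coincide.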
For classDist, an object of class classDist with elements:
values: a list with elements for each class. Each element contains a mean vector for the class centroid and the inverse of the class covariance matrix
classes: a character vector of class labels
pca: the results of the principal component analysis when pca = TRUE
call: the function call
p: the number of variables
n: a vector of sample sizes per class
For predict.classDist, a matrix with one column per class. The column names are the class names with the prefix "dist.". In the case of numeric y, the class labels are the percentiles. For example, with groups = 9, each column name is "dist." followed by a percentile label.
Forina et al. CAIMAN brothers: A family of powerful classification and class modeling techniques. Chemometrics and Intelligent Laboratory Systems (2009) vol. 96 (2) pp. 239-245
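library(caret)  # provides classDist; also attaches lattice for splom
# Randomly pick 100 of the 150 iris rows for training
trainSet <- sample(1:150, 100)
# Class centroids and inverse covariance matrices from the training rows
distData <- classDist(iris[trainSet, 1:4], iris$Species[trainSet])
# Distances of the held-out rows to each class centroid
newDist <- predict(distData, iris[-trainSet, 1:4])
# Scatterplot matrix of the distances, colored by true species
splom(newDist, groups = iris$Species[-trainSet])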