
Last chance! 50% off unlimited learning
Sale ends in
scordis
calculates the score distances (SD) for a PCA or PLS model. SD is the Mahalanobis distance of the projection of a row observation on the score plan to the center of the score space.
A distance cutoff is computed using a moment estimation of the parameters of a Chi-squared distrbution for SD^2 (see e.g. Pomerantzev 2008). In the function output, column dstand
is a standardized distance defined as dstand > 1
can be considered as extreme.
The Winisi "GH" is also provided (usually considered as extreme if GH > 3).
scordis(
object, X = NULL,
nlv = NULL,
rob = TRUE, alpha = .01
)
matrix with distances, standardized distances and Winisi "GH", for the training set.
matrix with distances, standardized distances and Winisi "GH", for new X-data if any.
cutoff value
A fitted model, output of a call to a fitting function (for example from pcasvd
, plskern
,...).
New X-data.
Number of components (PCs or LVs) to consider.
Logical. If TRUE
, the moment estimation of the distance cutoff is robustified. This can be recommended after robust PCA or PLS on small data sets containing extreme values.
Risk-
M. Hubert, P. J. Rousseeuw, K. Vanden Branden (2005). ROBPCA: a new approach to robust principal components analysis. Technometrics, 47, 64-79.
Pomerantsev, A.L., 2008. Acceptance areas for multivariate classification derived by projection methods. Journal of Chemometrics 22, 601-609. https://doi.org/10.1002/cem.1147
n <- 6 ; p <- 4
Xtrain <- matrix(rnorm(n * p), ncol = p)
ytrain <- rnorm(n)
Xtest <- Xtrain[1:3, , drop = FALSE]
nlv <- 3
fm <- pcasvd(Xtrain, nlv = nlv)
scordis(fm)
scordis(fm, nlv = 2)
scordis(fm, Xtest, nlv = 2)
Run the code above in your browser using DataLab