
Computes an approximation of the PP-estimators for PCA using the grid search algorithm in the plane.
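The idea of a grid search in the plane can be illustrated with a small base R sketch. It is a simplified illustration for two variables only, not the algorithm as implemented in the package: it scans a grid of angles and keeps the direction that maximises a robust scale (here the MAD) of the projected data. The function name `grid_direction_2d`, its arguments and the simulated data are assumptions made for this example.

```r
# Simplified illustration only (hypothetical helper, not part of rrcov or pcaPP):
# find the direction in the plane that maximises a robust scale of the
# projected data by scanning a grid of angles.
grid_direction_2d <- function(x, n_angles = 180, scale_fn = mad) {
  stopifnot(is.matrix(x), ncol(x) == 2)
  # centre with the coordinate-wise median (a robust location estimate)
  xc <- sweep(x, 2, apply(x, 2, median))
  # candidate angles cover a half circle; directions a and -a are equivalent
  angles <- seq(-pi / 2, pi / 2, length.out = n_angles + 1)[-1]
  objective <- vapply(angles, function(a) {
    scale_fn(drop(xc %*% c(cos(a), sin(a))))
  }, numeric(1))
  best <- angles[which.max(objective)]
  list(direction = c(cos(best), sin(best)),
       scale = max(objective),
       angle = best)
}

set.seed(1)
# clean data: large spread along the first axis
clean <- cbind(rnorm(200, sd = 3), rnorm(200, sd = 1))
# a small cluster of outliers far away along the second axis
outliers <- cbind(rnorm(10, sd = 0.5), rnorm(10, mean = 25, sd = 0.5))
x2 <- rbind(clean, outliers)

fit <- grid_direction_2d(x2)
fit$direction              # robust direction, dominated by the first axis
prcomp(x2)$rotation[, 1]   # classical first component, dominated by the outliers
```

With more than two variables, a projection-pursuit procedure would have to apply such planar searches repeatedly; see the references for the actual algorithm.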
PcaGrid(x, ...)
# S3 method for default
PcaGrid(x, k = 0, kmax = ncol(x),
scale=FALSE, na.action = na.fail, crit.pca.distances = 0.975, trace=FALSE, ...)
# S3 method for formula
PcaGrid(formula, data = NULL, subset, na.action, ...)
An S4 object of class PcaGrid-class, which is a subclass of the virtual class PcaRobust-class.
formula: a formula with no response variable, referring only to numeric variables.
data: an optional data frame (or similar: see model.frame) containing the variables in the formula formula.
subset: an optional vector used to select rows (observations) of the data matrix x.
na.action: a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The 'factory-fresh' default is na.omit.
...: arguments passed to or from other methods.
x: a numeric matrix (or data frame) which provides the data for the principal components analysis.
k: number of principal components to compute. If k is missing, or k = 0, it is set to the number of columns of the data. It is preferable to inspect the scree plot in order to choose the number of components and then run the analysis again. Default is k = 0.
kmax: maximal number of principal components to compute. Default is kmax = ncol(x), as given in the usage above. If k is provided, kmax does not need to be specified.
scale: a value indicating whether and how the variables should be scaled. If scale = FALSE (default) or scale = NULL, no scaling is performed (a vector of 1s is returned in the scale slot). If scale = TRUE, the data are scaled to have unit variance. Alternatively, it can be a function like sd or mad, or a vector of length equal to the number of columns of x. The value is passed to the underlying function and the result returned is stored in the scale slot. Default is scale = FALSE.
crit.pca.distances: criterion to use for computing the cutoff values for the orthogonal and score distances. Default is 0.975.
trace: whether to print intermediate results. Default is trace = FALSE.
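Two of the arguments above can be made more concrete with a base R sketch. The helper `resolve_scale` is hypothetical and only mirrors the description of scale given here; it is not the package's code. The helper `score_cutoff` assumes the common robust PCA convention of taking the cutoff for score distances as the square root of a chi-square quantile with k degrees of freedom; treat that as an assumption and check the package documentation for the exact rule, including the one used for orthogonal distances.

```r
# Hypothetical helper (not part of rrcov): turn a `scale` argument of the
# kinds described above into one scale value per column of x.
resolve_scale <- function(x, scale = FALSE) {
  p <- ncol(x)
  if (is.null(scale) || identical(scale, FALSE)) {
    rep(1, p)                      # no scaling: a vector of 1s
  } else if (identical(scale, TRUE)) {
    apply(x, 2, sd)                # scale to unit variance
  } else if (is.function(scale)) {
    apply(x, 2, scale)             # e.g. sd or mad, applied column-wise
  } else if (is.numeric(scale) && length(scale) == p) {
    scale                          # user-supplied scale values
  } else {
    stop("'scale' must be logical, NULL, a function, ",
         "or a numeric vector of length ncol(x)")
  }
}

x <- as.matrix(iris[, 1:4])
resolve_scale(x)          # 1 1 1 1
resolve_scale(x, TRUE)    # column standard deviations
resolve_scale(x, mad)     # column MADs

# Assumption: score-distance cutoff as the square root of a chi-square
# quantile with k degrees of freedom.
score_cutoff <- function(k, crit = 0.975) sqrt(qchisq(crit, df = k))
score_cutoff(2)
```

Observations whose score distance exceeds such a cutoff would typically be flagged as outlying within the PCA subspace.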
Valentin Todorov valentin.todorov@chello.at
PcaGrid, serving as a constructor for objects of class PcaGrid-class, is a generic function with "formula" and "default" methods. For details see PCAgrid and the relevant references.
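The split into a "formula" and a "default" method follows a common R pattern, sketched below in base R. The generic `pca_sketch` is hypothetical and uses classical prcomp as a stand-in for the robust estimator; it only shows how a formula method can build the data matrix with model.frame and then delegate to the default method, and is not the rrcov implementation.

```r
# Hypothetical generic, named pca_sketch to avoid confusion with rrcov.
pca_sketch <- function(x, ...) UseMethod("pca_sketch")

# Default method: works on a numeric matrix or data frame.
pca_sketch.default <- function(x, k = 0, ...) {
  x <- as.matrix(x)
  if (k <= 0) k <- ncol(x)         # k = 0 means "use all columns"
  fit <- prcomp(x)                 # classical PCA as a stand-in
  list(loadings = fit$rotation[, seq_len(k), drop = FALSE],
       scores   = fit$x[, seq_len(k), drop = FALSE])
}

# Formula method: build the data matrix, then call the default method.
pca_sketch.formula <- function(formula, data = NULL, subset, na.action, ...) {
  mt <- terms(formula, data = data)
  if (attr(mt, "response") > 0L) stop("response not allowed in formula")
  # re-use the call to evaluate model.frame() with formula, data,
  # subset and na.action, dropping everything passed through `...`
  mf <- match.call(expand.dots = FALSE)
  mf$... <- NULL
  mf[[1L]] <- quote(stats::model.frame)
  mf <- eval.parent(mf)
  attr(mt, "intercept") <- 0L      # no intercept column in the data matrix
  x <- model.matrix(mt, mf)
  pca_sketch.default(x, ...)
}

fit <- pca_sketch(~ Sepal.Length + Sepal.Width + Petal.Length,
                  data = iris, subset = Species == "setosa", k = 2)
dim(fit$scores)   # 50 2
```

Arguments such as k travel through `...` from the formula method to the default method, which is why the formula signature itself stays short.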
C. Croux, P. Filzmoser, M. Oliveira, (2007). Algorithms for Projection-Pursuit Robust Principal Component Analysis, Chemometrics and Intelligent Laboratory Systems, 87, 225.
Todorov V & Filzmoser P (2009), An Object Oriented Framework for Robust Multivariate Analysis. Journal of Statistical Software, 32(3), 1-47. doi:10.18637/jss.v032.i03.
# multivariate data with outliers
library(rrcov)    # provides PcaGrid and PcaClassic
library(mvtnorm)  # provides rmvnorm
x <- rbind(rmvnorm(200, rep(0, 6), diag(c(5, rep(1, 5)))),
           rmvnorm( 15, c(0, rep(20, 5)), diag(rep(1, 6))))
# Here we calculate the principal components with PcaGrid
pc <- PcaGrid(x, 6)
# we could draw a biplot too:
biplot(pc)
# we could use another objective function, and
# maybe only calculate the first three principal components:
pc <- PcaGrid(x, 3, method="qn")
biplot(pc)
# now we want to compare the results with the non-robust principal components
pc <- PcaClassic(x, k=3)
# again, a biplot for comparison:
biplot(pc)