`princomp` performs a principal components analysis on the given numeric data matrix and returns the results as an object of class `princomp`.

princomp(x, …)

# S3 method for formula
princomp(formula, data = NULL, subset, na.action, …)

# S3 method for default
princomp(x, cor = FALSE, scores = TRUE, covmat = NULL,
subset = rep_len(TRUE, nrow(as.matrix(x))), fix_sign = TRUE, …)

# S3 method for princomp
predict(object, newdata, …)

formula

a formula with no response variable, referring only to numeric variables.

data

an optional data frame (or similar: see `model.frame`) containing the variables in the formula `formula`. By default the variables are taken from `environment(formula)`.

subset

an optional vector used to select rows (observations) of the data matrix `x`.

na.action

a function which indicates what should happen when the data contain `NA`s.

x

a numeric matrix or data frame which provides the data for the principal components analysis.

cor

a logical value indicating whether the calculation should use the correlation matrix or the covariance matrix. (The correlation matrix can only be used if there are no constant variables.)

scores

a logical value indicating whether the score on each principal component should be calculated.

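covmat

a covariance matrix, or a covariance list as returned by `cov.wt`. If supplied, this is used rather than the covariance matrix of `x`.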

fix_sign

Should the signs of the loadings and scores be chosen so that the first element of each loading is non-negative?

…

arguments passed to or from other methods. If `x` is a formula one might specify `cor` or `scores`.

object

Object of class inheriting from `"princomp"`.

newdata

An optional data frame or matrix in which to look for variables with which to predict. If omitted, the scores are used. If the original fit used a formula or a data frame or a matrix with column names, `newdata` must contain columns with the same names. Otherwise it must contain the same number of columns, to be used in the same order.
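The `newdata` behaviour can be illustrated with a small sketch. The split of `USArrests` into fitting and new rows is an arbitrary choice made for illustration, and the manual projection is an assumption about what `predict` computes (centre and scale with the fitted values, then multiply by the loadings), checked here rather than asserted.

```r
# Fit on the first 40 states, then score the remaining 10 (arbitrary split).
fit_rows <- 1:40
new_rows <- 41:50
pc <- princomp(USArrests[fit_rows, ], cor = TRUE)

# newdata has the same column names as the fitted data, as required.
new_scores <- predict(pc, newdata = USArrests[new_rows, ])

# Manual projection: centre and scale using the fitted values,
# then project onto the loadings.
manual <- scale(as.matrix(USArrests[new_rows, ]),
                center = pc$center, scale = pc$scale) %*% pc$loadings

all.equal(unname(new_scores), unname(manual))
```

Calling `predict(pc)` without `newdata` returns the stored scores of the fitted data instead.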

`princomp` returns a list with class `"princomp"` containing the following components:

sdev

the standard deviations of the principal components.

loadings

the matrix of variable loadings (i.e., a matrix whose columns contain the eigenvectors). This is of class `"loadings"`: see `loadings` for its `print` method.

center

the means that were subtracted.

scale

the scalings applied to each variable.

n.obs

the number of observations.

scores

if `scores = TRUE`, the scores of the supplied data on the principal components. These are non-null only if `x` was supplied and, if `covmat` was also supplied, only if it was a covariance list. For the formula method, `napredict()` is applied to handle the treatment of values omitted by the `na.action`.

call

the matched call.

na.action

If relevant.
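As a quick orientation, the following sketch fits a model and inspects the returned list. The choice of `USArrests` and `cor = TRUE` is only for illustration; the component names accessed with `$` are those of the fitted object.

```r
pc <- princomp(USArrests, cor = TRUE)

names(pc)            # components of the returned list
class(pc$loadings)   # the loadings carry their own class
pc$n.obs             # number of observations used
dim(pc$scores)       # one row per observation, one column per component
pc$sdev^2            # variances of the principal components
```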

`princomp` is a generic function with `"formula"` and `"default"` methods.

The calculation is done using `eigen` on the correlation or covariance matrix, as determined by `cor`. This is done for compatibility with the S-PLUS result. A preferred method of calculation is to use `svd` on `x`, as is done in `prcomp`.
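The eigen-based calculation can be reproduced by hand. This is a minimal sketch for the covariance case (`cor = FALSE`), not the function's actual source: it assumes the covariance matrix is formed with divisor `N`, so the usual `N - 1` estimate from `cov` is rescaled before calling `eigen`.

```r
x <- as.matrix(USArrests)
n <- nrow(x)

# Covariance matrix with divisor N (cov() itself uses N - 1).
cv <- cov(x) * (n - 1) / n

# Eigen-decomposition of the symmetric covariance matrix.
ed <- eigen(cv, symmetric = TRUE)

pc <- princomp(USArrests)

# The component standard deviations should match the square roots
# of the eigenvalues.
all.equal(unname(pc$sdev), sqrt(ed$values))
```

The eigenvectors in `ed$vectors` correspond to the columns of `pc$loadings` up to sign, since the sign of an eigenvector is arbitrary.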

Note that the default calculation uses divisor `N` for the covariance matrix.
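One visible consequence of the divisor is a constant ratio between the standard deviations reported by `princomp` and those from `prcomp`, which divides by `N - 1`. The sketch below checks this for the unscaled covariance case only; the variable names are illustrative.

```r
n <- nrow(USArrests)

pc_eigen <- princomp(USArrests)  # covariance with divisor N
pc_svd   <- prcomp(USArrests)    # variances with divisor N - 1

# Ratio of the two sets of standard deviations, component by component.
ratio <- unname(pc_eigen$sdev) / pc_svd$sdev

# Each ratio should equal sqrt((N - 1) / N).
all.equal(ratio, rep(sqrt((n - 1) / n), length(ratio)))
```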

The `print` method for these objects prints the results in a nice format and the `plot` method produces a scree plot (`screeplot`). There is also a `biplot` method.

If `x` is a formula then the standard NA-handling is applied to the scores (if requested): see `napredict`.

`princomp` only handles so-called R-mode PCA, that is, feature extraction of variables. If a data matrix is supplied (possibly via a formula) it is required that there are at least as many units as variables. For Q-mode PCA use `prcomp`.
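The restriction on the number of units can be seen with a matrix that has more columns than rows. The simulated data below are purely illustrative; the sketch only checks that `princomp` signals an error in this situation while `prcomp` still returns a fit.

```r
set.seed(1)

# 5 units (rows) but 8 variables (columns): fewer units than variables.
wide <- matrix(rnorm(5 * 8), nrow = 5, ncol = 8)

# princomp() refuses this input; capture the condition instead of stopping.
res <- tryCatch(princomp(wide), error = function(e) e)
inherits(res, "error")

# prcomp() works on the same matrix via the SVD of the data.
pc <- prcomp(wide)
length(pc$sdev)
```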

Mardia, K. V., J. T. Kent and J. M. Bibby (1979).
*Multivariate Analysis*, London: Academic Press.

Venables, W. N. and B. D. Ripley (2002).
*Modern Applied Statistics with S*, Springer-Verlag.

`summary.princomp`, `screeplot`, `biplot.princomp`, `prcomp`, `cor`, `cov`, `eigen`.

require(graphics)

## The variances of the variables in the
## USArrests data vary by orders of magnitude, so scaling is appropriate
(pc.cr <- princomp(USArrests))  # inappropriate
princomp(USArrests, cor = TRUE) # =^= prcomp(USArrests, scale=TRUE)
## Similar, but different:
## The standard deviations differ by a factor of sqrt(49/50)

summary(pc.cr <- princomp(USArrests, cor = TRUE))
loadings(pc.cr)  # note that blank entries are small but not zero
## The signs of the columns of the loadings are arbitrary
plot(pc.cr) # shows a screeplot.
biplot(pc.cr)

## Formula interface
princomp(~ ., data = USArrests, cor = TRUE)

## NA-handling
USArrests[1, 2] <- NA
pc.cr <- princomp(~ Murder + Assault + UrbanPop,
                  data = USArrests, na.action = na.exclude, cor = TRUE)
pc.cr$scores[1:5, ]

## (Simple) Robust PCA:
## Classical:
(pc.cl <- princomp(stackloss))
## Robust:
(pc.rob <- princomp(stackloss, covmat = MASS::cov.rob(stackloss)))