# princomp

##### Principal Components Analysis

`princomp` performs a principal components analysis on the given numeric data matrix and returns the results as an object of class `princomp`.

- Keywords
- multivariate

##### Usage

```
princomp(x, ...)
## S3 method for class 'formula'
princomp(formula, data = NULL, subset, na.action, ...)
## Default S3 method:
princomp(x, cor = FALSE, scores = TRUE, covmat = NULL, subset = rep_len(TRUE, nrow(as.matrix(x))), ...)
## S3 method for class 'princomp'
predict(object, newdata, ...)
```

##### Arguments

- formula
- a formula with no response variable, referring only to numeric variables.
- data
- an optional data frame (or similar: see `model.frame`) containing the variables in the formula `formula`. By default the variables are taken from `environment(formula)`.
- subset
- an optional vector used to select rows (observations) of the data matrix `x`.
- na.action
- a function which indicates what should happen when the data contain `NA`s. The default is set by the `na.action` setting of `options`, and is `na.fail` if that is unset. The ‘factory-fresh’ default is `na.omit`.
- x
- a numeric matrix or data frame which provides the data for the principal components analysis.
- cor
- a logical value indicating whether the calculation should use the correlation matrix or the covariance matrix. (The correlation matrix can only be used if there are no constant variables.)
- scores
- a logical value indicating whether the score on each principal component should be calculated.
- covmat
- a covariance matrix, or a covariance list as returned by `cov.wt` (and `cov.mve` or `cov.mcd` from package MASS). If supplied, this is used rather than the covariance matrix of `x`.
- ...
- arguments passed to or from other methods. If `x` is a formula one might specify `cor` or `scores`.
- object
- Object of class inheriting from `"princomp"`.
- newdata
- An optional data frame or matrix in which to look for variables with which to predict. If omitted, the scores are used. If the original fit used a formula or a data frame or a matrix with column names, `newdata` must contain columns with the same names. Otherwise it must contain the same number of columns, to be used in the same order.
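The `object` and `newdata` arguments of the `predict` method can be illustrated with a small sketch. The train/test split of `USArrests` below is purely hypothetical and chosen only for illustration; the names `train`, `test`, `fit` and `new_scores` are not part of the package.

```r
# Hypothetical split: fit on the first 40 states, project the remaining 10
train <- USArrests[1:40, ]
test  <- USArrests[41:50, ]

fit <- princomp(train, cor = TRUE)

# newdata must carry the same column names as the data used for the fit
new_scores <- predict(fit, newdata = test)
dim(new_scores)

# With newdata omitted, predict() returns the scores stored in the fit
all.equal(predict(fit), fit$scores)
```

The new observations are centred and scaled with the values stored in `fit$center` and `fit$scale`, so they land in the same coordinate system as the original scores.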

##### Details

`princomp` is a generic function with `"formula"` and `"default"` methods.

The calculation is done using `eigen` on the correlation or covariance matrix, as determined by `cor`. This is done for compatibility with the S-PLUS result. A preferred method of calculation is to use `svd` on `x`, as is done in `prcomp`.

Note that the default calculation uses divisor `N` for the covariance matrix, rather than the `N - 1` used by `cov`.
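The two points above (the use of `eigen` and the divisor `N`) can be checked by hand. The following is a minimal sketch, not the internal implementation: it rebuilds the divisor-`N` covariance matrix from `cov`, decomposes it with `eigen`, and compares the result with `princomp` and `prcomp`. The variable names are illustrative only.

```r
X <- as.matrix(USArrests)
n <- nrow(X)

# Covariance matrix with divisor N instead of N - 1
S <- cov(X) * (n - 1) / n
e <- eigen(S, symmetric = TRUE)

pc <- princomp(USArrests)

# The standard deviations are the square roots of the eigenvalues
all.equal(unname(pc$sdev), sqrt(e$values))

# The loadings match the eigenvectors up to the sign of each column
all.equal(abs(unclass(pc$loadings)), abs(e$vectors), check.attributes = FALSE)

# prcomp() uses divisor N - 1, so its sdev is larger by sqrt(n / (n - 1))
all.equal(unname(pc$sdev) * sqrt(n / (n - 1)), prcomp(USArrests)$sdev)
```

The loadings are compared in absolute value because the sign of each eigenvector is arbitrary.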

The `print` method for these objects prints the results in a nice format and the `plot` method produces a scree plot (`screeplot`). There is also a `biplot` method.

If `x` is a formula then the standard NA-handling is applied to the scores (if requested): see `napredict`.

`princomp` only handles so-called R-mode PCA, that is feature extraction of variables. If a data matrix is supplied (possibly via a formula) it is required that there are at least as many units as variables. For Q-mode PCA use `prcomp`.
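The restriction on the shape of the data can be seen with a small simulated matrix that has fewer units than variables. This is a sketch on made-up random data; the names `wide`, `res` and `p` are illustrative.

```r
set.seed(1)
wide <- matrix(rnorm(3 * 5), nrow = 3, ncol = 5)  # 3 units, 5 variables

# princomp() signals an error here; capture its message instead of stopping
res <- tryCatch(princomp(wide), error = function(e) conditionMessage(e))
res

# prcomp() accepts the same matrix
p <- prcomp(wide)
length(p$sdev)
```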

##### Value

`princomp` returns a list with class `"princomp"` containing the following components:

- sdev
- the standard deviations of the principal components.
- loadings
- the matrix of variable loadings (i.e., a matrix whose columns contain the eigenvectors). This is of class `"loadings"`: see `loadings` for its `print` method.
- center
- the means that were subtracted.
- scale
- the scalings applied to each variable.
- n.obs
- the number of observations.
- scores
- if `scores = TRUE`, the scores of the supplied data on the principal components. These are non-null only if `x` was supplied, and if `covmat` was also supplied if it was a covariance list. For the formula method, `napredict()` is applied to handle the treatment of values omitted by the `na.action`.
- call
- the matched call.
- na.action
- If relevant.

##### Note

The signs of the columns of the loadings and scores are arbitrary, and so may differ between different programs for PCA, and even between different builds of R.
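When results need to be compared across programs or builds, one option is to fix the signs by a convention of your own. The helper below is hypothetical (`align_signs` is not part of the stats package) and uses one convention among several: each column is flipped so that its largest-magnitude loading is positive, and the scores are flipped to match.

```r
# Hypothetical helper, not part of the stats package
align_signs <- function(fit) {
  L <- unclass(fit$loadings)
  # +1 or -1 per component, taken from the entry of largest absolute value
  flip <- apply(L, 2, function(v) sign(v[which.max(abs(v))]))
  L <- sweep(L, 2, flip, `*`)
  class(L) <- "loadings"
  fit$loadings <- L
  # Flip the scores the same way so they stay consistent with the loadings
  if (!is.null(fit$scores)) fit$scores <- sweep(fit$scores, 2, flip, `*`)
  fit
}

fit <- align_signs(princomp(USArrests, cor = TRUE))
loadings(fit)
```

Flipping a column of the loadings together with the matching column of the scores leaves the fitted decomposition unchanged, so `sdev` is not affected.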

##### References

Mardia, K. V., J. T. Kent and J. M. Bibby (1979).
*Multivariate Analysis*, London: Academic Press.

Venables, W. N. and B. D. Ripley (2002).
*Modern Applied Statistics with S*, Springer-Verlag.

##### See Also

`summary.princomp`, `screeplot`, `biplot.princomp`, `prcomp`, `cor`, `cov`, `eigen`.

##### Examples

`library(stats)`

```
require(graphics)
## The variances of the variables in the
## USArrests data vary by orders of magnitude, so scaling is appropriate
(pc.cr <- princomp(USArrests)) # inappropriate
princomp(USArrests, cor = TRUE) # =^= prcomp(USArrests, scale=TRUE)
## Similar, but different:
## The standard deviations differ by a factor of sqrt(49/50)
summary(pc.cr <- princomp(USArrests, cor = TRUE))
loadings(pc.cr) # note that blank entries are small but not zero
## The signs of the columns are arbitrary
plot(pc.cr) # shows a screeplot.
biplot(pc.cr)
## Formula interface
princomp(~ ., data = USArrests, cor = TRUE)
## NA-handling
USArrests[1, 2] <- NA
pc.cr <- princomp(~ Murder + Assault + UrbanPop,
                  data = USArrests, na.action = na.exclude, cor = TRUE)
pc.cr$scores[1:5, ]
## (Simple) Robust PCA:
## Classical:
(pc.cl <- princomp(stackloss))
## Robust:
(pc.rob <- princomp(stackloss, covmat = MASS::cov.rob(stackloss)))
```
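As a complement to the examples above, the following sketch shows the `covmat` argument used on its own with a covariance list from `cov.wt`. Because no `x` is supplied, the `scores` component is `NULL`, as described under Value. The name `fit_cov` is illustrative.

```r
# Covariance list: contains cov, center and n.obs
cw <- cov.wt(stackloss)

# Only covmat is supplied, so there are no data to compute scores from
fit_cov <- princomp(covmat = cw)
is.null(fit_cov$scores)
fit_cov$n.obs

# The standard deviations agree with a fit on the data themselves
all.equal(fit_cov$sdev, princomp(stackloss)$sdev)
```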

*Documentation reproduced from package stats, version 3.2.2, License: Part of R 3.2.2*