# prcomp

##### Principal Components Analysis

Performs a principal components analysis on the given data matrix
and returns the results as an object of class `prcomp`.

- Keywords
- multivariate

##### Usage

```
prcomp(x, ...)

## S3 method for class 'formula':
prcomp(formula, data = NULL, subset, na.action, ...)

## S3 method for class 'default':
prcomp(x, retx = TRUE, center = TRUE, scale. = FALSE,
       tol = NULL, ...)

## S3 method for class 'prcomp':
predict(object, newdata, ...)
```

##### Arguments

- formula
- a formula with no response variable, referring only to numeric variables.
- data
- an optional data frame (or similar: see `model.frame`) containing the variables in the formula `formula`. By default the variables are taken from `environment(formula)`.
- subset
- an optional vector used to select rows (observations) of the data matrix `x`.
- na.action
- a function which indicates what should happen when the data contain `NA`s. The default is set by the `na.action` setting of `options`, and is `na.fail` if that is unset. The 'factory-fresh' default is `na.omit`.
- ...
- arguments passed to or from other methods. If `x` is a formula one might specify `scale.` or `tol`.
- x
- a numeric or complex matrix (or data frame) which provides the data for the principal components analysis.
- retx
- a logical value indicating whether the rotated variables should be returned.
- center
- a logical value indicating whether the variables should be shifted to be zero centered. Alternatively, a vector of length equal to the number of columns of `x` can be supplied. The value is passed to `scale`.
- scale.
- a logical value indicating whether the variables should be scaled to have unit variance before the analysis takes place. The default is `FALSE` for consistency with S, but in general scaling is advisable. Alternatively, a vector of length equal to the number of columns of `x` can be supplied. The value is passed to `scale`.
- tol
- a value indicating the magnitude below which components should be omitted. (Components are omitted if their standard deviations are less than or equal to `tol` times the standard deviation of the first component.) With the default null setting, no components are omitted. Other settings for `tol` could be `tol = 0` or `tol = sqrt(.Machine$double.eps)`, which would omit essentially constant components.
- object
- an object of class inheriting from `"prcomp"`.
- newdata
- an optional data frame or matrix in which to look for variables with which to predict. If omitted, the scores are used. If the original fit used a formula or a data frame or a matrix with column names, `newdata` must contain columns with the same names. Otherwise it must contain the same number of columns, to be used in the same order.
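As a rough illustration of how `center`, `scale.`, `tol` and `newdata` interact, the sketch below fits on part of `USArrests` and projects the remaining rows. The 40/10 row split and the names `train`, `test`, `manual` and `pca_tol` are assumptions made for this example only, and `tol = 0.5` is an arbitrarily large value chosen so that some components are dropped.

```r
# Illustrative split: fit on the first 40 rows, hold out the last 10.
train <- USArrests[1:40, ]
test  <- USArrests[41:50, ]

pca <- prcomp(train, center = TRUE, scale. = TRUE)

# predict() looks up columns of newdata by name and returns their scores.
scores_new <- predict(pca, newdata = test)

# The same projection written out by hand: center and scale the new rows
# with the values stored from the fit, then multiply by the loadings.
manual <- scale(test, center = pca$center, scale = pca$scale) %*% pca$rotation
all.equal(scores_new, manual, check.attributes = FALSE)

# With tol set, components whose standard deviation is at most
# tol * (standard deviation of the first component) are left out of rotation.
pca_tol <- prcomp(train, scale. = TRUE, tol = 0.5)
ncol(pca_tol$rotation)
```

Scaling the held-out rows with `pca$center` and `pca$scale`, rather than with their own means and standard deviations, keeps the new scores on the same axes as the fitted ones.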

##### Details

The calculation is done by a singular value decomposition of the
(centered and possibly scaled) data matrix, not by using `eigen`
on the covariance matrix. This is generally the preferred method
for numerical accuracy. The `print` method for these objects prints
the results in a nice format and the `plot` method produces a scree plot.
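The two routes can be compared directly. The following is a minimal sketch, not the internal implementation: it recomputes the component standard deviations once from `svd` of the centered data and once from `eigen` on the covariance matrix. The names `X`, `sdev_svd` and `sdev_eig` are introduced here for illustration.

```r
# Center the data by hand (no scaling), matching prcomp's defaults.
X <- scale(as.matrix(USArrests), center = TRUE, scale = FALSE)
pca <- prcomp(USArrests)

# Route 1: singular values of the centered data, rescaled by sqrt(N - 1).
sdev_svd <- svd(X)$d / sqrt(nrow(X) - 1)

# Route 2: square roots of the eigenvalues of the covariance matrix.
sdev_eig <- sqrt(eigen(cov(USArrests), symmetric = TRUE)$values)

all.equal(pca$sdev, sdev_svd)
all.equal(pca$sdev, sdev_eig)
```

On well-conditioned data such as this the two agree to numerical tolerance. The accuracy advantage of the SVD route matters mainly when some components have very small variance.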

Unlike `princomp`, variances are computed with the usual
divisor $N - 1$.
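The effect of the divisor can be checked numerically. Assuming `princomp` uses divisor $N$, its standard deviations should differ from those of `prcomp` by a factor of $\sqrt{N / (N - 1)}$. The variable names below are illustrative.

```r
n <- nrow(USArrests)

sd_prcomp   <- prcomp(USArrests)$sdev            # divisor N - 1
sd_princomp <- unname(princomp(USArrests)$sdev)  # divisor N

# Rescaling the princomp values by sqrt(N / (N - 1)) recovers the prcomp values.
all.equal(sd_prcomp, sd_princomp * sqrt(n / (n - 1)))
```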

Note that `scale = TRUE` cannot be used if there are zero or
constant (for `center = TRUE`) variables.
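A small sketch of this restriction, using a made-up data frame `d` with a constant column `k`. Dropping zero-variance columns before the fit is one possible workaround and is an assumption of this example, not something the documentation prescribes.

```r
# Toy data (hypothetical): column k is constant.
d <- data.frame(a = c(1, 2, 3, 4), b = c(2, 1, 4, 3), k = 5)

# Scaling a constant column to unit variance is impossible, so this fails;
# tryCatch captures the error message instead of stopping the script.
res <- tryCatch(prcomp(d, scale. = TRUE),
                error = function(e) conditionMessage(e))
is.character(res)

# One workaround: keep only columns with non-zero variance, then fit.
keep <- vapply(d, function(col) var(col) > 0, logical(1))
pca <- prcomp(d[, keep], scale. = TRUE)
```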

##### Value

`prcomp` returns a list with class `"prcomp"` containing the following components:

- sdev
- the standard deviations of the principal components (i.e., the square roots of the eigenvalues of the covariance/correlation matrix, though the calculation is actually done with the singular values of the data matrix).
- rotation
- the matrix of variable loadings (i.e., a matrix whose columns contain the eigenvectors). The function `princomp` returns this in the element `loadings`.
- x
- if `retx` is true, the value of the rotated data (the centered (and scaled if requested) data multiplied by the `rotation` matrix) is returned. Hence, `cov(x)` is the diagonal matrix `diag(sdev^2)`. For the formula method, `napredict()` is applied to handle the treatment of values omitted by the `na.action`.
- center, scale
- the centering and scaling used, or `FALSE`.
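The relationships among these components can be verified on a fitted object. The sketch below rebuilds `x` from `center`, `scale` and `rotation`, and checks that the covariance of the scores is diagonal. `scores` and `prop_var` are names introduced for this example.

```r
pca <- prcomp(USArrests, scale. = TRUE)

# x is the centered and scaled data multiplied by the rotation matrix.
scores <- scale(USArrests, center = pca$center, scale = pca$scale) %*% pca$rotation
all.equal(unname(scores), unname(pca$x))

# The scores are uncorrelated, with variances sdev^2 on the diagonal.
all.equal(unname(cov(pca$x)), diag(pca$sdev^2))

# Proportion of total variance carried by each component.
prop_var <- pca$sdev^2 / sum(pca$sdev^2)
```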

##### Note

The signs of the columns of the rotation matrix are arbitrary, and so may differ between different programs for PCA, and even between different builds of R.
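When results need to be compared across programs or builds, one option is to impose a sign convention after fitting. The helper `fix_signs` below is hypothetical and not part of the stats package. It flips each component so that its largest-magnitude loading is positive, and flips the matching column of scores so the two stay consistent.

```r
# Hypothetical helper: make the largest-magnitude loading of each
# component positive, flipping the corresponding scores as well.
fix_signs <- function(pca) {
  s <- apply(pca$rotation, 2, function(v) sign(v[which.max(abs(v))]))
  pca$rotation <- sweep(pca$rotation, 2, s, `*`)
  if (!is.null(pca$x)) pca$x <- sweep(pca$x, 2, s, `*`)
  pca
}

pca <- fix_signs(prcomp(USArrests, scale. = TRUE))
```

Flipping a column of `rotation` together with the same column of `x` leaves `sdev` and the reconstruction of the data unchanged, so any such convention is purely cosmetic.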

##### concept

PCA

##### References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988)
*The New S Language*.
Wadsworth & Brooks/Cole.

Mardia, K. V., J. T. Kent, and J. M. Bibby (1979)
*Multivariate Analysis*, London: Academic Press.

Venables, W. N. and B. D. Ripley (2002)
*Modern Applied Statistics with S*, Springer-Verlag.

##### Examples

`library(stats)`

```
## signs are random
require(graphics)
## the variances of the variables in the
## USArrests data vary by orders of magnitude, so scaling is appropriate
prcomp(USArrests) # inappropriate
prcomp(USArrests, scale = TRUE)
prcomp(~ Murder + Assault + Rape, data = USArrests, scale = TRUE)
plot(prcomp(USArrests))
summary(prcomp(USArrests, scale = TRUE))
biplot(prcomp(USArrests, scale = TRUE))
```

*Documentation reproduced from package stats, version 3.3, License: Part of R 3.3*