# factanal

##### Factor Analysis

Perform maximum-likelihood factor analysis on a covariance matrix or data matrix.

- Keywords
- multivariate

##### Usage

```
factanal(x, factors, data = NULL, covmat = NULL, n.obs = NA,
         subset, na.action, start = NULL,
         scores = c("none", "regression", "Bartlett"),
         rotation = "varimax", control = NULL, ...)
```

##### Arguments

- x
- A formula or a numeric matrix or an object that can be coerced to a numeric matrix.
- factors
- The number of factors to be fitted.
- data
- An optional data frame (or similar: see `model.frame`), used only if `x` is a formula. By default the variables are taken from `environment(formula)`.
- covmat
- A covariance matrix, or a covariance list as returned by `cov.wt`. Correlation matrices are also accepted, since they are covariance matrices.
- n.obs
- The number of observations, used if `covmat` is a covariance matrix.
- subset
- A specification of the cases to be used, if `x` is used as a matrix or formula.
- na.action
- The `na.action` to be used if `x` is used as a formula.
- start
- `NULL` or a matrix of starting values, each column giving an initial set of uniquenesses.
- scores
- Type of scores to produce, if any. The default is none; `"regression"` gives Thomson's scores and `"Bartlett"` gives Bartlett's weighted least-squares scores. Partial matching allows these names to be abbreviated.
- rotation
- character. `"none"` or the name of a function to be used to rotate the factors: it will be called with first argument the loadings matrix, and should return a list with component `loadings` giving the rotated loadings, or just the rotated loadings.
- control
- A list of control values. Components referred to elsewhere on this page include `nstart` (the number of starting values tried when `start = NULL`) and `lower` (the lower bound for the uniquenesses during optimization, default 0.005).
- ...
- Components of `control` can also be supplied as named arguments to `factanal`.

##### Details

The factor analysis model is
$$x = \Lambda f + e$$
for a $p$-element vector $x$, a $p \times k$
matrix $\Lambda$ of *loadings*, a $k$-element vector
$f$ of *scores* and a $p$-element vector $e$ of
errors. None of the components other than $x$ is observed, but
the major restriction is that the scores be uncorrelated and of unit
variance, and that the errors be independent with variances
$\Psi$, the *uniquenesses*. It is also common to
scale the observed variables to unit variance, and this is done in this function.

Thus factor analysis is in essence a model for the correlation matrix
of $x$,
$$\Sigma = \Lambda\Lambda^\prime + \Psi$$
There is still some indeterminacy in the model, for it is unchanged
if $\Lambda$ is replaced by $\Lambda G$ for
any $k \times k$ orthogonal matrix $G$, since
$\Lambda G G^\prime \Lambda^\prime = \Lambda\Lambda^\prime$.
Such matrices $G$ are known as
*rotations* (although the term is applied also to non-orthogonal
invertible matrices).
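The rotation indeterminacy can be checked numerically. The following is a minimal sketch with made-up loadings (the values in `Lambda` and the angle `theta` are arbitrary illustrations, not taken from any data set); it applies a $k \times k$ orthogonal matrix on the right of the loadings and confirms that $\Lambda\Lambda^\prime$ is unchanged.

```r
# Hypothetical loadings for p = 4 variables and k = 2 factors (made-up numbers)
Lambda <- matrix(c(0.8, 0.7, 0.1, 0.2,
                   0.1, 0.2, 0.9, 0.6), nrow = 4, ncol = 2)

# A k x k rotation through an arbitrary angle
theta <- pi / 6
G <- matrix(c(cos(theta), sin(theta),
              -sin(theta), cos(theta)), nrow = 2)

# Rotated loadings
rotated <- Lambda %*% G

# The common part Lambda Lambda' of the correlation matrix should be unchanged
same_fit <- isTRUE(all.equal(Lambda %*% t(Lambda), rotated %*% t(rotated)))
print(same_fit)
```

Because the fit depends on the loadings only through $\Lambda\Lambda^\prime$, the data cannot distinguish `Lambda` from `rotated`; this is why a rotation criterion such as varimax is applied after fitting.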

If `covmat` is supplied it is used. Otherwise `x` is used
if it is a matrix, or a formula `x` is used with `data` to
construct a model matrix, and that is used to construct a covariance
matrix. (It makes no sense for the formula to have a response, and
all the variables must be numeric.) Once a covariance matrix is found
or calculated from `x`, it is converted to a correlation matrix
for analysis. The correlation matrix is returned as component
`correlation` of the result.
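As a small illustration of the `covmat` route, the sketch below fits a model from the `ability.cov` covariance list (listed under See Also) and compares the returned `correlation` component with `cov2cor()` applied to the supplied covariance matrix. The choice of two factors is only for illustration.

```r
# ability.cov ships with R: a covariance list with components cov, center, n.obs
fit <- factanal(factors = 2, covmat = ability.cov)

# The supplied covariance matrix is converted to a correlation matrix
# before fitting, and returned as the `correlation` component
agree <- all.equal(fit$correlation, cov2cor(ability.cov$cov),
                   check.attributes = FALSE)
print(agree)

# The number of observations is taken from the covariance list
print(fit$n.obs)
```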

The fit is done by optimizing the log likelihood assuming multivariate
normality over the uniquenesses. (The maximizing loadings for given
uniquenesses can be found analytically: see Lawley & Maxwell, 1971.)
All the starting values supplied in `start` are tried
in turn and the best fit obtained is used. If `start = NULL`
then the first fit is started at the value suggested by
Jöreskog (1963), and then `control$nstart - 1` other values are
tried, randomly selected as equal values of the uniquenesses.
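A minimal sketch of the two ways of controlling starting values, using the `ability.cov` data. The particular starting uniquenesses (0.5 and 0.2) and `nstart = 3` are arbitrary choices for illustration.

```r
# Each column of `start` is one initial set of uniquenesses
# (ability.cov has 6 variables, so the matrix needs 6 rows)
start <- cbind(rep(0.5, 6), rep(0.2, 6))
fit_start <- factanal(factors = 2, covmat = ability.cov, start = start)

# With start = NULL, ask for additional randomly chosen starting values
set.seed(1)  # the extra starting values are random
fit_multi <- factanal(factors = 2, covmat = ability.cov,
                      control = list(nstart = 3))

# Compare the value of the criterion reached by each fit
print(fit_start$criteria["objective"])
print(fit_multi$criteria["objective"])
```

If the two objective values differ noticeably, the optimizer has probably found different local optima, which is a reason to try more starting values.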

The uniquenesses are technically constrained to lie in $[0, 1]$,
but near-zero values are problematic, and the optimization is
done with a lower bound of `control$lower`, default 0.005
(Lawley & Maxwell, 1971).
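The effect of the lower bound can be inspected on a fitted object. This sketch reuses the small demonstration data from the Examples section and checks which uniquenesses, if any, end up at the default bound of 0.005; the tolerance `1e-6` is an arbitrary choice for the comparison.

```r
# Demonstration data from the Examples section of this page
v1 <- c(1,1,1,1,1,1,1,1,1,1,3,3,3,3,3,4,5,6)
v2 <- c(1,2,1,1,1,1,2,1,2,1,3,4,3,3,3,4,6,5)
v3 <- c(3,3,3,3,3,1,1,1,1,1,1,1,1,1,1,5,4,6)
v4 <- c(3,3,4,3,3,1,1,2,1,1,1,1,2,1,1,5,6,4)
v5 <- c(1,1,1,1,1,3,3,3,3,3,1,1,1,1,1,6,4,5)
v6 <- c(1,1,1,2,1,3,3,3,4,3,1,1,1,2,1,6,5,4)
m1 <- cbind(v1, v2, v3, v4, v5, v6)

fit <- factanal(m1, factors = 3)
print(round(fit$uniquenesses, 3))

# Variables whose uniqueness sits (numerically) at the default lower bound
at_bound <- fit$uniquenesses < 0.005 + 1e-6
print(names(fit$uniquenesses)[at_bound])
```

Uniquenesses at the bound indicate that the unconstrained optimum would have pushed them towards zero, so the corresponding estimates should be interpreted with care.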

Scores can only be produced if a data matrix is supplied and used. The first method is the regression method of Thomson (1951), the second the weighted least squares method of Bartlett (1937, 1938). Both are estimates of the unobserved scores $f$. Thomson's method regresses (in the population) the unknown $f$ on $x$ to yield
$$\hat f = \Lambda^\prime \Sigma^{-1} x$$
and then substitutes the sample estimates of the quantities on the right-hand side. Bartlett's method minimizes the sum of squares of standardized errors over the choice of $f$, given (the fitted) $\Lambda$.
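The two score formulas can be written out by hand. The sketch below is an illustration under stated assumptions, not a description of the internal code: it assumes that the standardized data and the sample correlation matrix are the "sample estimates" substituted into Thomson's formula, and it uses `rotation = "none"` so that no rotation attributes are involved. The names `thomson`, `bartlett`, and `W` are introduced here for illustration only.

```r
# Demonstration data from the Examples section of this page
v1 <- c(1,1,1,1,1,1,1,1,1,1,3,3,3,3,3,4,5,6)
v2 <- c(1,2,1,1,1,1,2,1,2,1,3,4,3,3,3,4,6,5)
v3 <- c(3,3,3,3,3,1,1,1,1,1,1,1,1,1,1,5,4,6)
v4 <- c(3,3,4,3,3,1,1,2,1,1,1,1,2,1,1,5,6,4)
v5 <- c(1,1,1,1,1,3,3,3,3,3,1,1,1,1,1,6,4,5)
v6 <- c(1,1,1,2,1,3,3,3,4,3,1,1,1,2,1,6,5,4)
m1 <- cbind(v1, v2, v3, v4, v5, v6)

fit <- factanal(m1, factors = 3, scores = "regression", rotation = "none")

Lambda <- unclass(fit$loadings)  # p x k matrix of fitted loadings
z <- scale(m1)                   # observed variables scaled to unit variance
R <- fit$correlation             # sample correlation matrix

# Thomson (regression) scores: row i is Lambda' R^{-1} z_i
thomson <- z %*% solve(R, Lambda)

# Bartlett scores: minimize (z - Lambda f)' Psi^{-1} (z - Lambda f) over f,
# giving f = (Lambda' Psi^{-1} Lambda)^{-1} Lambda' Psi^{-1} z
W <- t(Lambda / fit$uniquenesses)           # Lambda' Psi^{-1}, a k x p matrix
bartlett <- t(solve(W %*% Lambda, W %*% t(z)))

# Compare the hand-computed regression scores with those returned by factanal
print(all.equal(thomson, fit$scores, check.attributes = FALSE))
```

Both sets of scores have one row per case and one column per factor; they generally differ, since the regression scores are shrunk towards zero relative to the Bartlett scores.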

If `x` is a formula then the standard `NA`-handling is
applied to the scores (if requested): see `napredict`.

The `print` method (documented under `loadings`)
follows the factor analysis convention of drawing attention to the
patterns of the results, so the default precision is three decimal
places, and small loadings are suppressed.

##### Value

An object of class `"factanal"` with components

- loadings
- A matrix of loadings, one column for each factor. The factors are ordered in decreasing order of sums of squares of loadings, and given the sign that will make the sum of the loadings positive. This is of class `"loadings"`: see `loadings` for its `print` method.
- uniquenesses
- The uniquenesses computed.
- correlation
- The correlation matrix used.
- criteria
- The results of the optimization: the value of the criterion (a linear function of the negative log-likelihood) and information on the iterations used.
- factors
- The argument `factors`.
- dof
- The number of degrees of freedom of the factor analysis model.
- method
- The method: always `"mle"`.
- rotmat
- The rotation matrix if relevant.
- scores
- If requested, a matrix of scores. `napredict` is applied to handle the treatment of values omitted by the `na.action`.
- n.obs
- The number of observations if available, or `NA`.
- call
- The matched call.
- na.action
- If relevant.
- STATISTIC, PVAL
- The significance-test statistic and P value, if it can be computed.
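A short sketch of how the components listed above might be inspected, again using `ability.cov` with two factors (an arbitrary choice). The last step rebuilds the model-implied correlation matrix $\Lambda\Lambda^\prime + \Psi$ from the fitted components; the names `Lambda` and `implied` are introduced here for illustration.

```r
fit <- factanal(factors = 2, covmat = ability.cov)

# Loadings have class "loadings"; its print method suppresses small values
print(fit$loadings, cutoff = 0.3)

print(fit$uniquenesses)
print(fit$dof)        # degrees of freedom of the model
print(fit$STATISTIC)  # significance-test statistic
print(fit$PVAL)       # and its P value

# Model-implied correlation matrix: Lambda Lambda' + Psi
Lambda <- unclass(fit$loadings)
implied <- Lambda %*% t(Lambda) + diag(fit$uniquenesses)

# Largest absolute discrepancy from the correlation matrix that was analysed
print(max(abs(implied - fit$correlation)))
```

The discrepancy in the last line gives a rough, informal sense of fit that complements the formal test in `STATISTIC` and `PVAL`.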

##### Note

There are so many variations on factor analysis that it is hard to
compare output from different programs. Further, the optimization in
maximum likelihood factor analysis is hard, and many other examples we
compared had less good fits than produced by this function. In
particular, solutions which are

##### References

Bartlett, M. S. (1937) The statistical conception of mental factors.
*British Journal of Psychology*, **28**, 97–104.

Bartlett, M. S. (1938) Methods of estimating mental
factors. *Nature*, **141**, 609–610.

Jöreskog, K. G. (1963) *Statistical Estimation in Factor Analysis.* Almqvist and Wicksell.

Lawley, D. N. and Maxwell, A. E. (1971) *Factor Analysis as a
Statistical Method.* Second edition. Butterworths.

Thomson, G. H. (1951) *The Factorial Analysis of Human Ability.*
London University Press.

##### See Also

`loadings` (which explains some details of the `print` method),
`varimax`, `princomp`, `ability.cov`, `Harman23.cor`, `Harman74.cor`.

Other rotation methods are available in various contributed packages.

##### Examples

`library(stats)`

```
# A little demonstration, v2 is just v1 with noise,
# and same for v4 vs. v3 and v6 vs. v5
# Last four cases are there to add noise
# and introduce a positive manifold (g factor)
v1 <- c(1,1,1,1,1,1,1,1,1,1,3,3,3,3,3,4,5,6)
v2 <- c(1,2,1,1,1,1,2,1,2,1,3,4,3,3,3,4,6,5)
v3 <- c(3,3,3,3,3,1,1,1,1,1,1,1,1,1,1,5,4,6)
v4 <- c(3,3,4,3,3,1,1,2,1,1,1,1,2,1,1,5,6,4)
v5 <- c(1,1,1,1,1,3,3,3,3,3,1,1,1,1,1,6,4,5)
v6 <- c(1,1,1,2,1,3,3,3,4,3,1,1,1,2,1,6,5,4)
m1 <- cbind(v1,v2,v3,v4,v5,v6)
cor(m1)
factanal(m1, factors = 3) # varimax is the default
factanal(m1, factors = 3, rotation = "promax")
# The following shows the g factor as PC1
prcomp(m1) # signs may depend on platform
## formula interface
factanal(~v1+v2+v3+v4+v5+v6, factors = 3,
         scores = "Bartlett")$scores
## a realistic example from Bartholomew (1987, pp. 61-65)
utils::example(ability.cov)
```

*Documentation reproduced from package stats, version 3.3, License: Part of R 3.3*