Factanal: Estimate Common Factor Analysis Models

Description

This function estimates models for semi-exploratory factor analysis (SEFA), exploratory factor analysis (EFA), and confirmatory factor analysis (CFA) using a genetic algorithm.

Usage

Factanal(x, factors, data = NULL, covmat = NULL, n.obs = NA, subset, 
na.action, scores = "none", seeds = 12345, lower = sqrt(.Machine$double.eps), 
model = c("SEFA", "EFA", "CFA"), method = c("MLE", "YWLS"), 
restrictions, fixed, criteria = NULL, robust.covmat = FALSE, ...)

Arguments

x

A formula, a numeric matrix, or an object that can be coerced to a numeric matrix. This argument is required if covmat = NULL or if robust.covmat = TRUE, and it is always recommended if the raw data are available.

factors

The number (>0) of factors to be fitted, which differs from the argument in factanal in that factors can be a numeric vector of length two to indicate the number of factors to ext

data

An optional data frame (or similar: see model.frame), used only if x is a formula. By default the variables are taken from environment(formula).

covmat

A covariance matrix, or a covariance list as returned by cov.wt or similar. If the covariance matrix is really a correlation matrix, it is not (yet) possible to (accurately) calculate some measu

n.obs

The number of observations, used if covmat is a covariance matrix. It is possible to obtain point estimates without knowing the number of observations, but it is not possible to calculate measures of uncertainty.

subset

A specification of the cases to be used, if x is a matrix or formula.

na.action

The na.action to be used if x is a formula.

scores

Type of scores to produce, if any. The default is "none". Other valid choices (which can be abbreviated) are "regression", "Bartlett", "Thurstone", "Ledermann", "Anders

seeds

A vector of length one or two to be used as the random number generator seeds corresponding to the unif.seed and int.seed arguments to genoud respectively. If

lower

A lower bound. In exploratory factor analysis using the fitting function in factanal, this argument corresponds to the 'lower' element of the list specified for control

model

A character string, one of "SEFA", "EFA", or "CFA", indicating whether a semi-exploratory, an exploratory, or a confirmatory factor analysis model should be estimated. Defaults to "SEFA".

method

A character string indicating "MLE" or "YWLS" to
     indicate how the model should be estimated. Defaults to "MLE".
     The "YWLS" option uses Yates' (1987) weighted-least squares criterion
     as opposed to most of the weighted-least squa

restrictions

An optional object of class "restrictions". It is
     almost always best to leave this argument unspecified to allow
     Factanal to prompt for restrictions with its pop-up menus. 
     This argument is primarily intended f

fixed

An optional matrix or list of two matrices that specifies
     the values of certain coefficients, which would be utilized most often
     in confirmatory factor analysis and is inappropriate for exploratory
     factor analysis. However,

criteria

An optional list whose elements should be functions or 
     character strings that name functions to be used as criteria during
     the lexical optimization when model != "EFA". It is almost always
     best to leave this a

robust.covmat

A logical indicating whether a minimum covariance
     determinant estimator of the sample covariance matrix should be used. If
     TRUE, this option requires that either robustbase or
     MASS be installed; see

...

Further arguments that are passed to genoud.
     Note that several of the default arguments to genoud
     are silently overridden by Fact

Value

An object of formal S4 class "FA" or, in the case of two-level models, an object of formal S4 class "FA.general" or "FA.2ndorder".

Warning

Yates' (1987, p. 229) weighted least squares criterion has never received much scrutiny. The criterion (but not Yates' algorithm) is included in FAiR so that it can be fully evaluated. However, it is not scale invariant, it does not lend itself to calculating standard errors or test statistics, and in limited testing it seems prone to finding a solution that is geared more toward minimizing the weights than minimizing the squared residuals.

Details

SEFA, EFA, and CFA models all estimate the same population model but impose different restrictions on it. If restrictions is unspecified, Factanal will create an object that inherits from class "restrictions" based on the responses the user gives to the pop-up menus. The vignette provides a step-by-step guide to navigating the pop-up menus; execute vignette("FAiR") to read it. Factanal will then impose different restrictions on the model, depending on the inherited class of this object.

The CFA model is perhaps the most straightforward in the sense that the user specifies that certain coefficients are pegged to particular values, and Factanal estimates the values of the free parameters. These restrictions can be specified via the fixed argument or left unspecified, in which case Factanal will prompt you to specify the restrictions via a pop-up menu. Factanal relies on the theorem in Howe (1955) to overcome rotational indeterminacy. Namely, the factors are scaled to have unit variance and at least $factors - 1$ coefficients per factor are pegged to zero such that a technical rank condition is satisfied. This mechanism for eliminating rotational indeterminacy is somewhat more limited than the options that are available in some other software packages for factor analysis but easily generalizes to SEFA models.
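
The counting part of this condition can be checked mechanically. The following is a minimal sketch in base R and is not part of FAiR: the pattern matrix and the placement of its zeros are hypothetical, and NA is assumed to mark a free coefficient, following the convention of the fixed matrix in the Examples. The sketch does not verify the rank condition.

```r
# Hypothetical 6 x 2 pattern of restrictions: 0 marks a coefficient pegged to
# zero, NA marks a free coefficient (assumed convention, as in the Examples).
pattern <- matrix(NA_real_, nrow = 6, ncol = 2)
pattern[1, 2] <- 0  # first variable does not load on the second factor
pattern[6, 1] <- 0  # sixth variable does not load on the first factor

factors <- ncol(pattern)

# Count the exact zeros in each column (factor); NA cells count as free
zeros_per_factor <- colSums(!is.na(pattern) & pattern == 0)

# Necessary counting condition only; the rank condition is NOT checked here
enough_zeros <- all(zeros_per_factor >= factors - 1)

print(zeros_per_factor)
print(enough_zeros)
```

A pattern that fails this count cannot satisfy the theorem, whereas a pattern that passes it still has to meet the rank condition.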

A SEFA model differs from a CFA model in that the analyst specifies how many coefficients per factor take the value of zero, and the algorithm estimates the locations of these zeros along with the values of the corresponding free coefficients. It is also possible to estimate a mixed SEFA model where some coefficients are fixed to zero (or another number) a priori and the locations of the remaining zeros are estimated. A SEFA model requires that the Howe (1955) theorem be satisfied and that at least one additional restriction be imposed. SEFA models are new to the literature, and more information about them can be found in Goodrich (2008).

An EFA model specifies an arbitrary set of restrictions that are minimally necessary to extract factors, after which a transformation of the factors should be obtained using Rotate. By default, the fitting function used by Factanal to estimate an EFA model is the same as that used in factanal. However, there is an alternative choice that estimates an EFA model via a CFA algorithm with the upper triangle of the coefficient matrix filled with zeros. The results, when method = "MLE", should be the same (up to a transformation of the factors) but in practice can differ if there are optimization failures. The default algorithm is considered more reliable at this point, but the alternative algorithm must be used if method = "YWLS". Which of these two algorithms is used depends on the class of restrictions, which in the usual case that it is unspecified will result in a pop-up menu asking which algorithm to use.
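
To make the alternative EFA parameterization concrete, the following base R sketch builds a coefficient pattern whose upper triangle is filled with zeros. It is only an illustration under stated assumptions: the object names are hypothetical, NA is assumed to mark a free coefficient, and nothing here calls FAiR itself.

```r
vars    <- 6L  # number of observed variables (as in ability.cov)
factors <- 2L  # number of factors to extract

# NA marks a free coefficient; 0 marks a coefficient fixed to zero
beta <- matrix(NA_real_, nrow = vars, ncol = factors)
beta[upper.tri(beta)] <- 0  # cells with row index < column index

# Number of coefficients fixed to zero by this pattern
n_zeros <- sum(!is.na(beta))

print(beta)
print(n_zeros)
```

For a coefficient matrix with at least as many rows as columns, this pattern fixes factors * (factors - 1) / 2 coefficients, which is one cell in the two-factor case.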

It is not necessary to provide starting values for the parameters, since there are methods for that purpose; see S4GenericsFAiR. However, a matrix of starting values can be passed through the dots to genoud. This matrix should have as many rows as the pop.size argument in genoud and as many columns as the number of free parameters in the model, which corresponds to the nvars argument in genoud. The order of the parameters / columns proceeds from the top of the model to the bottom as follows. First come the cells that comprise the upper triangle of the factor intercorrelation matrix at level two (if there is more than one second-order factor). Next come the free cells of the coefficient matrix at level two in row-major order. Then come the free cells that comprise the upper triangle of the factor intercorrelation matrix at level one (if a second-order model is not estimated). Next come the free cells of the coefficient matrix at level one in row-major order. Finally come the diagonal cells of the uniqueness matrix. Note that a parameter is free unless it is fixed a priori, which is to say that in SEFA models coefficients are considered free even if there is a possibility that the algorithm will bind them to zero at the optimum, rendering them not free for the purpose of counting degrees of freedom.
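
The column layout described above can be sketched for a single-level model with two correlated factors and six observed variables. This is a hedged illustration in base R rather than FAiR's own procedure: pop_size is a hypothetical value for genoud's pop.size, the random draws are placeholders rather than sensible starting values, and the sampling bounds are borrowed from the Domains used in the SEFA example in the Examples section.

```r
factors  <- 2L  # first-order factors
vars     <- 6L  # observed variables
pop_size <- 5L  # hypothetical value for genoud's pop.size

# Free-parameter counts in the documented order (no second level here)
n_phi   <- factors * (factors - 1L) / 2L  # upper triangle of factor intercorrelations
n_beta  <- vars * factors                 # coefficients, none fixed a priori
n_theta <- vars                           # diagonal of the uniqueness matrix
nvars   <- n_phi + n_beta + n_theta       # total number of free parameters

# One row per population member, one column per free parameter
set.seed(12345)
starting.values <- cbind(
  matrix(runif(pop_size * n_phi,   -1.0, 1.0), nrow = pop_size),
  matrix(runif(pop_size * n_beta,  -1.5, 1.5), nrow = pop_size),
  matrix(runif(pop_size * n_theta,  0.0, 1.0), nrow = pop_size)
)

print(nvars)
print(dim(starting.values))
```

The count of 19 columns agrees with the length of starts2 in the Examples section, where a matrix of this kind is supplied through the starting.values argument.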

References

Bartholomew, D. J. and Knott, M. (1999) Latent Variable Models and Factor Analysis. Second Edition, Arnold.

Beauducel, A. (2007) In spite of indeterminacy, many common factor score estimates yield an identical reproduced covariance matrix. Psychometrika, 72, 437--441.

Goodrich, B. (2008) SEFAiR So Far. Unpublished manuscript linked at http://wiki.r-project.org/rwiki/doku.php?id=packages:cran:fair#to_paper_s_about_the_ideas_in_fair.

Smith, G. A. and Stanley, G. (1983) Clocking $g$: relating intelligence and measures of timed performance. Intelligence, 7, 353--368.

Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth Edition, Springer.

Yates, A. (1987) Multivariate Exploratory Data Analysis: A Perspective on Exploratory Factor Analysis. State University of New York Press.

See Also

Rotate and factanal

Examples

## Example from Venables and Ripley (2002, p. 323)
## Previously from Bartholomew and Knott (1999, p. 68--72)
## Originally from Smith and Stanley (1983)

data(ability.cov)
print(ability.cov)

if(TRUE){ # NOTE: One would usually not bother with this block. It just makes the
          # example go quickly and without user intervention on the pop-up menus.
starts1 <- c(0.4551693481819578, 
             0.5893203083906567, 
             0.2182044732474321, 
             0.7694294930481663,
             0.0526383747875095, 
             0.3334323600411430)
starts1 <- matrix(starts1, nrow = 1)

example1 <- new("restrictions.factanal", factors = 2L, nvars = 6L,
                Domains = cbind(sqrt(.Machine$double.eps), rep(1, 6)),
                model = "EFA", method = "MLE", dof = 4L, fast = FALSE)
}

# 'restrictions' and 'starting.values' would typically be left unspecified!
efa <- Factanal(covmat = ability.cov, factors = 2, model = "EFA",
                restrictions = example1, starting.values = starts1)
show(efa)
summary(efa)

# 'criteria' would typically be left unspecified!
efa.rotated <- Rotate(efa, criteria = list("phi"))
summary(efa.rotated)

if(TRUE){ # NOTE: One would usually not bother with this block. It just makes the
          # example go quickly and without user intervention on the pop-up menus.
starts2 <- c(4.46294498156615e-01,
             4.67036349420035e-01,
             6.42220238211291e-01,
             8.88564379236454e-01,
             4.77779639176941e-01,
            -7.13405536379741e-02,
            -9.47782525342137e-08,
             4.04993872375487e-01,
            -1.04604290549591e-08,
            -9.44950629176182e-03,
             2.63078925240678e-04,
             9.38038168787216e-01,
             8.43618801925473e-01,
             4.49024212016027e-01,
             5.87550265675745e-01,
             2.17850254355888e-01,
             7.71724777627142e-01,
             1.20084009542348e-01,
             2.88308011310065e-01)

starts2 <- matrix(starts2, nrow = 1)

Domains <- cbind(-1, 1)
Domains <- rbind(Domains, cbind(-1.5, rep(1.5, 12)))
Domains <- rbind(Domains, cbind(0, rep(1, 6)))
fixed   <- matrix(NA_real_, nrow = 6, ncol = 2)
fix_beta_args <- as.list(formals(FAiR:::FAiR_fix_coefficients))
fix_beta_args$zeros <- c(2,2)
beta_select <- c(FALSE, rep(TRUE, length(fixed)), rep(FALSE, nrow(fixed)))
beta_list <- list(beta = fixed, free = c(is.na(fixed)),
                  num_free = length(fixed), select = beta_select,
                  fix_beta_args = fix_beta_args)
Theta2_list <- list(Theta2 = diag(nrow(fixed)), 
                    select = c(rep(FALSE, length(fixed) + 1),
                               rep(TRUE, nrow(fixed))))
Phi <- diag(c(0.5, 0.5))
example2 <- new("restrictions.1storder", factors = c(2L, 0L),
                Domains = Domains, nvars = nrow(Domains), 
                model = "SEFA", method = "MLE", dof = 6L,
                Phi = Phi, beta = beta_list, Theta2 = Theta2_list,
                criteria = list(llik = FAiR:::FAiR_criterion_llik))
}

# 'restrictions' and 'starting.values' would typically be left unspecified!
sefa <- Factanal(covmat = ability.cov, factors = 2, model = "SEFA",
                 restrictions = example2, starting.values = starts2)
show(sefa)
summary(sefa)

stuff <- list() # output list for various methods, also works on efa and efa.rotated
stuff$model.matrix <- model.matrix(sefa) # sample correlation matrix
stuff$fitted <- fitted(sefa) # reproduced correlation with communalities on diagonal
stuff$residuals <- residuals(sefa) # difference between model.matrix and fitted
stuff$rstandard <- rstandard(sefa) # residual matrix rescaled to a correlation matrix
stuff$weights <- weights(sefa) # (scaled) approximate weights for residuals
stuff$influence <- influence(sefa) # weights * residuals
stuff$logLik <- logLik(sefa) # log-likelihood
stuff$BIC <- BIC(sefa) # BIC
stuff$profile <- profile(sefa) # profile plots of non-free parameters
plot(sefa)  # advanced Scree plot
pairs(sefa) # Thurstone-style plot