A function to simulate factor loadings matrices and Monte Carlo data sets for common factor models, bifactor models, and IRT models.
simFA(
  Model = list(),
  Loadings = list(),
  CrossLoadings = list(),
  Phi = list(),
  ModelError = list(),
  Bifactor = list(),
  MonteCarlo = list(),
  FactorScores = list(),
  Missing = list(),
  Control = list(),
  Seed = NULL
)
loadings A common factor or bifactor
           loadings matrix.
Phi A factor correlation matrix.
urloadings The unrotated loadings matrix.
h2 A vector of item communalities.
h2PopME A vector of item communalities that
           may include model approximation error.
Rpop The model-implied population correlation
           matrix.
RpopME The model-implied population
           correlation matrix with model error.
W The factor loadings for the minor factors
           (when ModelError = TRUE). Default = NULL.
Xm That part of the observed scores that
           is due to the minor common factors.
SFSvars  Variances of the Specific Factors
           in the metric of the observed scores.
ModelErrorFitStats A list of model fit
           indices (for the underlying equations, see: Bentler,
           1990; Hu & Bentler, 1999; Marsh, Hau, & Grayson,
           2005; Steiger, 2016):
SRMR_theta Standardized Root Mean
                   Square Residual based on the model that is
                   implied  by the error free major factors
                   only (underlying Rpop),
SRMR_thetahat  Standardized Root
                   Mean Square Residual based on an exploratory
                   factor analysis of the population
                   correlation matrix, RpopME,
CRMR_theta  Correlation Root Mean
                   Square Residual based on the model that is
                   implied  by the error free major factors
                   only (underlying Rpop),
CRMR_thetahat Correlation Root Mean
                   Square Residual  based on an exploratory factor
                   analysis of the population correlation matrix,
                   RpopME,
RMSEA_theta Root Mean Square Error
                   of Approximation (Steiger, 2016) based on the
                   model that is implied  by the error free major
                   factors only (underlying Rpop),
RMSEA_thetahat Root Mean Square
                   Error of Approximation (Steiger, 2016) based
                   on an exploratory factor analysis of the
                   population correlation matrix, RpopME,
CFI_theta  Comparative Fit Index
                   (Bentler, 1990) based on the model that is
                   implied  by the error free major factors
                   only (underlying Rpop),
CFI_thetahat Comparative Fit Index
                   (Bentler, 1990)  based on an exploratory
                   factor analysis of the population
                   correlation matrix, RpopME.
Fm MLE fit function for population
                   target model.
Fb MLE fit function for population
                   baseline model.
DFm Degrees of freedom for
                   population target model.
CovMatrices A list containing:
CovMajor The model implied
                   covariances from the major factors.
CovMinor The model implied
                   covariances from the minor factors.
CovUnique The model implied
                   variances from the uniqueness factors.
Bifactor A list containing:
loadingsHier Factor loadings of the
                   1st order solution of a hierarchical
                   bifactor model.
PhiHier Factor correlations of the
                   1st order solution of a hierarchical bifactor
                   model.
Scores A list containing:
FactorScores Factor scores for the
                   common and uniqueness factors.
FacInd Factor indeterminacy indices
                   for the error free population model.
FacIndME Factor score indeterminacy
                   indices for the population model with model
                   error.
ObservedScores A matrix of model
                   implied observed scores. If
                   Thresholds were supplied under the
                   FactorScores keyword, the observed
                   scores will be transformed
                   into Likert scores.
Monte A list containing output from the
           Monte Carlo simulations if generated.
IRT Factor loadings expressed in the normal
           ogive IRT metric. If Thresholds were given
           then IRT difficulty values will also be returned.
Seed The initial seed for the random
           number generator.
call A copy of the function call.
cn A list of all active and nonactive
           function arguments.
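The relationship among loadings, Phi, h2, and Rpop follows the standard common factor model. The base R sketch below uses made-up illustrative values (not simFA output) and is not simFA's internal code; it only shows how the communalities and the model-implied correlation matrix are obtained from a loadings matrix and a factor correlation matrix.

```r
# Illustrative values only; these are not simFA output.
Lambda <- matrix(0, nrow = 6, ncol = 2)
Lambda[1:3, 1] <- c(.7, .6, .5)
Lambda[4:6, 2] <- c(.7, .6, .5)
Phi <- matrix(c(1, .3, .3, 1), nrow = 2)

# Common variance-covariance: Lambda %*% Phi %*% t(Lambda)
common <- Lambda %*% Phi %*% t(Lambda)

# Communalities are the diagonal of the common part
h2 <- diag(common)

# Model-implied correlation matrix: common part with a unit diagonal
Rpop <- common
diag(Rpop) <- 1
round(Rpop, 3)
```

In this sketch the uniqueness for each item is 1 - h2, which is what restores the unit diagonal.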
Model (list)
NFac (scalar) Number of common or group
          factors; defaults to NFac = 3.
NItemPerFac
(scalar) All factors have the same number of primary loadings.
(vector) A vector of length NFac
              specifying the number of primary loadings for
              each factor; defaults to
              NItemPerFac = 3.
Model (character) "orthogonal" or
      "oblique"; defaults to Model = "orthogonal".
Loadings (list)
FacPattern (NULL or matrix).
FacPattern = M where M is
                a user-defined factor pattern matrix.
FacPattern = NULL; simFA
                will generate a factor pattern based on
                the arguments specified  under other keywords
                (e.g., Model, CrossLoadings, etc.);
                defaults to FacPattern = NULL.
FacLoadDist  (character) Specifies the
        sampling distribution for the common factor loadings.
        Possible values are "runif", "rnorm",
        "sequential", and "fixed"; defaults
        to FacLoadDist = "runif".
FacLoadRange (vector of length NFac,
        2, or 1); defaults to FacLoadRange = c(.3, .7).
If FacLoadDist = "runif" the vector
                defines the bounds of the uniform distribution;
If FacLoadDist = "rnorm" the vector
                defines the mean and standard deviation of
                the normal distribution from which loadings
                are sampled.
If FacLoadDist = "sequential" the
                vector specifies the lower and upper bound
                of the loadings sequence.
If FacLoadDist = "fixed" and
                FacLoadRange is a vector of length 1
                then all common loadings will equal the constant
                specified in FacLoadRange. If
                FacLoadDist = "fixed" and
                FacLoadRange is a vector of length
                NFac then each factor will have fixed
                loadings as specified by the associated
                element in FacLoadRange.
h2 (vector) An optional vector of communalities
        used to constrain the population communalities to
        user-defined values; defaults to h2 = NULL.
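As a rough illustration of how FacLoadDist and FacLoadRange shape the primary loadings, the base R sketch below builds a simple-structure pattern. The helper make_pattern is hypothetical (it is not part of simFA), and the handling of "sequential" via seq() is an assumption about what a loadings sequence between the two bounds might look like.

```r
# Hypothetical helper (not part of simFA): simple-structure pattern matrix.
make_pattern <- function(NFac = 3, NItemPerFac = 3,
                         dist = c("runif", "sequential", "fixed"),
                         range = c(.3, .7)) {
  dist <- match.arg(dist)
  L <- matrix(0, nrow = NFac * NItemPerFac, ncol = NFac)
  for (k in seq_len(NFac)) {
    rows <- (k - 1) * NItemPerFac + seq_len(NItemPerFac)
    L[rows, k] <- switch(dist,
      runif      = runif(NItemPerFac, min = range[1], max = range[2]),
      sequential = seq(range[1], range[2], length.out = NItemPerFac),
      fixed      = rep(range[1], NItemPerFac))
  }
  L
}

set.seed(1)
L <- make_pattern(dist = "runif", range = c(.3, .7))
round(L, 2)
```

Each factor receives NItemPerFac primary loadings and all other elements stay at zero.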
CrossLoadings (list)
ProbCrossLoad (scalar) A value in the [0, 1]
        interval that determines the probability that a cross
        loading will be present in elements of the loadings
        matrix that do not have salient (primary) factor loadings.
        If set to ProbCrossLoad = 1, a single cross
        loading will be added to each factor; defaults to
        ProbCrossLoad = 0.
CrossLoadRange (vector of length 2) Controls
        size of the cross loadings; defaults to
        CrossLoadRange = c(.20, .25).
CrossLoadPositions (matrix) Specifies the
        row and column positions of (optional) cross loadings;
        defaults to CrossLoadPositions = NULL.
CrossLoadValues (vector) If
        CrossLoadPositions is specified then
        CrossLoadValues is a vector of user-supplied
        cross-loadings; defaults to CrossLoadValues = NULL.
CrudFactor (scalar) Controls the size of
        tertiary factor loadings. If CrudFactor != 0
        then elements of the loadings matrix with neither
        primary nor secondary (i.e., cross) loadings will
        be sampled from a uniform distribution on
        [-CrudFactor, CrudFactor]; defaults to CrudFactor = 0.
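The effect of CrudFactor can be pictured with a few lines of base R. This is a sketch under the description given here (uniform draws for elements with no primary or cross loading), not simFA's own code; the pattern matrix is made up for illustration.

```r
set.seed(1)

# Illustrative simple-structure pattern (not simFA output)
L <- matrix(0, nrow = 6, ncol = 2)
L[1:3, 1] <- .6
L[4:6, 2] <- .6

CrudFactor <- .10

# Fill every element that has no primary or cross loading
empty <- L == 0
L[empty] <- runif(sum(empty), min = -CrudFactor, max = CrudFactor)
round(L, 2)
```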
Phi (list)
MaxAbsPhi (scalar) Upper (absolute) bound
        on factor correlations; defaults to
        MaxAbsPhi = .5.
EigenValPower (scalar) Controls the skewness
        of the eigenvalues of Phi. Larger values of
        EigenValPower result in a Phi spectrum that
        is more right-skewed (and thus closer to a
        unidimensional model); defaults to
        EigenValPower = 2.
PhiType (character); defaults to
        PhiType = "free".
If PhiType = "free" factor correlations
                will be randomly generated under the constraints
                of MaxAbsPhi and EigenValPower.
If PhiType = "fixed" all factor
                correlations will equal the value specified
                in MaxAbsPhi. A fatal error will be
                produced if Phi is not positive
                semidefinite.
If PhiType = "user" the factor
                correlations are defined by the matrix
                specified in UserPhi (see below).
UserPhi (matrix) A positive semidefinite
        (PSD) matrix of user-defined factor correlations;
        defaults to UserPhi = NULL.
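Because PhiType = "fixed" and PhiType = "user" both require a positive semidefinite matrix, it can help to check a candidate Phi before passing it in. The base R sketch below uses two hypothetical helpers, fixed_phi and is_psd, which are not part of simFA.

```r
# Hypothetical helpers (not part of simFA)
fixed_phi <- function(NFac, r) {
  Phi <- matrix(r, NFac, NFac)
  diag(Phi) <- 1
  Phi
}

is_psd <- function(M, tol = 1e-8) {
  ev <- eigen(M, symmetric = TRUE, only.values = TRUE)$values
  all(ev >= -tol)
}

ok  <- is_psd(fixed_phi(3, .5))   # eigenvalues are 2, .5, .5
bad <- is_psd(fixed_phi(3, -.6))  # smallest eigenvalue is 1 + 2 * (-.6) = -.2
c(ok = ok, bad = bad)
```

An equicorrelation matrix with NFac factors is positive semidefinite only when the common correlation is at least -1 / (NFac - 1), so large negative values fail quickly as NFac grows.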
ModelError (list)
ModelError (logical) If ModelError = TRUE
        model error will be introduced into the factor
        pattern via the method described by Tucker, Koopman,
        and Linn (TKL, 1969); defaults to
        ModelError = FALSE.
W (matrix) An optional user-supplied factor
        loading matrix for the NMinorFac minor common
        factors; defaults to W = NULL.
NMinorFac (scalar) Number of minor factors
        in the TKL model; defaults to NMinorFac = 150.
ModelErrorType (character) If
        ModelErrorType = "U" then ModelErrorVar
        is the proportion of uniqueness variance that is due
        to model error. If ModelErrorType = "V" then
        ModelErrorVar is the proportion of total
        variance that is due to model error; defaults to
        ModelErrorType = "U".
ModelErrorVar (scalar in [0, 1]) The proportion
        of uniqueness (U) or total (V) variance that is due
        to model error; defaults to
        ModelErrorVar = .10.
epsTKL (scalar in [0, 1]) Controls the size
        of the factor loadings in successive minor factors;
        defaults to epsTKL = .20.
Wattempts (scalar > 0)  Maximum number of
        tries when attempting to generate a suitable W
        matrix. Default = 10000.
WmaxLoading (scalar > 0) Threshold value for
        NWmaxLoading. Default  WmaxLoading = .30.
NWmaxLoading (scalar >= 0) Maximum number
        of absolute loadings >= WmaxLoading in any
        column of W (matrix of model approximation error
        factor loadings). Default NWmaxLoading = 2.
        Under the defaults, no column of W will have 3 or
        more loadings with absolute value >= .30.
PrintW (logical) If PrintW = TRUE
        then simFA will print the attempt history when
        searching for a suitable W matrix given the
        constraints defined in WmaxLoading and
        NWmaxLoading. Default PrintW = FALSE.
RSpecific (matrix) Optional correlation
        matrix for specific factors;
        defaults to RSpecific = NULL.
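The difference between ModelErrorType = "U" and ModelErrorType = "V" is easiest to see with numbers. The base R arithmetic below assumes standardized items (total variance of 1) and made-up communalities; it illustrates the two definitions and is not taken from simFA's source.

```r
# Illustrative communalities (not simFA output)
h2 <- c(.25, .49, .64)
ModelErrorVar <- .10

# Type "U": proportion of the uniqueness variance (1 - h2)
me_U <- ModelErrorVar * (1 - h2)

# Type "V": proportion of the total variance (1 for standardized items)
me_V <- rep(ModelErrorVar * 1, length(h2))

rbind(U = me_U, V = me_V)
```

Under type "U" the model error variance shrinks as communality rises, whereas under type "V" it is the same for every item.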
Bifactor (list)
Bifactor (logical) If Bifactor = TRUE
        parameters for the bifactor model will be generated;
        defaults to Bifactor = FALSE.
Hierarchical (logical) If Hierarchical = TRUE
        then a hierarchical Schmid-Leiman (Schmid and Leiman, 1957)
        bifactor model will be generated;
        defaults to Hierarchical = FALSE.
F1FactorDist (character) Specifies the
        sampling distribution for the general factor loadings.
        Possible values are "runif", "rnorm",
        "sequential", and "fixed"; defaults
        to F1FactorDist = "sequential".
F1FactorRange (vector of length 1 or 2)
        Controls the sizes of the general factor loadings in
        non-hierarchical bifactor models; defaults to
        F1FactorRange = c(.4, .7).
If F1FactorDist = "runif", the vector
                of length 2 defines the bounds of the uniform
                distribution, c(lower, upper);
If F1FactorDist = "rnorm", the
                vector defines the mean and standard
                deviation of the normal distribution from
                which loadings are sampled, c(MN, SD).
If F1FactorDist = "sequential",
                the vector specifies the lower and upper
                bound of the loadings sequence, c(lower, upper).
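For readers unfamiliar with the hierarchical case, the textbook Schmid-Leiman transformation can be sketched in base R as follows. The first-order loadings L1 and second-order loadings g are made-up values, and this is the standard transformation rather than a copy of simFA's internal code.

```r
# Illustrative first-order solution (not simFA output)
L1 <- matrix(0, nrow = 6, ncol = 2)
L1[1:3, 1] <- .7
L1[4:6, 2] <- .6

# Loadings of the two first-order factors on one general factor
g <- c(.8, .6)

# First-order factor correlations implied by the general factor
PhiHier <- tcrossprod(g)
diag(PhiHier) <- 1

# Schmid-Leiman: general loadings and residualized group loadings
general <- L1 %*% g
group   <- L1 %*% diag(sqrt(1 - g^2))
SL <- cbind(general, group)
round(SL, 3)
```

The orthogonal bifactor solution SL reproduces the same common covariance as the oblique first-order solution, that is, SL %*% t(SL) equals L1 %*% PhiHier %*% t(L1).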
MonteCarlo (list)
NSamples (integer) Defines number of Monte
        Carlo Samples; defaults to NSamples = 0.
SampleSize (integer) Sample size for each
        Monte Carlo sample; defaults to SampleSize = 250.
Raw (logical) If Raw = TRUE, simulated
        data sets will contain raw data. If Raw = FALSE,
        simulated data sets will contain correlation matrices;
        defaults to Raw = FALSE.
Thresholds (list) List elements contain
        thresholds for each item. Thresholds are required
        when generating Likert variables.
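To make the roles of SampleSize and Raw concrete, the base R sketch below draws one sample from a population correlation matrix and shows both forms of output. It assumes multivariate normal data and a Cholesky-based generator, which are assumptions of this illustration and may differ from what simFA does internally.

```r
set.seed(1)

# Illustrative population correlation matrix (not simFA output)
Rpop <- matrix(.3, nrow = 4, ncol = 4)
diag(Rpop) <- 1
SampleSize <- 250

# chol() returns upper-triangular U with t(U) %*% U = Rpop,
# so the rows of Z %*% U have population covariance Rpop
Z <- matrix(rnorm(SampleSize * ncol(Rpop)), nrow = SampleSize)
X <- Z %*% chol(Rpop)   # analogous to Raw = TRUE
Rsamp <- cor(X)         # analogous to Raw = FALSE
round(Rsamp, 2)
```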
FactorScores (list)
FS (logical) If FS = TRUE
        factor scores will be simulated; defaults to
        FS = FALSE.
CFSeed (integer) Optional starting seed for
        the common factor scores; defaults to
        CFSeed = NULL in which case a random seed is
         used.
MCFSeed (integer) Optional starting seed
        for the minor common factor scores; defaults to
        MCFSeed = NULL.
SFSeed (integer) Optional starting seed
        for the specific factor scores; defaults to
        SFSeed = NULL in which case a random seed is
        used.
EFSeed (integer) Optional starting seed
        for the error factor scores; defaults to
        EFSeed = NULL in which case a random seed
        is used. Note that CFSeed, MCFSeed,
        SFSeed, and EFSeed must be different
        numbers (a fatal error is produced when two or more
        seeds are specified as equal).
VarRel (vector) A vector of manifest variable
        reliabilities. The specific factor variance for
        variable i will equal VarRel[i] - h2[i]
        (the manifest variable reliability minus its
        communality). By default, VarRel = h2
        (resulting in uniformly zero specific factor
        variances).
Population (logical) If Population =
        TRUE, factor scores will fit the correlational
        constraints of the factor model exactly (e.g., the
        common factors will be orthogonal to the unique
        factors); defaults to Population = FALSE.
NFacScores (scalar) Sample size for the
        factor scores; defaults to NFacScores = 250.
Thresholds (list) A list of quantiles used
        to polychotomize the observed data that will be
        generated from the factor scores.
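Two of the FactorScores options lend themselves to a short numeric illustration: the specific factor variance implied by VarRel, and the use of Thresholds to polychotomize continuous scores. The base R sketch below uses made-up values and findInterval() as one plausible way to cut scores into categories; it is not simFA's implementation.

```r
set.seed(1)

# Specific factor variance = reliability minus communality
h2     <- c(.50, .40)
VarRel <- c(.70, .40)
specific_var <- VarRel - h2   # second item has VarRel = h2, so zero

# Polychotomize continuous scores with four thresholds (five categories)
x <- rnorm(1000)
Thresholds <- c(-1.5, -.5, .5, 1.5)
likert <- findInterval(x, Thresholds) + 1
table(likert)
```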
Missing (list)
Missing (logical) If Missing = TRUE all
        data sets will contain missing values; defaults to
        Missing = FALSE.
Mechanism (character) Specifies the missing
        data mechanism. Currently, the program only supports
        missing completely at random (MCAR):
        Mechanism = "MCAR".
MSProb (scalar or vector of length
        NVar) Specifies the probability of
        missingness for each variable; defaults to
        MSProb = 0.
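Under MCAR, each value is deleted with a probability that does not depend on the data. The base R sketch below applies a per-variable missingness probability to a made-up data matrix; it illustrates the idea behind MSProb and is not simFA's own code.

```r
set.seed(1)

# Illustrative raw data: 200 cases, 3 variables
X <- matrix(rnorm(200 * 3), nrow = 200, ncol = 3)

# One missingness probability per variable
MSProb <- c(0, .10, .30)

# Independent Bernoulli draws per cell; column j uses MSProb[j]
mask <- sapply(MSProb, function(p) runif(nrow(X)) < p)
X[mask] <- NA

# Observed proportion missing in each column
colMeans(is.na(X))
```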
Control (list)
IRT (logical) If IRT = TRUE then
        user-supplied thresholds will be interpreted as
        item intercepts; defaults to IRT = FALSE.
Dparam (scalar).  If Dparam = 1 then item
        intercepts should be scaled in the logistic metric.
        If Dparam = 1.702 then intercepts should be
        scaled in the probit metric.
Maxh2 (scalar) Rows of the loadings matrix
        will be rescaled to have a maximum communality of
        Maxh2; defaults to Maxh2 = .98.
Reflect (logical) If Reflect =
        TRUE loadings on the common factors will be
        randomly reflected; defaults to
        Reflect = FALSE.
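One way to picture the Maxh2 rescaling is shown below in base R. The sketch assumes orthogonal factors (so that a communality is the row sum of squared loadings) and rescales only rows that exceed the bound; whether simFA uses exactly this rule is an assumption of the illustration.

```r
Maxh2 <- .98

# Illustrative loadings; the first row has a communality above 1
L <- rbind(c(.9, .5),
           c(.6, .3))

# Communalities under orthogonal factors
h2 <- rowSums(L^2)

# Shrink only the rows whose communality exceeds Maxh2
scale <- ifelse(h2 > Maxh2, sqrt(Maxh2 / h2), 1)
L_new <- L * scale   # scale[i] multiplies row i
rowSums(L_new^2)
```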
Seed (integer) Starting seed for the random number
generator; defaults to Seed = NULL. When no seed
is specified by the user, the program will generate a random
seed.
Niels G. Waller with contributions by Hoang V. Nguyen
For a complete description of simFA's
capabilities, users are encouraged to consult the simFABook
at http://users.cla.umn.edu/~nwaller/simFA/simFABook.pdf.
simFA is a program for exploring factor analysis
models via simulation studies.
After calling simFA, all relevant output can be saved
for further processing by referencing one or more of the
returned object names.
Bentler, P. M. (1990). Comparative fit indexes in structural models. Psychological Bulletin, 107(2), 238--246.
Hu, L.-T., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling: A Multidisciplinary Journal, 6(1), 1--55.
Marsh, H. W., Hau, K.-T., & Grayson, D. (2005). Goodness of fit in structural equation models. In A. Maydeu-Olivares & J. J. McArdle (Eds.), Multivariate applications book series. Contemporary psychometrics: A festschrift for Roderick P. McDonald (pp. 275--340). Lawrence Erlbaum Associates Publishers.
Schmid, J., & Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1), 53--61.
Steiger, J. H. (2016). Notes on the Steiger–Lind (1980) handout. Structural Equation Modeling: A Multidisciplinary Journal, 23(6), 777--781.
Tucker, L. R., Koopman, R. F., & Linn, R. L. (1969). Evaluation of factor analytic research procedures by means of simulated correlation matrices. Psychometrika, 34(4), 421--459.
## Not run:
#  Ex 1. Three Factor Simple Structure Model with no Cross loadings and
#  Ideal (zero) Non salient Loadings (all defaults)
   out <-  simFA(Seed = 1)
   print( round( out$loadings, 2 ) )
# Ex 2. Non Hierarchical bifactor model 3 group factors
# with constant loadings on the general factor
   out <- simFA(Bifactor = list(Bifactor = TRUE,
                                Hierarchical = FALSE,
                                F1FactorRange = c(.4, .4),
                                F1FactorDist = "runif"),
                Seed = 1)
   print( round( out$loadings, 2 ) )
# Ex 3. Model Fit Statistics for Population Data with
# Model Approximation Error. Three Factor model.
   out <- simFA(Loadings = list(FacLoadDist = "fixed",
                                FacLoadRange = .5),
                ModelError = list(ModelError = TRUE,
                                  NMinorFac = 150,
                                  ModelErrorType = "V",
                                  ModelErrorVar = .1,
                                  Wattempts = 10000,
                                  epsTKL = .2),
                Seed = 1)
   print( out$loadings )
   print( out$ModelErrorFitStats[seq(2,8,2)] )
## End(Not run)