fungible (version 1.91)

simFA: Generate Factor Analysis Models and Data Sets for Simulation Studies

Description

A function to simulate factor loadings matrices and Monte Carlo data sets for common factor models and bifactor models.

Usage

simFA(Model = list(), Loadings = list(), CrossLoadings = list(),
  Phi = list(), ModelError = list(), Bifactor = list(),
  MonteCarlo = list(), FactorScores = list(), Missing = list(),
  Control = list(), Seed = NULL)

Arguments

Model

(list)

  • NFac (scalar) Number of common or group factors; defaults to NFac = 3.

  • NItemPerFac

    • (scalar) All factors have the same number of primary loadings.

    • (vector) A vector of length NFac specifying the number of primary loadings for each factor; defaults to NItemPerFac = 3.

  • Model (character) "orthogonal" or "oblique"; defaults to Model = "orthogonal".
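
As a rough illustration of how NFac and NItemPerFac fix the shape of a simple-structure pattern, the following base R sketch marks where the primary loadings would fall. The helper `primary_pattern` is hypothetical and is not part of fungible.

```r
# Hypothetical helper (not part of fungible): builds a 0/1 indicator matrix
# marking where primary loadings would fall in a simple-structure model.
primary_pattern <- function(NFac = 3, NItemPerFac = 3) {
  # A scalar NItemPerFac is recycled so every factor gets the same count.
  n_per <- rep(NItemPerFac, length.out = NFac)
  pattern <- matrix(0, nrow = sum(n_per), ncol = NFac)
  fac_of_item <- rep(seq_len(NFac), times = n_per)
  pattern[cbind(seq_along(fac_of_item), fac_of_item)] <- 1
  pattern
}

p_equal   <- primary_pattern(NFac = 3, NItemPerFac = 3)          # 9 x 3
p_unequal <- primary_pattern(NFac = 3, NItemPerFac = c(2, 3, 4)) # 9 x 3
```

With the scalar form every column holds three primary loadings; with the vector form the columns hold 2, 3, and 4.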

Loadings

(list)

  • FacPattern (NULL or matrix).

    • FacPattern = M where M is a user-defined factor pattern matrix.

    • FacPattern = NULL; simFA will generate a factor pattern based on the arguments specified under other keywords (e.g., Model, CrossLoadings); defaults to FacPattern = NULL.

  • FacLoadDist (character) Specifies the sampling distribution for the common factor loadings. Possible values are "runif", "rnorm", "sequential", and "fixed"; defaults to FacLoadDist = "runif".

  • FacLoadRange (vector of length NFac, 2, or 1); defaults to FacLoadRange = c(.3, .7).

    • If FacLoadDist = "runif" the vector defines the bounds of the uniform distribution;

    • If FacLoadDist = "rnorm" the vector defines the mean and standard deviation of the normal distribution from which loadings are sampled.

    • If FacLoadDist = "sequential" the vector specifies the lower and upper bound of the loadings sequence.

    • If FacLoadDist = "fixed" and FacLoadRange is a vector of length 1 then all common loadings will equal the constant specified in FacLoadRange. If FacLoadDist = "fixed" and FacLoadRange is a vector of length NFac then each factor will have fixed loadings as specified by the associated element in FacLoadRange.

  • h2 (vector) An optional vector of communalities used to constrain the population communalities to user-defined values; defaults to h2 = NULL.
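
The four FacLoadDist options can be pictured with a small base R sketch. The helper `draw_loadings` is hypothetical and is not fungible's internal code; in particular, the ascending, evenly spaced sequence used for "sequential" is an assumption.

```r
# Hypothetical helper (not fungible's internal code): draws n primary
# loadings for one factor under each documented FacLoadDist option.
draw_loadings <- function(n, FacLoadDist = "runif", FacLoadRange = c(.3, .7)) {
  switch(FacLoadDist,
    runif      = runif(n, min = FacLoadRange[1], max = FacLoadRange[2]),
    rnorm      = rnorm(n, mean = FacLoadRange[1], sd = FacLoadRange[2]),
    # Assumption: an evenly spaced, ascending sequence between the bounds.
    sequential = seq(FacLoadRange[1], FacLoadRange[2], length.out = n),
    fixed      = rep(FacLoadRange[1], n),
    stop("unknown FacLoadDist: ", FacLoadDist)
  )
}

set.seed(1)
u <- draw_loadings(4, "runif", c(.3, .7))       # four values within [.3, .7]
s <- draw_loadings(4, "sequential", c(.3, .6))  # .3, .4, .5, .6
f <- draw_loadings(4, "fixed", .5)              # .5 repeated four times
```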

CrossLoadings

(list)

  • ProbCrossLoad (scalar) A value in the [0, 1] interval that determines the probability that a cross loading will be present in elements of the loadings matrix that do not have salient (primary) factor loadings. If set to ProbCrossLoad = 1, a single cross loading will be added to each factor; defaults to ProbCrossLoad = 0.

  • CrossLoadRange (vector of length 2) Controls the size of the cross loadings; defaults to CrossLoadRange = c(.20, .25).

  • CrossLoadPositions (matrix) Specifies the row and column positions of (optional) cross-loadings; defaults to CrossLoadPositions = NULL.

  • CrossLoadValues (vector) If CrossLoadPositions is specified then CrossLoadValues is a vector of user-supplied cross-loadings; defaults to CrossLoadValues = NULL.

  • CrudFactor (scalar) Controls the size of tertiary factor loadings. If CrudFactor != 0 then elements of the loadings matrix with neither primary nor secondary (i.e., cross) loadings will be sampled from a [-(CrudFactor), (CrudFactor)] uniform distribution; defaults to CrudFactor = 0.
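
The sketch below shows, in base R, one way user-supplied cross loadings and CrudFactor loadings could be placed in a pattern matrix. It is an illustration only, not fungible's internal code; the two-column (row, column) layout used for CrossLoadPositions and all numeric values are assumptions.

```r
# Hypothetical illustration (not fungible's internal code).
set.seed(1)
L <- matrix(0, nrow = 6, ncol = 2)
L[1:3, 1] <- .6   # primary loadings, factor 1
L[4:6, 2] <- .6   # primary loadings, factor 2

# Assumed layout: one (row, column) pair per cross loading.
CrossLoadPositions <- matrix(c(1, 2,
                               4, 1), ncol = 2, byrow = TRUE)
CrossLoadValues <- c(.20, .25)
L[CrossLoadPositions] <- CrossLoadValues

# Cells with neither a primary nor a cross loading receive small
# "crud" loadings sampled from a uniform(-CrudFactor, CrudFactor).
CrudFactor <- .05
empty <- L == 0
L[empty] <- runif(sum(empty), -CrudFactor, CrudFactor)
```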

Phi

(list)

  • MaxAbsPhi (scalar) Upper (absolute) bound on factor correlations; defaults to MaxAbsPhi = .5.

  • EigenValPower (scalar) Controls the skewness of the eigenvalues of Phi. Larger values of EigenValPower result in a Phi spectrum that is more right-skewed (and thus closer to a unidimensional model); defaults to EigenValPower = 2.

  • PhiType (character); defaults to PhiType = "free".

    • If PhiType = "free" factor correlations will be randomly generated under the constraints of MaxAbsPhi and EigenValPower.

    • If PhiType = "fixed" all factor correlations will equal the value specified in MaxAbsPhi. A fatal error will be produced if Phi is not positive semidefinite.

    • If PhiType = "user" the factor correlations are defined by the matrix specified in UserPhi (see below).

  • UserPhi (matrix) A positive semidefinite (PSD) matrix of user-defined factor correlations; defaults to UserPhi = NULL.
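
The positive semidefinite requirement can be checked before a matrix is passed as UserPhi or before a common value is used with PhiType = "fixed". The helpers `fixed_phi` and `is_psd` below are hypothetical base R functions, not part of fungible. For a matrix with a common correlation r, the eigenvalues are 1 + (NFac - 1) r and 1 - r, so a sufficiently negative r fails the check.

```r
# Hypothetical helpers (not part of fungible).
fixed_phi <- function(NFac, r) {
  # Correlation matrix whose off-diagonal elements all equal r.
  Phi <- matrix(r, NFac, NFac)
  diag(Phi) <- 1
  Phi
}
is_psd <- function(M, tol = 1e-8) {
  # PSD if the smallest eigenvalue is nonnegative (up to tolerance).
  min(eigen(M, symmetric = TRUE, only.values = TRUE)$values) >= -tol
}

ok  <- is_psd(fixed_phi(3,  .5))   # eigenvalues 2, .5, .5
bad <- is_psd(fixed_phi(3, -.6))   # smallest eigenvalue 1 + 2 * (-.6) = -.2
```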

ModelError

(list)

  • ModelError (logical) If ModelError = TRUE model error will be introduced into the factor pattern via the method described by Tucker, Koopman, and Linn (TKL, 1969); defaults to ModelError = FALSE.

  • NMinorFac (scalar) Number of minor factors in the TKL model; defaults to NMinorFac = 150.

  • ModelErrorType (character) If ModelErrorType = "U" then ModelErrorVar is the proportion of uniqueness variance that is due to model error. If ModelErrorType = "V" then ModelErrorVar is the proportion of total variance that is due to model error; defaults to ModelErrorType = "U".

  • ModelErrorVar (scalar [0,1]) The proportion of uniqueness (U) or total (V) variance that is due to model error; defaults to ModelErrorVar = .10.

  • epsTKL (scalar [0,1]) Controls the size of the factor loadings in successive minor factors; defaults to epsTKL = .20.

  • RSpecific (matrix) Optional correlation matrix for specific factors; defaults to RSpecific = NULL.
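
The difference between ModelErrorType = "U" and ModelErrorType = "V" is easiest to see with a single item. The values below are illustrative, and the arithmetic assumes a standardized item whose uniqueness variance is 1 - h2.

```r
# Worked arithmetic for one item (illustrative values, not fungible output).
h2            <- .60          # communality from the major factors
uniqueness    <- 1 - h2       # .40, assuming a unit-variance item
ModelErrorVar <- .10

# ModelErrorType = "U": proportion of the uniqueness variance.
me_var_U <- ModelErrorVar * uniqueness   # .04
# ModelErrorType = "V": proportion of the total (unit) variance.
me_var_V <- ModelErrorVar * 1            # .10
```

The same ModelErrorVar therefore implies less model error under "U" than under "V" whenever the communality is above zero.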

Bifactor

(list)

  • Bifactor (logical) If Bifactor = TRUE parameters for the bifactor model will be generated; defaults to Bifactor = FALSE.

  • Hierarchical (logical) If Hierarchical = TRUE then a hierarchical Schmid-Leiman (1957) bifactor model will be generated; defaults to Hierarchical = FALSE.

  • F1FactorDist (character) Specifies the sampling distribution for the general factor loadings. Possible values are "runif", "rnorm", "sequential", and "fixed"; defaults to F1FactorDist = "sequential".

  • F1FactorRange (vector of length 1 or 2) Controls the sizes of the general factor loadings in nonhierarchical bifactor models; defaults to F1FactorRange = c(.4, .7).

    • If F1FactorDist = "runif", the vector of length 2 defines the bounds of the uniform distribution, c(lower, upper);

    • If F1FactorDist = "rnorm", the vector defines the mean and standard deviation of the normal distribution from which loadings are sampled, c(MN, SD).

    • If F1FactorDist = "sequential", the vector specifies the lower and upper bound of the loadings sequence, c(lower, upper).
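
For orientation, the standard Schmid-Leiman transformation behind a hierarchical bifactor model can be sketched in base R. This is the textbook transformation with illustrative values, not fungible's internal code; the names `L1`, `g`, and `B` are assumptions introduced for the example.

```r
# Standard Schmid-Leiman transformation; a sketch, not fungible's code.
L1 <- matrix(0, 6, 2)          # first-order pattern (illustrative values)
L1[1:3, 1] <- .7
L1[4:6, 2] <- .6
g  <- c(.8, .5)                # first-order factors' loadings on the general factor

general <- L1 %*% g                      # general-factor loadings
group   <- L1 %*% diag(sqrt(1 - g^2))    # residualized group-factor loadings
B       <- cbind(general, group)         # orthogonal bifactor loadings

PhiHier <- tcrossprod(g)                 # implied first-order factor correlations
diag(PhiHier) <- 1
```

The first-order solution (`L1`, `PhiHier`) and the bifactor solution (`B`) reproduce the same common variance, which is what makes the two representations interchangeable.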

MonteCarlo

(list)

  • NSamples (integer) Number of Monte Carlo samples to generate; defaults to NSamples = 0.

  • SampleSize (integer) Sample size for each Monte Carlo sample; defaults to SampleSize = 250.

  • Raw (logical) If Raw = TRUE, simulated data sets will contain raw data. If Raw = FALSE, simulated data sets will contain correlation matrices; defaults to Raw = FALSE.

  • Thresholds (list) List elements contain thresholds for each item. Thresholds are required when generating Likert variables.
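
A single Monte Carlo draw can be pictured as follows. The sketch uses base R only, assumes multivariate normal data, and is not fungible's internal code; the population matrix is illustrative.

```r
# Sketch of one Monte Carlo draw (not fungible's internal code).
set.seed(1)
Rpop <- matrix(.3, 4, 4)       # illustrative population correlation matrix
diag(Rpop) <- 1
SampleSize <- 250

# Independent normal deviates, then impose the population correlations.
Z <- matrix(rnorm(SampleSize * ncol(Rpop)), nrow = SampleSize)
X <- Z %*% chol(Rpop)          # raw data (what Raw = TRUE would store)
Rsamp <- cor(X)                # sample correlations (what Raw = FALSE would store)
```

Repeating the draw NSamples times gives the collection of data sets or correlation matrices described above.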

FactorScores

(list)

  • FS (logical) If FS = TRUE (true) factor scores will be simulated; defaults to FS = FALSE.

  • CFSeed (integer) Optional starting seed for the common factor scores; defaults to CFSeed = NULL in which case a random seed is used.

  • SFSeed (integer) Optional starting seed for the specific factor scores; defaults to SFSeed = NULL in which case a random seed is used.

  • EFSeed (integer) Optional starting seed for the error factor scores; defaults to EFSeed = NULL in which case a random seed is used. Note that CFSeed, SFSeed, and EFSeed must be different numbers (a fatal error is produced when two or more seeds are specified as equal).

  • VarRel (vector) A vector of manifest variable reliabilities. The specific factor variance for variable i will equal \(VarRel[i] - h^2[i]\) (the manifest variable reliability minus its communality). By default, \(VarRel = h^2\) (resulting in uniformly zero specific factor variances).

  • Population (logical) If Population = TRUE, factor scores will fit the correlational constraints of the factor model exactly (e.g., the common factors will be orthogonal to the unique factors); defaults to Population = FALSE.

  • NFacScores (scalar) Sample size for the factor scores; defaults to NFacScores = 250.

  • Thresholds (list) A list of quantiles used to polychotomize the observed data that will be generated from the factor scores.
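
Two of the ideas above lend themselves to a short base R illustration: the variance decomposition implied by VarRel, and the use of thresholds to turn continuous scores into ordered categories. All values are illustrative. The error variance of 1 - VarRel assumes unit-variance items, and treating the thresholds as cut points on the score scale is also an assumption.

```r
# Illustrative values (not fungible output).
h2     <- c(.50, .60)
VarRel <- c(.70, .60)
specific_var <- VarRel - h2     # .20 and 0
error_var    <- 1 - VarRel      # .30 and .40 (assumes unit-variance items)

# Thresholds cut a continuous score into ordered categories 1..K.
set.seed(1)
y <- rnorm(1000)
thresholds <- c(-1, 0, 1)
likert <- findInterval(y, thresholds) + 1   # integer categories 1 to 4
```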

Missing

(list)

  • Missing (logical) If Missing = TRUE all data sets will contain missing values; defaults to Missing = FALSE.

  • Mechanism (character) Specifies the missing data mechanism. Currently, the program only supports missing completely at random (MCAR): Mechanism = "MCAR".

  • MSProb (scalar or vector of length NVar) Specifies the probability of missingness for each variable; defaults to MSProb = 0.
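
Under MCAR, each value is deleted with a probability that does not depend on the data. A minimal base R sketch with one probability per variable follows; it is not fungible's internal code, and the data and probabilities are illustrative.

```r
# Sketch of MCAR deletion (not fungible's internal code).
set.seed(1)
X <- matrix(rnorm(500 * 3), nrow = 500)
MSProb <- c(0, .10, .30)       # one missingness probability per variable

# Each cell of column j is deleted independently with probability MSProb[j].
drop <- sapply(MSProb, function(p) runif(nrow(X)) < p)
X[drop] <- NA
miss_rate <- colMeans(is.na(X))   # observed proportion missing per variable
```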

Control

(list)

  • Maxh2 (scalar) Rows of the loadings matrix will be rescaled to have a maximum communality of Maxh2; defaults to Maxh2 = .98.

  • Reflect (logical) If Reflect = TRUE loadings on the common factors will be randomly reflected; defaults to Reflect = FALSE.
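
One plausible reading of the Maxh2 rescaling is sketched below in base R. The helper `cap_h2` is hypothetical and is not fungible's internal code; it assumes orthogonal factors, so that a row's communality is its sum of squared loadings.

```r
# Hypothetical helper (not fungible's internal code). Assumes orthogonal
# factors, so a row's communality is its sum of squared loadings.
cap_h2 <- function(L, Maxh2 = .98) {
  h2 <- rowSums(L^2)
  scale <- ifelse(h2 > Maxh2, sqrt(Maxh2 / h2), 1)
  L * scale            # multiplies row i by scale[i]
}

L <- rbind(c(.9, .6),   # h2 = 1.17, exceeds Maxh2
           c(.5, .3))   # h2 = .34, left unchanged
L_capped <- cap_h2(L)
```

Rows whose communality already lies below Maxh2 pass through untouched; only offending rows are shrunk proportionally.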

Seed

(integer) Starting seed for the random number generator; defaults to Seed = NULL. When no seed is specified by the user, the program will generate a random seed.

Value

  • loadings A common factor or bifactor loadings matrix.

  • Phi A factor correlation matrix.

  • urloadings The unrotated loadings matrix.

  • h2 A vector of item communalities.

  • h2PopME A vector of item communalities that may include model approximation error.

  • Rpop The model-implied population correlation matrix.

  • RpopME The model-implied population correlation matrix with model error.

  • CovMatrices A list containing:

    • CovMajor The model implied covariances from the major factors.

    • CovMinor The model implied covariances from the minor factors.

    • CovUnique The model implied variances from the uniqueness factors.

  • Bifactor A list containing:

    • loadingsHier Factor loadings of the 1st order solution of a hierarchical bifactor model.

    • PhiHier Factor correlations of the 1st order solution of a hierarchical bifactor model.

  • Scores A list containing:

    • FactorScores Factor scores for the common and uniqueness factors.

    • FacInd Factor score indeterminacy indices for the error-free population model.

    • FacIndME Factor score indeterminacy indices for the population model with model error.

    • ObservedScores A matrix of model-implied observed scores. If Thresholds were supplied under the FactorScores keyword, ObservedScores will be transformed into Likert scores.

  • Monte A list containing output from the Monte Carlo simulations if generated.

  • IRT Factor loadings expressed in the normal ogive IRT metric. If Thresholds were given then IRT difficulty values will also be returned.

  • Seed The initial seed for the random number generator.

  • call A copy of the function call.

  • cn A list of all active and nonactive function arguments.
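
The relationship among loadings, Phi, h2, and Rpop follows the usual common factor model, in which the model-implied correlation matrix is the common-factor part plus the uniquenesses on the diagonal. A base R sketch with illustrative values (not fungible output) is given here; the no-model-error case is assumed.

```r
# Illustrative values (not fungible output).
loadings <- matrix(0, 4, 2)
loadings[1:2, 1] <- c(.7, .6)
loadings[3:4, 2] <- c(.5, .8)
Phi <- matrix(c(1, .3,
                .3, 1), 2, 2)

common <- loadings %*% Phi %*% t(loadings)   # variance due to common factors
h2     <- diag(common)                       # communalities: .49 .36 .25 .64
Rpop   <- common
diag(Rpop) <- 1                              # adds the uniquenesses 1 - h2
```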

Details

simFA was specifically designed to simplify the process of running Monte Carlo studies of factor analysis models. Thus, simFA can save all relevant output for a user-specified model. Saved output can be accessed through the object names listed under Value.

References

Schmid, J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1), 53-61.

Tucker, L. R., Koopman, R. F., and Linn, R. L. (1969). Evaluation of factor analytic research procedures by means of simulated correlation matrices. Psychometrika, 34(4), 421-459.

Examples

library(fungible)

# Ex 1. Three-factor simple structure model using all default settings
# (ProbCrossLoad = 0 and CrudFactor = 0, so nonsalient loadings are zero)
out <- simFA(Seed = 1)
print(round(out$loadings, 2))

# Ex 2. Nonhierarchical bifactor model with 3 group factors
# and constant loadings on the general factor
out <- simFA(Bifactor = list(Bifactor = TRUE,
                             Hierarchical = FALSE,
                             F1FactorRange = c(.4, .4),
                             F1FactorDist = "runif"),
             Seed = 1)
print(round(out$loadings, 2))
