item.invar: Between-Group and Longitudinal Measurement Invariance Evaluation

Description

This function evaluates configural, (threshold), metric, scalar, and strict between-group or longitudinal (partial) measurement invariance using confirmatory factor analysis with continuous or ordered categorical indicators by calling the cfa function in the R package lavaan. For measurement models with ordered categorical indicators, the function uses the Wu and Estabrook (2016) approach to model identification and constraints. By default, the function evaluates configural, metric, and scalar measurement invariance for measurement models with continuous indicators, and configural, threshold, metric, scalar, and strict measurement invariance for measurement models with ordered categorical indicators, given at least four response categories for each indicator. The results are provided as a table with model fit information (i.e., chi-square test, fit indices based on a proper null model, and information criteria) and model comparison (i.e., chi-square difference test, change in fit indices, and change in information criteria). Additionally, variance-covariance coverage of the data, descriptive statistics, parameter estimates, modification indices, and the residual correlation matrix can be requested by specifying the argument print.

Usage

item.invar(data, ..., model = NULL, group = NULL, long = FALSE, ordered = FALSE,
           parameterization = c("delta", "theta"), rescov = NULL, rescov.long = TRUE,
           cluster = NULL, invar = c("config", "thres", "metric", "scalar", "strict"),
           partial = NULL, ident = c("marker", "var", "effect"),
           estimator = c("ML", "MLM", "MLMV", "MLMVS", "MLF", "MLR",
                         "GLS", "WLS", "DWLS", "WLSM", "WLSMV",
                         "ULS", "ULSM", "ULSMV", "DLS", "PML"),
           missing = c("listwise", "pairwise", "fiml", "two.stage",
                       "robust.two.stage", "doubly.robust"), null.model = TRUE,
           print = c("all", "summary", "partial", "coverage", "descript", "fit",
                     "est", "modind", "resid"),
           print.fit = c("all", "standard", "scaled", "robust"),
           mod.minval = 6.63, resid.minval = 0.1, lavaan.run = TRUE,
           digits = 3, p.digits = 3, as.na = NULL, write = NULL, append = TRUE,
           check = TRUE, output = TRUE)

Value

Returns an object of class misty.object, which is a list with the following entries:

call: function call
type: type of analysis
data: data frame including all variables used in the analysis, i.e., indicators for the factor, grouping variable and cluster variable
args: specification of function arguments
model: list with the specified model for the configural (config), threshold (thresh), metric (metric), scalar (scalar), and strict invariance model (strict)
model.fit: list with fitted lavaan object of the configural, metric, scalar, and strict invariance model
check: list with the results of the convergence and model identification check for the configural (config), threshold (thresh), metric (metric), scalar (scalar), and strict invariance model (strict)
result: list with result tables, i.e., summary for the summary of the specification, e.g., estimation method or missing data handling in lavaan, partial for the summary of the partial invariance specification, coverage for the variance-covariance coverage of the data, descript for a list with descriptive statistics (stat) and frequencies (freq), fit for a list with model fit based on standard, scaled, and robust fit indices, param for a list with parameter estimates for the configural, metric, scalar, and strict invariance model, modind for the list with modification indices for the configural, metric, scalar, and strict invariance model, score for the list with results of the score tests for constrained parameters for the threshold, metric, scalar, and strict invariance model, and resid for the list with residual correlation matrices and standardized residual means for the configural, threshold, metric, scalar, and strict invariance model

Arguments

data: a data frame. If model = NULL, confirmatory factor analysis based on a measurement model with one factor labeled f comprising all variables in the data frame specified in data is conducted to evaluate between-group measurement invariance for the grouping variable specified in the argument group. Longitudinal measurement invariance evaluation can only be conducted by specifying the model using the argument model. Note that the cluster variable is excluded from data when specifying cluster. If model is specified, the data frame needs to contain all variables used in the argument model and the cluster variable when specifying the name of the cluster variable in the argument cluster.
...: an expression indicating the variable names in data, e.g., item.invar(dat, x1, x2, x3, group = "group"). Note that the operators +, -, ~, :, ::, and ! can also be used to select variables, see 'Details' in the df.subset function.
model: a character vector specifying a measurement model with one factor, or a list of character vectors for specifying a measurement model with more than one factor for evaluating between-group measurement invariance when long = FALSE or a list of character vectors for specifying a measurement model with one factor for each time of measurement for evaluating longitudinal measurement invariance when specifying long = TRUE. For example, model = c("x1", "x2", "x3", "x4") for specifying a measurement model with one factor labeled f comprising four indicators, or model = list(factor1 = c("x1", "x2", "x3", "x4"), factor2 = c("x5", "x6", "x7", "x8")) for specifying a measurement model with two latent factors labeled factor1 and factor2 each comprising four indicators for evaluating between-group measurement invariance, or model = list(time1 = c("ax1", "ax2", "ax3", "ax4"), time2 = c("bx1", "bx2", "bx3", "bx4"), time3 = c("cx1", "cx2", "cx3", "cx4")) for specifying a longitudinal measurement model with three time points comprising four indicators at each time point. This function cannot evaluate longitudinal measurement invariance for a measurement model with more than one factor. Note that the name of each list element is used to label factors, i.e., all list elements need to be named, otherwise factors are labeled with "f1", "f2", "f3" when long = FALSE and with "t1", "t2", "t3" when long = TRUE and so on.
group: either a character string indicating the variable name of the grouping variable in the data frame specified in data or a vector representing the groups for conducting multiple-group analysis to evaluate between-group measurement invariance.
long: logical: if TRUE, longitudinal measurement invariance evaluation is conducted. The longitudinal measurement model is specified by using the argument model. Note that this function can only deal with a measurement model with one factor at each time point when investigating longitudinal measurement invariance. Moreover, this function can only evaluate either between-group or longitudinal measurement invariance, but not both at the same time.
ordered: logical: if TRUE, all indicator variables of the measurement model are treated as ordered categorical variables, i.e., measurement invariance evaluation utilizes the Wu and Estabrook (2016) approach to model identification and constraints for investigating measurement invariance. Note that all indicator variables need to have the same number of response categories, either two (binary), three (ternary), or more than three response categories. Accordingly, zero cell counts are not allowed, e.g., zero observations for a response category of an indicator within a group when investigating between-group measurement invariance or zero observations for a response category of an indicator at a time point when investigating longitudinal measurement invariance.
parameterization: a character string only used when treating indicators of the measurement model as ordered categorical (ordered = TRUE), i.e., "delta" (default) for delta parameterization or "theta" for theta parameterization.
rescov: a character vector or a list of character vectors for specifying residual covariances, e.g., rescov = c("x1", "x2") for specifying a residual covariance between items x1 and x2, or rescov = list(c("x1", "x2"), c("x3", "x4")) for specifying residual covariances between items x1 and x2, and items x3 and x4.
rescov.long: logical: if TRUE (default), residual covariances between parallel indicators are estimated across time when evaluating longitudinal measurement invariance (long = TRUE), i.e., residual variances of the same indicators that are measured at different time points are correlated across all possible time points. Note that residual covariances should be estimated even if the parameter estimates are statistically not significant since indicator-specific systematic variance is likely to correlate with itself over time (Little, 2013, p. 164).
cluster: either a character string indicating the variable name of the cluster variable in data, or a vector representing the nested grouping structure (i.e., group or cluster variable) for computing the scaled chi-square test statistic that takes into account non-independence of observations. Note that this option is not available when evaluating measurement invariance for ordered categorical indicators by specifying ordered = TRUE.
invar: a character string indicating the level of measurement invariance to be evaluated, i.e., config to evaluate configural measurement invariance (i.e., same factor structure across groups or time), thres to evaluate configural and threshold measurement invariance (i.e., equal item-specific threshold parameters across groups or time), metric to evaluate configural, threshold, and metric measurement invariance (i.e., equal factor loadings across groups or time), scalar (default when ordered = FALSE) to evaluate configural, threshold, metric, and scalar measurement invariance (i.e., equal intercepts across groups or time), and strict (default when ordered = TRUE) to evaluate configural, threshold, metric, scalar, and strict measurement invariance (i.e., equal residual variances or scaling factors across groups or time). Note that threshold measurement invariance is only available when evaluating measurement invariance for ordered categorical indicators. In this case, threshold measurement invariance can only be investigated when all indicators have at least four response categories. In addition, metric measurement invariance cannot be investigated when all indicators have only two response categories, i.e., binary indicators.
partial: a list of character vectors named load for freeing factor loadings, inter for freeing intercepts, and/or resid for freeing residual variances when evaluating between-group measurement invariance based on two groups (see Example 4a) or longitudinal measurement invariance (see Example 11). When evaluating between-group measurement invariance based on more than two groups, a list with lists named with e.g., in case of three groups g1 for group 1, g2 for group 2, and/or g3 for group 3 with these lists containing character vectors named load for freeing factor loadings, inter for freeing intercepts, and/or resid for freeing residual variances in specific groups. Note that at least two invariant indicators per latent variable are needed for a partial measurement invariance model. Otherwise there might be issues with model non-identification.
ident: a character string indicating the method used for identifying and scaling latent variables, i.e., "marker" for the marker variable method fixing the first factor loading of the latent variable to 1 and fixing the first intercept to 0, "var" (default) for the fixed variance method fixing the variance of the latent variable to 1 and the latent mean to 0, or "effect" for the effects-coding method using equality constraints so that the average of the factor loadings of the latent variable equals 1 and the sum of intercepts equals 0. Note that measurement invariance evaluation for ordered categorical indicators can only be conducted based on the fixed variance method ("var").
estimator: a character string indicating the estimator to be used (see 'Details' in the help page of the item.cfa() function). By default, "MLR" is used for CFA models with continuous indicators and "WLSMV" is used for CFA models with ordered categorical indicators. Note that the estimators "ML", "MLM", "MLMV", "MLMVS", "MLF" and "MLR" are not available when ordered = TRUE.
missing: a character string indicating how to deal with missing data, i.e., "listwise" for listwise deletion, "pairwise" for pairwise deletion, "fiml" for full information maximum likelihood method, "two.stage" for two-stage maximum likelihood method, "robust.two.stage" for robust two-stage maximum likelihood method, and "doubly.robust" for doubly-robust method (see 'Details' in the help page of the item.cfa() function). By default, "fiml" is used for CFA models with continuous indicators and "listwise" is used for CFA models with ordered categorical indicators given that "fiml" is not available for a limited-information estimator used to estimate the CFA model with ordered categorical indicators. Note that the argument missing switches to listwise when the data set is complete. Also note that the robust CFI, TLI, and RMSEA are different in complete data depending on whether FIML or listwise deletion was specified when estimating the model in lavaan.
null.model: logical: if TRUE (default), the proper null model for computing incremental fit indices (i.e., CFI and TLI) is used, i.e., means and variances of the indicators are constrained to be equal across groups or time in the null model (Little, 2013, p. 112). Note that the function does not provide the proper null model specification when evaluating measurement invariance for ordered categorical indicators, i.e., the argument will switch to FALSE when specifying ordered = TRUE.
print: a character string or character vector indicating which results to show on the console, i.e., "all" for all results, "summary" for a summary of the specification (e.g., estimation and optimization method, test statistic, missing data handling, and identification method), "partial" for a summary of the partial measurement invariance specification listing parameters that are freely estimated when partial is not NULL, "coverage" for the variance-covariance coverage of the data, "descript" for descriptive statistics for continuous variables (ordered = FALSE) and item frequencies for ordered categorical variables (ordered = TRUE), "fit" for model fit and model comparison, "est" for parameter estimates, "modind" for modification indices, and "resid" for the residual correlation matrix and standardized residual means. By default, a summary of the specification, model fit, and parameter estimates are printed. Note that parameter estimates, modification indices, and the residual correlation matrix are only provided for the model investigating the level of measurement invariance specified in the argument invar.
print.fit: a character string or character vector indicating which version of the CFI, TLI, and RMSEA to show on the console when using a robust estimation method involving a scaling correction factor, i.e., "all" for all versions of the CFI, TLI, and RMSEA, "standard" (default when estimator is one of "ML", "MLF", "GLS", "WLS", "DWLS", "ULS", "PML") for fit indices without any non-normality correction, "scaled" (default when ordered = TRUE) for population-corrected robust fit indices with ad hoc non-normality correction, and "robust" (default when estimator is one of "MLM", "MLMV", "MLMVS", "MLR", "WLSM", "WLSMV", "ULSM", "ULSMV", "DLS") for sample-corrected robust fit indices based on the formulas provided by Li and Bentler (2006) and Brosseau-Liard and Savalei (2014).
mod.minval: numeric value to filter modification indices and only show modifications with a modification index value equal to or higher than this minimum value. By default, modification indices equal to or higher than 6.63 are printed. Note that a modification index value of 6.63 is equivalent to a significance level of \(\alpha = .01\).
resid.minval: numeric value indicating the minimum absolute residual correlation coefficients and standardized means to highlight in boldface. By default, absolute residual correlation coefficients and standardized means equal to or higher than 0.1 are highlighted. Note that highlighting can be disabled by setting the minimum value to 1.
lavaan.run: logical: if TRUE (default), all models for evaluating measurement invariance will be estimated by using the cfa() function from the R package lavaan.
digits: an integer value indicating the number of decimal places to be used for displaying results. Note that information criteria and chi-square test statistic are printed with digits minus 1 decimal places.
p.digits: an integer value indicating the number of decimal places to be used for displaying p-values, covariance coverage (i.e., p.digits - 1), and residual correlation coefficients.
as.na: a numeric vector indicating user-defined missing values, i.e., these values are converted to NA before conducting the analysis. Note that the as.na() function is only applied to data but not to group or cluster.
write: a character string naming a file for writing the output into either a text file with file extension ".txt" (e.g., "Output.txt") or Excel file with file extension ".xlsx" (e.g., "Output.xlsx"). If the file name does not contain any file extension, an Excel file will be written.
append: logical: if TRUE (default), output will be appended to an existing text file with extension .txt specified in write; if FALSE, the existing text file will be overwritten.
check: logical: if TRUE (default), argument specification is checked and convergence and model identification checks are conducted for all estimated models.
output: logical: if TRUE (default), output is shown.
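
The default value of mod.minval can be checked against the chi-square distribution. The following base R sketch does not call item.invar; it assumes that a modification index is referred to a chi-square distribution with one degree of freedom, and the object names crit.01 and crit.05 are illustrative only.

```r
# Critical value of the chi-square distribution with df = 1 at alpha = .01,
# which corresponds to the default 'mod.minval = 6.63'
crit.01 <- qchisq(1 - 0.01, df = 1)
round(crit.01, digits = 2)

# A more liberal cutoff at alpha = .05 that could be passed to 'mod.minval'
crit.05 <- qchisq(1 - 0.05, df = 1)
round(crit.05, digits = 2)
```

A value computed this way can be supplied directly, e.g., mod.minval = crit.05, when a less strict filter for modification indices is wanted.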

Author

Takuya Yanagida takuya.yanagida@univie.ac.at

References

Brosseau-Liard, P. E., & Savalei, V. (2014). Adjusting incremental fit indices for nonnormality. Multivariate Behavioral Research, 49, 460-470. https://doi.org/10.1080/00273171.2014.933697

Li, L., & Bentler, P. M. (2006). Robust statistical tests for evaluating the hypothesis of close fit of misspecified mean and covariance structural models. UCLA Statistics Preprint #506. University of California.

Little, T. D. (2013). Longitudinal structural equation modeling. Guilford Press.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48, 1-36. https://doi.org/10.18637/jss.v048.i02

Wu, H., & Estabrook, R. (2016). Identification of confirmatory factor analysis models of different levels of invariance for ordered categorical outcomes. Psychometrika, 81(4), 1014-1045. https://doi.org/10.1007/s11336-016-9506-0

Examples

if (FALSE) {
# Load data set "HolzingerSwineford1939" in the lavaan package
data("HolzingerSwineford1939", package = "lavaan")

#----------------------------------------------------------------------------
# Between-Group Measurement Invariance: Continuous Indicators

#..................
# Measurement model with one factor

# Example 1a: Model specification using the argument '...'
item.invar(HolzingerSwineford1939, x1, x2, x3, x4, group = "sex")

# Example 1b: Alternative model specification without using the argument '...'
item.invar(HolzingerSwineford1939[, c("x1", "x2", "x3", "x4")],
           group = HolzingerSwineford1939$sex)

# Example 1c: Alternative model specification without using the argument '...'
item.invar(HolzingerSwineford1939[, c("x1", "x2", "x3", "x4", "sex")], group = "sex")

# Example 1d: Alternative model specification using the argument 'model'
item.invar(HolzingerSwineford1939, model = c("x1", "x2", "x3", "x4"), group = "sex")

#..................
# Measurement model with two factors

# Example 2: Model specification using the argument 'model'
item.invar(HolzingerSwineford1939,
           model = list(c("x1", "x2", "x3", "x4"), c("x5", "x6", "x7", "x8")),
           group = "sex")

#..................
# Configural, metric, scalar, and strict measurement invariance

# Example 3: Evaluate configural, metric, scalar, and strict measurement invariance
item.invar(HolzingerSwineford1939, model = c("x1", "x2", "x3", "x4"),
           group = "sex", invar = "strict")

#..................
# Between-group partial measurement invariance

# Example 4a: Two Groups
#             Free factor loadings for 'x2' and 'x3'
#             Free intercept for 'x1'
#             Free residual variance for 'x4'
item.invar(HolzingerSwineford1939, model = c("x1", "x2", "x3", "x4"),
           group = "sex", invar = "strict",
           partial = list(load = c("x2", "x3"),
                          inter = "x1",
                          resid = "x4"))

# Example 4b: More than Two Groups
#             Free factor loading for 'x2' in group 2
#             Free factor loading for 'x4' in group 1 and 4
#             Free intercept for 'x1' in group 3
#             Free residual variance for 'x3' in group 1 and 3
item.invar(HolzingerSwineford1939, model = c("x1", "x2", "x3", "x4"),
           group = "ageyr", invar = "strict",
           partial = list(load = list(x2 = "g2", x4 = c("g1", "g4")),
                          inter = list(x1 = "g3"),
                          resid = list(x3 = c("g1", "g3"))))

#..................
# Residual covariances

# Example 5a: One residual covariance
item.invar(HolzingerSwineford1939, model = c("x1", "x2", "x3", "x4"),
           rescov = c("x3", "x4"), group = "sex")

# Example 5b: Two residual covariances
item.invar(HolzingerSwineford1939, model = c("x1", "x2", "x3", "x4"),
           rescov = list(c("x1", "x4"), c("x3", "x4")), group = "sex")

#..................
# Scaled test statistic

# Example 6a: Specify cluster variable using a variable name in 'data'
item.invar(HolzingerSwineford1939, model = c("x1", "x2", "x3", "x4"),
           group = "sex", cluster = "agemo")

# Example 6b: Specify cluster variable as vector
item.invar(HolzingerSwineford1939, model = c("x1", "x2", "x3", "x4"),
           group = "sex", cluster = HolzingerSwineford1939$agemo)

#..................
# Default Null model

# Example 7: Specify default null model for computing incremental fit indices
item.invar(HolzingerSwineford1939, model = c("x1", "x2", "x3", "x4"),
           group = "sex", null.model = FALSE)

#..................
# Print argument

# Example 8a: Request all results
item.invar(HolzingerSwineford1939, model = c("x1", "x2", "x3", "x4"),
           group = "sex", print = "all")

# Example 8b: Request fit indices with ad hoc non-normality correction
item.invar(HolzingerSwineford1939, model = c("x1", "x2", "x3", "x4"),
           group = "sex", print.fit = "scaled")

# Example 8c: Request modification indices with value equal or higher than 2
# and highlight residual correlations equal or higher than 0.3
item.invar(HolzingerSwineford1939, model = c("x1", "x2", "x3", "x4"),
           group = "sex", print = c("modind", "resid"),
           mod.minval = 2, resid.minval = 0.3)

#..................
# Model syntax and lavaan summary of the estimated model

# Example 9a: Model specification using the argument '...'
mod1 <- item.invar(HolzingerSwineford1939, x1, x2, x3, x4, group = "sex",
                   output = FALSE)

# lavaan summary of the scalar invariance model
lavaan::summary(mod1$model.fit$scalar, standardized = TRUE, fit.measures = TRUE)

# Example 9b: Do not estimate any models
mod2 <- item.invar(HolzingerSwineford1939, x1, x2, x3, x4, group = "sex",
                   lavaan.run = FALSE)

# lavaan model syntax metric invariance model
cat(mod2$model$metric)

# lavaan model syntax scalar invariance model
cat(mod2$model$scalar)

#----------------------------------------------------------------------------
# Longitudinal Measurement Invariance: Continuous Indicators

# Example 10: Two time points with three indicators at each time point
item.invar(HolzingerSwineford1939,
           model = list(c("x1", "x2", "x3"), c("x5", "x6", "x7")), long = TRUE)

#..................
# Longitudinal partial measurement invariance

# Example 11: Two time points with three indicators at each time point
#             Free factor loading for 'x2'
#             Free intercepts for 'x1' and 'x2'
item.invar(HolzingerSwineford1939,
           model = list(c("x1", "x2", "x3"), c("x5", "x6", "x7")), long = TRUE,
           partial = list(load = "x2",
                          inter = c("x1", "x2")))

#----------------------------------------------------------------------------
# Between-Group Measurement Invariance: Ordered Categorical Indicators
#
# Note that the example analyses for ordered categorical indicators cannot be
# conducted since the data set 'data' is not available.

# Example 12a: Delta parameterization (default)
item.invar(data, item1, item2, item3, item4, group = "two.group", ordered = TRUE)

# Example 12b: Theta parameterization
item.invar(data, item1, item2, item3, item4, group = "two.group", ordered = TRUE,
           parameterization = "theta")

#----------------------------------------------------------------------------
# Between-Group Partial Measurement Invariance: Ordered Categorical Indicators

# Example 13a: Two Groups
#              Free 2nd and 4th threshold of 'item1'
#              Free 1st threshold of 'item3'
#              Free factor loadings for 'item2' and 'item4'
#              Free intercept for 'item1'
#              Free residual variance for 'item3'
item.invar(data, item1, item2, item3, item4, group = "two.group", ordered = TRUE,
           partial = list(thres = list(item1 = c("t2", "t4"),
                                       item3 = "t1"),
                          load = c("item2", "item4"),
                          inter = "item1",
                          resid = "item3"))

# Example 13b: More than Two Groups
#              Free 1st threshold of 'item1' in group 1 and 2
#              Free 3rd threshold of 'item3' in group 3
#              Free factor loadings for 'item2' in group 1
#              Free intercept for 'item2' in group 1
#              Free intercept for 'item3' in group 2 and 4
#              Free residual variance for 'item1' in group 1 and 3
item.invar(data, item1, item2, item3, item4, group = "four.group", ordered = TRUE,
           partial = list(thres = list(item1 = list(t1 = c("g1", "g2")),
                                       item3 = list(t3 = "g3")),
                          load  = list(item2 = "g1"),
                          inter = list(item2 = "g1", item3 = c("g2", "g4")),
                          resid = list(item1 = c("g1", "g3"))))

#----------------------------------------------------------------------------
# Longitudinal Measurement Invariance: Ordered Categorical Indicators

# Example 14: Two Time Points
item.invar(data, model = list(c("aitem1", "aitem2", "aitem3"),
                              c("bitem1", "bitem2", "bitem3")),
           long = TRUE, ordered = TRUE)

#..................
# Longitudinal partial measurement invariance: Ordered Categorical Indicators

# Example 15: Two Time Points
#             Free 2nd and 4th threshold of 'aitem1'
#             Free 1st threshold of 'aitem3'
#             Free factor loading for 'aitem2'
#             Free intercepts for 'aitem1' and 'bitem2'
#             Free residual variance for 'aitem3'
item.invar(data, model = list(c("aitem1", "aitem2", "aitem3"),
                              c("bitem1", "bitem2", "bitem3")),
           long = TRUE, ordered = TRUE, invar = "strict",
           partial = list(thres = list(aitem1 = c("t2", "t4"), aitem3 = "t1"),
                          load = "aitem2",
                          inter = c("aitem1", "bitem2"),
                          resid = "aitem3"))

#------------------------------------------------
# Write Results

# Example 16a: Write Results into a text file
item.invar(HolzingerSwineford1939, model = c("x1", "x2", "x3", "x4"),
           group = "sex", print = "all", write = "Invariance.txt", output = FALSE)

# Example 16b: Write Results into an Excel file
item.invar(HolzingerSwineford1939, model = c("x1", "x2", "x3", "x4"),
           group = "sex", print = "all", write = "Invariance.xlsx", output = FALSE)
}
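
Because the ordered categorical examples require that all indicators have the same number of response categories and no zero cell counts within any group, it can help to screen the data before calling item.invar with ordered = TRUE. The following base R sketch uses a simulated data frame dat with hypothetical variables item1 to item4 and group; none of these objects are part of the package, and the empty cell in item4 is introduced on purpose for illustration.

```r
# Hypothetical example data: four items with four response categories, two groups
set.seed(123)
dat <- data.frame(group = rep(c("g1", "g2"), each = 100),
                  item1 = sample(1:4, size = 200, replace = TRUE),
                  item2 = sample(1:4, size = 200, replace = TRUE),
                  item3 = sample(1:4, size = 200, replace = TRUE),
                  item4 = sample(1:4, size = 200, replace = TRUE))

# Deliberately empty the highest category of 'item4' in group 'g2'
dat$item4[dat$group == "g2" & dat$item4 == 4] <- 3

items <- c("item1", "item2", "item3", "item4")

# Number of observed response categories per item (should be identical)
n.cat <- sapply(dat[, items], function(y) length(unique(na.omit(y))))

# Frequency table of each item by group based on the pooled response categories
cell <- lapply(dat[, items], function(y) {
  table(factor(y, levels = sort(unique(y))), dat$group)
})

# Flag items with at least one empty response category in any group
zero.cell <- sapply(cell, function(tab) any(tab == 0))

n.cat
zero.cell
```

Items flagged in zero.cell would need attention, e.g., collapsing sparse categories, before evaluating measurement invariance for ordered categorical indicators. For longitudinal models, the same idea applies per time point instead of per group.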