plausibleValues: Plausible-Values Imputation of Factor Scores Estimated from a lavaan Model

Description

Draw plausible values of factor scores estimated from a fitted lavaan model, then treat them as multiple imputations of missing data using runMI.

Usage

plausibleValues(object, nDraws = 20L, seed = 12345,
  omit.imps = c("no.conv", "no.se"), ...)

Arguments

object

A fitted model of class lavaan, blavaan, or lavaan.mi.

nDraws

integer specifying the number of draws, analogous to the number of imputed data sets. If object is of class lavaan.mi, this will be the number of draws taken per imputation. Ignored if object is of class blavaan, in which case the number of draws is the number of MCMC samples from the posterior.

seed

integer passed to set.seed(). Ignored if object is of class blavaan.

omit.imps

character vector specifying criteria for omitting imputations when object is of class lavaan.mi. Can include any of c("no.conv", "no.se", "no.npd").

...

Optional arguments to pass to lavPredict. assemble will be ignored because multiple groups are always assembled into a single data.frame per draw. type will be ignored because it is set internally to type="lv".

Value

A list of length nDraws, each element of which is a data.frame containing plausible values. The list can be treated as a list of imputed data sets to be passed to runMI (see Examples). If object is of class lavaan.mi, the list will be of length nDraws*m, where m is the number of imputations.
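As a rough illustration of what "treating the list as imputed data sets" means, the base-R sketch below fits the same Step-2 model to each data.frame in a list and pools the results with Rubin's rules. Everything here is an assumption made for illustration: pv_list is simulated rather than produced by plausibleValues, the Step-2 model is a plain lm() regression, and the pooling is a hand-rolled simplification, not the code that runMI runs.

```r
## Hypothetical stand-in for the list returned by plausibleValues():
## 5 data.frames, each holding one draw of two factor scores
set.seed(2024)
n <- 200
true_x <- rnorm(n)
true_y <- 0.5 * true_x + rnorm(n, sd = 0.8)
pv_list <- lapply(1:5, function(d) {
  data.frame(case.idx = seq_len(n),
             textual  = true_x + rnorm(n, sd = 0.3),  # draw-specific noise
             visual   = true_y + rnorm(n, sd = 0.3))
})

## Step 2: fit the same model once per data set
fits <- lapply(pv_list, function(d) lm(visual ~ textual, data = d))
est  <- sapply(fits, function(f) coef(f)["textual"])
se   <- sapply(fits, function(f) sqrt(diag(vcov(f)))["textual"])

## Rubin's rules: combine within-draw and between-draw variability
m         <- length(fits)
pooled    <- mean(est)                 # pooled point estimate
within    <- mean(se^2)                # average sampling variance
between   <- var(est)                  # variance of estimates across draws
pooled_se <- sqrt(within + (1 + 1/m) * between)
c(estimate = pooled, se = pooled_se)
```

The between-draw term is what a single set of factor-score estimates cannot supply, which is why the pooled standard error is never smaller than the average within-draw standard error.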

Details

Because latent variables are unobserved, they can be considered missing data, which can be imputed using Monte Carlo methods. This may be of interest to researchers with sample sizes too small to fit their complex structural models. After fitting a factor model as a first step (Step 1), lavPredict provides factor-score estimates, which can be treated as observed values in a path analysis (Step 2). However, the resulting standard errors and test statistics cannot be trusted, because the Step-2 analysis does not take into account the uncertainty about the estimated factor scores.

Using the asymptotic sampling covariance matrix of the factor scores provided by lavPredict, plausibleValues draws a set of nDraws imputations from the sampling distribution of each factor score, returning a list of data sets that can be treated like multiple imputations of incomplete data. If the data were already imputed to handle missing data, plausibleValues also accepts an object of class lavaan.mi and will draw nDraws plausible values from each imputation. Step 2 would then take into account uncertainty about both missing values and factor scores.

Bayesian methods can also be used to generate factor scores, as available with the blavaan package, in which case plausible values are simply saved parameters from the posterior distribution. See Asparouhov and Muthen (2010) for further technical details and references.
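The drawing step can be pictured with a minimal base-R sketch. It assumes a single factor with made-up factor-score estimates (fs_est) and made-up standard errors (fs_se), and draws each case's plausible value from a univariate normal distribution. This is a deliberate simplification: it ignores the covariances among factors that the full sampling covariance matrix would carry, and it is not the internal code of plausibleValues.

```r
set.seed(12345)
n      <- 6                                   # hypothetical number of cases
nDraws <- 3
fs_est <- c(-1.2, -0.4, 0.0, 0.3, 0.9, 1.5)   # made-up factor-score estimates
fs_se  <- rep(0.35, n)                        # made-up standard errors

## One data.frame per draw: each case's value is sampled around its estimate
draws <- lapply(seq_len(nDraws), function(d) {
  data.frame(case.idx = seq_len(n),
             visual   = rnorm(n, mean = fs_est, sd = fs_se))
})
str(draws[[1]])
```

Each element of draws has the same rows but different values, so the spread of a case's values across draws reflects the uncertainty about that case's factor score.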

Each returned data.frame includes a case.idx column that indicates the corresponding rows in the data set to which the model was originally fitted (unless the user requests only Level-2 variables). This can be used to merge the plausible values with the original observed data, but users should note that including any new variables in a Step-2 model might not accurately account for their relationship(s) with factor scores because they were not accounted for in the Step-1 model from which factor scores were estimated.

If object is a multilevel lavaan model, users can request plausible values for latent variables at particular levels of analysis by setting the lavPredict argument level=1 or level=2. If the level argument is not passed via ..., then both levels are returned in a single merged data set per draw. For multilevel models, each returned data.frame also includes a column indicating to which cluster each row belongs (unless the user requests only Level-2 variables).
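To give a sense of what a merged two-level data set per draw looks like, the base-R sketch below joins hypothetical Level-2 plausible values (one row per cluster) onto hypothetical Level-1 plausible values (one row per case) by a cluster column. The values and the column names fw, fb, and cluster are invented for illustration; the actual layout returned by plausibleValues may differ.

```r
## Hypothetical Level-1 draw: one row per case, with its cluster membership
lvl1 <- data.frame(case.idx = 1:6,
                   cluster  = rep(1:3, each = 2),
                   fw       = c(0.2, -0.5, 1.1, 0.4, -0.9, 0.0))

## Hypothetical Level-2 draw: one row per cluster
lvl2 <- data.frame(cluster = 1:3,
                   fb      = c(0.7, -0.3, 0.1))

## Repeat each cluster's Level-2 value on every row of that cluster
merged <- merge(lvl1, lvl2, by = "cluster")
merged <- merged[order(merged$case.idx), ]   # restore original case order
merged
```

Level-2 values are constant within a cluster, so a Step-2 analysis of such a data set would still need to respect the clustering.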

References

Asparouhov, T. & Muthen, B. O. (2010). Plausible values for latent variables using Mplus. Technical Report. Retrieved from www.statmodel.com/download/Plausible.pdf

Examples

# NOT RUN {
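library(lavaan)    # cfa(), sem(), lavInspect(), example data sets
library(semTools)  # plausibleValues()

## example from ?cfa and ?lavPredict help pages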
HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

fit1 <- cfa(HS.model, data = HolzingerSwineford1939)
fs1 <- plausibleValues(fit1, nDraws = 3,
                       ## lavPredict() can add only the modeled data
                       append.data = TRUE)
lapply(fs1, head)

## To merge factor scores to original data.frame (not just modeled data)
fs1 <- plausibleValues(fit1, nDraws = 3)
idx <- lavInspect(fit1, "case.idx")      # row index for each case
if (is.list(idx)) idx <- do.call(c, idx) # for multigroup models
data(HolzingerSwineford1939)             # copy data to workspace
HolzingerSwineford1939$case.idx <- idx   # add row index as variable
## loop over draws to merge original data with factor scores
for (i in seq_along(fs1)) {
  fs1[[i]] <- merge(fs1[[i]], HolzingerSwineford1939, by = "case.idx")
}
lapply(fs1, head)


## multiple-group analysis, in 2 steps
step1 <- cfa(HS.model, data = HolzingerSwineford1939, group = "school",
            group.equal = c("loadings","intercepts"))
PV.list <- plausibleValues(step1)

## subsequent path analysis
path.model <- ' visual ~ c(t1, t2)*textual + c(s1, s2)*speed '
step2 <- sem.mi(path.model, data = PV.list, group = "school")
## test equivalence of both slopes across groups
lavTestWald.mi(step2, constraints = 't1 == t2 ; s1 == s2')

## multilevel example from ?Demo.twolevel help page
model <- '
  level: 1
    fw =~ y1 + y2 + y3
    fw ~ x1 + x2 + x3
  level: 2
    fb =~ y1 + y2 + y3
    fb ~ w1 + w2
'
msem <- sem(model, data = Demo.twolevel, cluster = "cluster")
mlPVs <- plausibleValues(msem, nDraws = 3) # both levels by default
lapply(mlPVs, head, n = 10)
## only Level 1
mlPV1 <- plausibleValues(msem, nDraws = 3, level = 1)
lapply(mlPV1, head)
## only Level 2
mlPV2 <- plausibleValues(msem, nDraws = 3, level = 2)
lapply(mlPV2, head)

# }
