Reruns a lavaan analysis several times, each time with one case removed.
lavaan_rerun(
  fit,
  case_id = NULL,
  to_rerun,
  md_top,
  resid_md_top,
  allow_inadmissible = FALSE,
  skip_all_checks = FALSE,
  parallel = FALSE,
  ncores = NULL,
  makeCluster_args = list(spec = getOption("cl.cores", 2)),
  progress = TRUE,
  rerun_method = c("lavaan", "update")
)
A lavaan_rerun-class object, which is a list with the following elements:
rerun: The n lavaan output objects.
fit: The original output from lavaan.
post_check: A list of length equal to n. Each analysis was checked by lavaan::lavTech(x, "post.check"), x being the lavaan results. The results of this test are stored in this list. If the value is TRUE, the estimation converged and the solution is admissible. If not TRUE, it is a warning message issued by lavaan::lavTech().
converged: A vector of length equal to n. Each analysis was checked by lavaan::lavTech(x, "converged"), x being the lavaan results. The results of this test are stored in this vector. If the value is TRUE, the estimation converged. If not TRUE, then the estimation failed to converge when the corresponding case was excluded.
call: The call to lavaan_rerun().
selected: A numeric vector of the row numbers of cases selected in the analysis. Its length should be equal to the length of rerun.
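The elements listed above can be inspected directly after a run. The following is a minimal sketch, assuming that semfindr and lavaan are installed and using the pa_dat sample data that also appears in the examples on this page; the 20-case subset and the object name all_ok are arbitrary choices made for illustration.

```r
library(lavaan)
library(semfindr)

# Keep the run short: use only the first 20 cases of the sample data
dat <- pa_dat[1:20, ]
mod <- "
m1 ~ iv1 + iv2
dv ~ m1
"
fit <- sem(mod, dat)
fit_rerun <- lavaan_rerun(fit, progress = FALSE)

# One refitted lavaan object per selected case
length(fit_rerun$rerun)

# Convergence status of each leave-one-out run
table(fit_rerun$converged)

# post_check is a list: an element is TRUE or a warning message,
# so isTRUE() is used to flag the admissible solutions
all_ok <- sapply(fit_rerun$post_check, isTRUE)
table(all_ok)

# Row numbers of the cases that were removed, one per run
fit_rerun$selected
```

Checking converged and post_check before computing influence statistics helps identify runs whose results should be interpreted with caution.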
fit: The output from lavaan::lavaan() or its wrappers (e.g., lavaan::cfa() and lavaan::sem()).
case_id: If it is a character vector of length equal to the number of cases (the number of rows in the data in fit), then it is the vector of case identification values. If it is NULL, the default, then case.idx used by lavaan functions will be used as case identification values. The case identification values will be used to name the list of n outputs.
to_rerun: The cases to be processed. If case_id is specified, this should be a subset of case_id. If case_id is not specified, then this should be a vector of integers indicating the rows to be processed, as they appear in the data in fit. to_rerun cannot be used together with md_top or resid_md_top.
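As an illustration of the two ways to select cases with to_rerun, here is a short sketch that assumes semfindr and lavaan are installed. The case labels in ids are hypothetical and are created only for this example; pa_dat is the sample data used in the examples on this page.

```r
library(lavaan)
library(semfindr)

dat <- pa_dat[1:30, ]
mod <- "
m1 ~ iv1 + iv2
dv ~ m1
"
fit <- sem(mod, dat)

# Hypothetical case labels, one per row of the data
ids <- paste0("case_", seq_len(nrow(dat)))

# With case_id supplied, to_rerun is a subset of the labels
rerun_by_id <- lavaan_rerun(fit,
                            case_id = ids,
                            to_rerun = c("case_3", "case_12"),
                            progress = FALSE)
names(rerun_by_id$rerun)

# Without case_id, to_rerun is a vector of row numbers
rerun_by_row <- lavaan_rerun(fit,
                             to_rerun = c(3, 12),
                             progress = FALSE)
rerun_by_row$selected
```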
md_top: The number of cases to be processed based on the Mahalanobis distance computed on all observed variables used in the model. The cases will be ranked from the largest to the smallest distance, and the top md_top case(s) will be processed. md_top cannot be used together with to_rerun or resid_md_top.
resid_md_top: The number of cases to be processed based on the Mahalanobis distance computed from the residuals of outcome variables. The cases will be ranked from the largest to the smallest distance, and the top resid_md_top case(s) will be processed. resid_md_top cannot be used together with to_rerun or md_top.
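The two distance-based selection options can be compared on the same model. This is a sketch only, assuming semfindr and lavaan are installed; the choice of five cases is arbitrary. The path model below has no latent factors, which is the situation in which residual-based distance is supported.

```r
library(lavaan)
library(semfindr)

dat <- pa_dat[1:50, ]
mod <- "
m1 ~ iv1 + iv2
dv ~ m1
"
fit <- sem(mod, dat)

# Rerun only the 5 cases with the largest Mahalanobis distance
# computed on all observed variables in the model
rerun_md <- lavaan_rerun(fit, md_top = 5, progress = FALSE)

# Rerun only the 5 cases with the largest Mahalanobis distance
# computed from the residuals of the outcome variables
rerun_resid <- lavaan_rerun(fit, resid_md_top = 5, progress = FALSE)

# The two criteria need not pick the same rows
rerun_md$selected
rerun_resid$selected
```

Only one of to_rerun, md_top, and resid_md_top should be supplied in a single call.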
allow_inadmissible: If TRUE, accepts a fit object with inadmissible results (i.e., post.check from lavaan::lavInspect() is FALSE). Default is FALSE.
skip_all_checks: If TRUE, skips all checks and allows users to run this function on any object of the lavaan class. This lets users experiment with this and other functions on models that are not officially supported. Default is FALSE.
parallel: Whether parallel processing will be used. If TRUE, functions in the parallel package will be used to rerun the analysis. Currently, only "snow"-type clusters using local CPU cores are supported. Default is FALSE.
ncores: The number of CPU cores to use if parallel processing is requested. Default is NULL, in which case the number of cores is determined by makeCluster_args. If set to an integer, this number will override the setting (spec) in makeCluster_args.
makeCluster_args: A named list of arguments to be passed to parallel::makeCluster(). Default is list(spec = getOption("cl.cores", 2)). If only the number of cores needs to be specified, use list(spec = x), where x is the number of cores to use. Alternatively, set ncores and its value will be used in spec.
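The two ways of setting the number of cores described above can be sketched as follows. This assumes semfindr and lavaan are installed and that the machine can start a local cluster with two workers; the value 2 is chosen only for illustration.

```r
library(lavaan)
library(semfindr)

dat <- pa_dat[1:50, ]
mod <- "
m1 ~ iv1 + iv2
dv ~ m1
"
fit <- sem(mod, dat)

# Option 1: set the number of cores directly with ncores
rerun_a <- lavaan_rerun(fit,
                        parallel = TRUE,
                        ncores = 2,
                        progress = FALSE)

# Option 2: pass the number of cores through makeCluster_args,
# which is forwarded to parallel::makeCluster()
rerun_b <- lavaan_rerun(fit,
                        parallel = TRUE,
                        makeCluster_args = list(spec = 2),
                        progress = FALSE)

# Both runs should cover the same cases
identical(rerun_a$selected, rerun_b$selected)
```

For a small dataset like this one, the overhead of starting a cluster can outweigh the gain, so serial processing may be just as fast.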
progress: If TRUE, the default, progress will be displayed on screen.
rerun_method: How fit will be rerun. Default is "lavaan". An alternative method is "update". For internal use. If "lavaan" returns an error, try setting this argument to "update".
Shu Fai Cheung https://orcid.org/0000-0002-9871-9448.
lavaan_rerun() takes a lavaan::lavaan() output and reruns the analysis n0 times, using the same arguments and options as in the output, where n0 is the number of cases selected (by default, all cases in the analysis). In each run, one case is removed.
Optionally, users can rerun the analysis with only selected cases removed. These cases can be specified by case IDs, by Mahalanobis distance computed from all variables used in the model, or by Mahalanobis distance computed from the residuals (observed scores - implied scores) of observed outcome variables. See the help on the arguments to_rerun, md_top, and resid_md_top.
It is not recommended to use Mahalanobis distance computed from all variables, especially for models with observed variables as predictors (Pek & MacCallum, 2011). Cases that are extreme on predictors may not be influential on the parameter estimates. Nevertheless, this distance is reported in some SEM programs and so this option is provided.
Mahalanobis distance based on residuals is supported for models with no latent factors. The implied scores are computed by implied_scores().
If the sample size is large, it is recommended to use parallel processing. However, it is possible that parallel processing will fail. If this is the case, try serial processing by removing the argument parallel or setting it to FALSE.
Many other functions in semfindr use the output from lavaan_rerun(). Instead of running the n analyses every time, run this step once; the stored output can then be used to compute any of the influence statistics quickly.
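The run-once, reuse-many-times workflow can be sketched as below. This assumes semfindr and lavaan are installed; est_change() and fit_measures_change() are semfindr functions that take the lavaan_rerun() output as their first argument, and the object names are arbitrary.

```r
library(lavaan)
library(semfindr)

dat <- pa_dat[1:50, ]
mod <- "
m1 ~ iv1 + iv2
dv ~ m1
"
fit <- sem(mod, dat)

# The expensive step: refit the model once per case
fit_rerun <- lavaan_rerun(fit, progress = FALSE)

# Cheap steps: reuse the stored reruns for different statistics
# Standardized changes in parameter estimates when a case is removed
est_chg <- est_change(fit_rerun)
# Changes in selected fit measures when a case is removed
fm_chg <- fit_measures_change(fit_rerun)

head(est_chg)
head(fm_chg)
```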
If the analysis takes a few minutes to run because of a large number of cases or a long processing time in fitting the model, it is recommended to save the output to an external file (e.g., by base::saveRDS()).
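Saving and restoring the output might look like the following sketch, which assumes semfindr and lavaan are installed. The file name and the use of tempdir() are placeholders; in practice, a project folder would be a more useful location.

```r
library(lavaan)
library(semfindr)

dat <- pa_dat[1:20, ]
mod <- "
m1 ~ iv1 + iv2
dv ~ m1
"
fit <- sem(mod, dat)
fit_rerun <- lavaan_rerun(fit, progress = FALSE)

# Placeholder path: replace with a location in your project
out_file <- file.path(tempdir(), "fit_rerun.rds")

# Save the whole lavaan_rerun object
saveRDS(fit_rerun, out_file)

# In a later session, load it instead of rerunning the analyses
fit_rerun_loaded <- readRDS(out_file)
length(fit_rerun_loaded$rerun)
```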
Supports both single-group and multiple-group models. (Support for multiple-group models is available in version 0.1.4.8 and later.)
library(lavaan)
# semfindr provides lavaan_rerun() and the pa_dat sample data
library(semfindr)
dat <- pa_dat
# For illustration, select only the first 50 cases
dat <- dat[1:50, ]
# The model
mod <-
"
m1 ~ iv1 + iv2
dv ~ m1
"
# Fit the model
fit <- lavaan::sem(mod, dat)
summary(fit)
# Fit the model n times. Each time with one case removed.
fit_rerun <- lavaan_rerun(fit, parallel = FALSE)
# Print the output for a brief description of the runs
fit_rerun
# Results excluding the first case
fitMeasures(fit_rerun$rerun[[1]], c("chisq", "cfi", "tli", "rmsea"))
# Results by manually excluding the first case
fit_01 <- lavaan::sem(mod, dat[-1, ])
fitMeasures(fit_01, c("chisq", "cfi", "tli", "rmsea"))