Standardizes selected
variables in a lavaan model without
refitting the model, handles
product terms correctly, and skips
categorical predictors in
standardization.
lav_betaselect(
object,
to_standardize = ".all.",
not_to_standardize = NULL,
skip_categorical_x = TRUE,
output = c("data.frame", "text"),
std_se = c("none", "delta", "bootstrap"),
std_z = TRUE,
std_pvalue = TRUE,
std_ci = TRUE,
level = 0.95,
progress = TRUE,
boot_out = NULL,
bootstrap = 100L,
store_boot_est = TRUE,
parallel = c("no", "snow", "multicore"),
ncpus = parallel::detectCores(logical = FALSE) - 1,
cl = NULL,
iseed = NULL,
find_product_terms = TRUE,
check_mean_centering = FALSE,
std_intercept = FALSE,
...,
delta_method = c("lavaan", "numDeriv"),
vector_form = TRUE
)
A lav_betaselect-class object,
which is a data frame storing the parameter
estimates, similar in form to the
output of lavaan::parameterEstimates().
The output of
lavaan model fit functions, such
as lavaan::sem() and lavaan::cfa().
A string vector,
which should be the names of the
variables to be standardized.
Default is ".all.", indicating all
variables are to be standardized
(but see skip_categorical_x).
A string
vector, which should be the names
of the variables that should not be
standardized. This argument is useful
when most variables, except for a few,
are to be standardized. This argument
cannot be used with to_standardize
at the same time. Default is NULL,
and only to_standardize is used.
Logical.
If TRUE, the default, all
categorical predictors, defined as
variables with only two possible
values in the data analyzed, will
be skipped in standardization. This
overrides the argument
to_standardize. That is, a
categorical predictor will not be
standardized even if listed in
to_standardize, unless users set
this argument to FALSE.
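As a rough illustration of the rule described above, the sketch below flags variables that take exactly two distinct non-missing values. The helper has_two_values() and the data frame dat are hypothetical and are not part of the package; the package's own check may differ in detail.

```r
# Hypothetical helper, not part of the package: TRUE if a variable
# takes exactly two distinct non-missing values
has_two_values <- function(x) {
  length(unique(x[!is.na(x)])) == 2
}

# Made-up data for illustration only
dat <- data.frame(
  gender = c(0, 1, 1, 0, NA, 1),
  age    = c(23, 35, 41, 29, 52, 38),
  site   = c("a", "b", "a", "a", "b", "b")
)

# gender and site are flagged; age is not
vapply(dat, has_two_values, logical(1))
```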
The format of the
output. Not used because the format
of the printout is now controlled
by the print-method of the output
of this function. Kept for backward
compatibility.
String. If set to "none",
the default, standard errors will not
be computed for the standardized
solution. If set to "delta",
delta method will be used to compute
the standard errors. If set to
"bootstrap", then what it does
depends on whether boot_out is set.
If boot_out is set to an output of
manymome::do_boot(), its content
will be used. If boot_out is
NULL and bootstrap
estimates are available in object
(e.g., bootstrapping is requested
when fitting the model in lavaan),
then the stored bootstrap estimates
will be used. If not available,
bootstrapping will be conducted
using lavaan::bootstrapLavaan(),
with the arguments bootstrap,
parallel, ncpus, cl, and
iseed.
Logical. If TRUE and
std_se is not set to "none",
z statistics will be computed
using the standard errors from the method specified in
std_se. Default is TRUE.
Logical. If TRUE,
std_se is not set to "none",
and std_z is TRUE, p-values
will be computed using the method
specified in std_se. For
bootstrapping, the method proposed by
Asparouhov and Muthén (2021) is used.
Default is TRUE.
Logical. If TRUE and
std_se is not set to "none",
confidence intervals will be
computed using the method specified in
std_se. Default is TRUE.
The level of confidence of the confidence intervals. Default is .95. It will be used in the confidence intervals of both the unstandardized and standardized solution.
Logical. If TRUE,
progress bars will be displayed
for long processes.
If std_se is
"bootstrap" and this argument
is set to an output of
manymome::do_boot(), its output
will be used in computing statistics
such as standard errors and
confidence intervals. This allows
users to use methods other than
bootstrapping when fitting the
model, while they can still request
bootstrapping for the standardized
solution.
If std_se is
"bootstrap" but bootstrapping is
not requested when fitting the model
and boot_out is not set,
lavaan::bootstrapLavaan() will be
called to do bootstrapping. This
argument is the number of bootstrap
samples to draw. Default is 100.
Should be set to 5000 or even 10000
for stable results.
Logical. If
std_se is "bootstrap" and this
argument is TRUE, the default,
the bootstrap estimates of the
standardized solution will be stored
in the attribute "boot_est". These
estimates can be used for
diagnosis of the bootstrapping. If
FALSE, then the bootstrap estimates
will not be stored.
If std_se is
"bootstrap" but bootstrapping is
not requested when fitting the model
and boot_out is not set,
lavaan::bootstrapLavaan() will be
called to do bootstrapping. This
argument is to be passed to
lavaan::bootstrapLavaan(). Default
is "no".
If std_se is
"bootstrap" but bootstrapping is
not requested when fitting the model
and boot_out is not set,
lavaan::bootstrapLavaan() will be
called to do bootstrapping. This
argument is to be passed to
lavaan::bootstrapLavaan(). Default
is parallel::detectCores(logical = FALSE) - 1.
Ignored if parallel is "no".
If std_se is
"bootstrap" but bootstrapping is
not requested when fitting the model
and boot_out is not set,
lavaan::bootstrapLavaan() will be
called to do bootstrapping. This
argument is to be passed to
lavaan::bootstrapLavaan(). Default
is NULL.
Ignored if parallel is "no".
If std_se is
"bootstrap" but bootstrapping is
not requested when fitting the model
and boot_out is not set,
lavaan::bootstrapLavaan() will be
called to do bootstrapping. This
argument is to be passed to
lavaan::bootstrapLavaan() to set
the seed for the random resampling.
Default
is NULL. Should be set to an integer
for reproducible results.
Ignored if parallel is "no".
Logical.
If it is certain that a model does
not have product terms, setting this
to FALSE will skip the search, which
is time-consuming for models with
many paths and/or many variables.
Default is TRUE, and the function
will automatically identify product
terms, if any.
Logical.
If TRUE, it will check whether
variables involved in a product term
have been mean-centered. If not,
an error will be raised.
Logical.
If TRUE, intercepts of y variables
will also be computed based on
the variables standardized.
Optional arguments to be
passed to lavaan::parameterEstimates(),
which will be used to generate the
output.
The method used to compute delta-method standard errors. For internal use and should not be changed.
The internal method used to compute standardized solution. For internal use and should not be changed.
Shu Fai Cheung https://orcid.org/0000-0002-9871-9448
This function lets users select which variables are to be standardized when computing the standardized solution. It has the following features:
It automatically skips predictors that have only two unique values, assuming that they are dummy variables.
It does not standardize a product term directly, because doing so is incorrect. Instead, it computes the product term from its component variables after they have been standardized.
It can be used to generate bootstrap confidence intervals for the standardized solution (Falk, 2018). A bootstrap confidence interval is better than standardizing the variables before fitting a model because it correctly takes into account the sampling variance of the standard deviations. It is also better than a delta-method confidence interval because it takes into account the usually asymmetric sampling distribution of parameters after standardization, such as standardized loadings and correlations.
For comparison, it can also report delta-method standard errors and confidence intervals if requested.
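The treatment of product terms in the list above can be illustrated with a small base R sketch. It uses simulated data and lm() rather than lavaan, and all variable names are made up for the illustration; it is not the package's internal code. It shows that z-scoring the product is not the same as multiplying the standardized components, and that the coefficient for the latter can be obtained by rescaling the unstandardized coefficient, so no refitting is needed.

```r
# Simulated data for illustration only
set.seed(1234)
n <- 200
x <- rnorm(n, sd = 2)
w <- 0.6 * x + rnorm(n, sd = 2)
x <- x - mean(x)                 # mean-center the predictor
w <- w - mean(w)                 # mean-center the moderator
y <- 0.5 * x + 0.3 * w + 0.2 * x * w + rnorm(n)

# Incorrect: z-score the product term as if it were an ordinary variable
xw_wrong <- as.vector(scale(x * w))

# Intended: standardize the components first, then form the product
zx <- as.vector(scale(x))
zw <- as.vector(scale(w))
zy <- as.vector(scale(y))
xw_right <- zx * zw

# The two versions are different variables
max(abs(xw_wrong - xw_right))

# Coefficient of the product of standardized components, obtained
# (a) by refitting with standardized variables and
# (b) by rescaling the unstandardized coefficient
fit_raw <- lm(y ~ x + w + x:w)
fit_std <- lm(zy ~ zx + zw + zx:zw)
b_refit    <- unname(coef(fit_std)["zx:zw"])
b_rescaled <- unname(coef(fit_raw)["x:w"]) * sd(x) * sd(w) / sd(y)
all.equal(b_refit, b_rescaled)
```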
In most SEM programs, users have limited control over which variables to standardize when requesting the standardized solution. The solution may be uninterpretable or misleading in these conditions:
Dummy variables are standardized and their coefficients cannot be interpreted as the difference between two groups on the outcome variables.
Product terms (interaction terms) are standardized and they cannot be interpreted as the changes in the effects of focal variables when the moderators change (Cheung, Cheung, Lau, Hui, & Vong, 2022).
Variables with meaningful units can be more difficult to interpret when they are standardized (e.g., age).
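The first condition in the list above can be checked with a short base R sketch. The data are simulated and the variable names are made up; the sketch uses lm() for simplicity and is only meant to show why a dummy variable is usually better left unstandardized.

```r
# Simulated two-group data for illustration only
set.seed(5678)
n <- 300
group <- rep(c(0, 1), each = n / 2)   # dummy variable: 0 = control, 1 = treatment
y <- 10 + 2 * group + rnorm(n, sd = 4)
mean_diff <- mean(y[group == 1]) - mean(y[group == 0])

# Unstandardized: the slope is the raw mean difference between the groups
b_raw <- unname(coef(lm(y ~ group))["group"])
all.equal(b_raw, mean_diff)

# Only y standardized: the slope is the group difference in SD units of y
zy <- as.vector(scale(y))
b_y_only <- unname(coef(lm(zy ~ group))["group"])
all.equal(b_y_only, mean_diff / sd(y))

# Both standardized: the slope is also multiplied by sd(group),
# so it is no longer a difference between the two groups
zgroup <- as.vector(scale(group))
b_both <- unname(coef(lm(zy ~ zgroup))["zgroup"])
all.equal(b_both, mean_diff * sd(group) / sd(y))
```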
Moreover, the delta method is usually used to compute standard errors and confidence intervals for the standardized solution, which is suboptimal unless the sample size is large (Falk, 2018). For example, the covariance between two standardized variables is a correlation, and its sampling distribution is skewed unless its population value is zero. However, a delta-method confidence interval for the correlation is necessarily symmetric around the point estimate.
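The contrast between the two kinds of interval can be sketched in base R for a single correlation. This is an illustration under stated assumptions, not the package's procedure: the data are simulated, the symmetric interval uses the normal-theory approximation SE(r) = (1 - r^2) / sqrt(n), and the bootstrap interval is a simple percentile interval.

```r
# Simulated data with a population correlation of .80 (illustration only)
set.seed(2468)
n <- 50
x <- rnorm(n)
y <- 0.8 * x + sqrt(1 - 0.8^2) * rnorm(n)
r <- cor(x, y)

# Symmetric interval based on an approximate standard error
# (normal-theory approximation, assumed here for illustration)
se_delta <- (1 - r^2) / sqrt(n)
ci_delta <- r + c(-1, 1) * qnorm(0.975) * se_delta

# Nonparametric percentile bootstrap interval
B <- 2000
r_boot <- replicate(B, {
  i <- sample.int(n, replace = TRUE)   # resample cases with replacement
  cor(x[i], y[i])
})
ci_boot <- unname(quantile(r_boot, c(0.025, 0.975)))

# Distance from the point estimate to each limit:
# equal for the symmetric interval, free to differ for the bootstrap interval
rbind(delta = c(lower = r - ci_delta[1], upper = ci_delta[2] - r),
      boot  = c(lower = r - ci_boot[1],  upper = ci_boot[2] - r))
```

In real analyses, the number of bootstrap samples should be much larger, in line with the recommendation for the bootstrap argument.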
It supports only interaction terms between observed variables, and only two-way interactions.
It does not support multilevel models.
It only supports models fitted to raw data.
Asparouhov, T., & Muthén, B. (2021). Bootstrap p-value computation. Retrieved from https://www.statmodel.com/download/FAQ-Bootstrap%20-%20Pvalue.pdf
Cheung, S. F., Cheung, S.-H., Lau, E. Y. Y., Hui, C. H., & Vong, W. N. (2022). Improving an old way to measure moderation effect in standardized units. Health Psychology, 41(7), 502-505. https://doi.org/10.1037/hea0001188
Falk, C. F. (2018). Are robust standard errors the best approach for interval estimation with nonnormal data in structural equation modeling? Structural Equation Modeling: A Multidisciplinary Journal, 25(2), 244-266. https://doi.org/10.1080/10705511.2017.1367254
print.lav_betaselect() for its print method.
library(lavaan)
library(betaselectr)  # provides lav_betaselect() and data_test_medmod
# Need to mean-center iv and mod
data_test_medmod$iv <- data_test_medmod$iv - mean(data_test_medmod$iv)
data_test_medmod$mod <- data_test_medmod$mod - mean(data_test_medmod$mod)
mod <-
"
med ~ iv + mod + iv:mod
dv ~ med + iv
"
fit <- sem(mod,
           data_test_medmod,
           fixed.x = TRUE)
summary(fit)
fit_beta <- lav_betaselect(fit,
                           to_standardize = c("iv", "dv"))
fit_beta
print(fit_beta, standardized_only = FALSE)
# In real studies:
# - should set bootstrap to at least 5000
# - should set parallel to "snow" or "multicore"
fit_beta_boot <- lav_betaselect(fit,
                                to_standardize = c("iv", "dv"),
                                std_se = "bootstrap",
                                std_ci = TRUE,
                                bootstrap = 100,
                                iseed = 1234)
fit_beta_boot
# Print full results
print(fit_beta_boot,
      standardized_only = FALSE)