lavaan: Fit a Latent Variable Model

Description

Fit a latent variable model.

Usage

lavaan(model = NULL, data = NULL, model.type = "sem",
       meanstructure = "default", int.ov.free = FALSE, int.lv.free = FALSE,
       conditional.x = "default", fixed.x = "default", orthogonal = FALSE,
       std.lv = FALSE, parameterization = "default",
       auto.fix.first = FALSE, auto.fix.single = FALSE, auto.var = FALSE,
       auto.cov.lv.x = FALSE, auto.cov.y = FALSE, auto.th = FALSE,
       auto.delta = FALSE, std.ov = FALSE, missing = "default",
       ordered = NULL, sample.cov = NULL, sample.cov.rescale = "default",
       sample.mean = NULL, sample.nobs = NULL, ridge = 1e-05,
       group = NULL, group.label = NULL, group.equal = "",
       group.partial = "", group.w.free = FALSE, cluster = NULL,
       constraints = "", estimator = "default", likelihood = "default",
       link = "default", information = "default", se = "default",
       test = "default", bootstrap = 1000L, mimic = "default",
       representation = "default", do.fit = TRUE, control = list(),
       WLS.V = NULL, NACOV = NULL, zero.add = "default",
       zero.keep.margins = "default", zero.cell.warn = TRUE,
       start = "default", slotOptions = NULL, slotParTable = NULL,
       slotSampleStats = NULL, slotData = NULL, slotModel = NULL,
       slotCache = NULL, check = c("start", "post"), verbose = FALSE,
       warn = TRUE, debug = FALSE)

Arguments

model

A description of the user-specified model. Typically, the model is described using the lavaan model syntax. See model.syntax for more information. Alternatively, a parameter table (e.g., the output of the lavaanify() function) is also accepted.

data

An optional data frame containing the observed variables used in the model. If some variables are declared as ordered factors, lavaan will treat them as ordinal variables.

model.type

Set the model type: possible values are "cfa", "sem" or "growth". This may affect how starting values are computed, and may be used to alter the terminology used in the summary output, or the layout of path diagrams that are based on a fitted lavaan object.

meanstructure

If TRUE, the means of the observed variables enter the model. If "default", the value is set based on the user-specified model, and/or the values of other arguments.

int.ov.free

If FALSE, the intercepts of the observed variables are fixed to zero.

int.lv.free

If FALSE, the intercepts of the latent variables are fixed to zero.

conditional.x

If TRUE, the model is set up conditional on the exogenous 'x' covariates; the model-implied sample statistics only include the non-x variables. If FALSE, the exogenous 'x' variables are modeled jointly with the other variables, and the model-implied statistics reflect both sets of variables. If "default", the value is set depending on the estimator, and whether or not the model involves categorical endogenous variables.

fixed.x

If TRUE, the exogenous 'x' covariates are considered fixed variables and the means, variances and covariances of these variables are fixed to their sample values. If FALSE, they are considered random, and the means, variances and covariances are free parameters. If "default", the value is set depending on the mimic option.

orthogonal

If TRUE, the exogenous latent variables are assumed to be uncorrelated.

std.lv

If TRUE, the metric of each latent variable is determined by fixing its (residual) variance to 1.0. If FALSE, the metric of each latent variable is determined by fixing the factor loading of its first indicator to 1.0.
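The two scaling conventions can be compared directly. The sketch below is illustrative only: it assumes the lavaan package and its bundled HolzingerSwineford1939 data, and the object names fit.marker and fit.stdlv are made up for this example. Because both scalings are reparameterizations of the same model, the chi-square statistics should agree up to optimizer tolerance.

```r
library(lavaan)

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

# Marker-variable scaling: first loading of each factor fixed to 1.0
fit.marker <- lavaan(HS.model, data = HolzingerSwineford1939,
                     auto.fix.first = TRUE, auto.var = TRUE,
                     auto.cov.lv.x = TRUE)

# Unit-variance scaling: latent variances fixed to 1.0, all loadings free
fit.stdlv <- lavaan(HS.model, data = HolzingerSwineford1939,
                    std.lv = TRUE, auto.var = TRUE,
                    auto.cov.lv.x = TRUE)

# Same model fit, different metric for the latent variables
chisq.marker <- fitMeasures(fit.marker, "chisq")
chisq.stdlv  <- fitMeasures(fit.stdlv, "chisq")
print(c(marker = unname(chisq.marker), std.lv = unname(chisq.stdlv)))
```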

parameterization

Currently only used if data is categorical. If "delta", the delta parameterization is used. If "theta", the theta parameterization is used.

auto.fix.first

If TRUE, the factor loading of the first indicator is set to 1.0 for every latent variable.

auto.fix.single

If TRUE, the residual variance (if included) of an observed indicator is set to zero if it is the only indicator of a latent variable.

auto.var

If TRUE, the residual variances and the variances of exogenous latent variables are included in the model and set free.

auto.cov.lv.x

If TRUE, the covariances of exogenous latent variables are included in the model and set free.

auto.cov.y

If TRUE, the covariances of dependent variables (both observed and latent) are included in the model and set free.

auto.th

If TRUE, thresholds for limited dependent variables are included in the model and set free.

auto.delta

If TRUE, response scaling parameters for limited dependent variables are included in the model and set free.

std.ov

If TRUE, all observed variables are standardized before entering the analysis.

missing

If "listwise", cases with missing values are removed listwise from the data frame before analysis. If "direct" or "ml" or "fiml" and the estimator is maximum likelihood, Full Information Maximum Likelihood (FIML) estimation is applied to all available data in the data frame. This is only valid if the data are missing completely at random (MCAR) or missing at random (MAR). If "default", the value is set depending on the estimator and the mimic option.
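As a rough illustration of the difference between the two settings, the sketch below deletes some values at random (so the MCAR assumption holds by construction) and fits the same model with listwise deletion and with FIML. The data set HS.miss and the object names are hypothetical; meanstructure and int.ov.free are set explicitly here because FIML works with the observed means. lavInspect() is assumed to be available, as it is in recent lavaan versions.

```r
library(lavaan)

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

# Hypothetical illustration: remove 30 values at random from two indicators
set.seed(123)
HS.miss <- HolzingerSwineford1939
HS.miss$x1[sample(nrow(HS.miss), 30)] <- NA
HS.miss$x5[sample(nrow(HS.miss), 30)] <- NA

# Listwise deletion: incomplete cases are dropped before the analysis
fit.listwise <- lavaan(HS.model, data = HS.miss, missing = "listwise",
                       meanstructure = TRUE, int.ov.free = TRUE,
                       auto.var = TRUE, auto.fix.first = TRUE,
                       auto.cov.lv.x = TRUE)

# FIML: every case contributes whatever values it has
fit.fiml <- lavaan(HS.model, data = HS.miss, missing = "ml",
                   meanstructure = TRUE, int.ov.free = TRUE,
                   auto.var = TRUE, auto.fix.first = TRUE,
                   auto.cov.lv.x = TRUE)

# Number of cases actually used by each approach
n.listwise <- lavInspect(fit.listwise, "nobs")
n.fiml     <- lavInspect(fit.fiml, "nobs")
print(c(listwise = n.listwise, fiml = n.fiml))
```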

ordered

Character vector. Only used if the data is in a data.frame. Treat these variables as ordered (ordinal) variables, if they are endogenous in the model. Importantly, all other variables will be treated as numeric (unless they are declared as ordered in the original data.frame).
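A minimal sketch of the ordered argument follows. The median split used to create binary items is purely for illustration (the HS.bin data frame is hypothetical, and dichotomizing continuous scores is not a recommendation). Since lavaan() does not add thresholds or scaling parameters by itself, auto.th and auto.delta are switched on explicitly.

```r
library(lavaan)

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

# Hypothetical binary items: 1 if above the item median, 0 otherwise
items <- paste0("x", 1:9)
HS.bin <- HolzingerSwineford1939[, items]
HS.bin[] <- lapply(HS.bin, function(x) as.integer(x > median(x)))

# Declare all nine indicators as ordinal via 'ordered'
fit.ord <- lavaan(HS.model, data = HS.bin, ordered = items,
                  auto.var = TRUE, auto.fix.first = TRUE,
                  auto.cov.lv.x = TRUE,
                  auto.th = TRUE, auto.delta = TRUE)

# Each binary item contributes one threshold (operator "|")
pe <- parameterEstimates(fit.ord)
thresholds <- pe[pe$op == "|", ]
print(thresholds[, c("lhs", "op", "rhs", "est")])
```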

sample.cov

Numeric matrix. A sample variance-covariance matrix. The rownames and/or colnames must contain the observed variable names. For a multiple group analysis, a list with a variance-covariance matrix for each group. Note that if maximum likelihood estimation is used and likelihood="normal", the user provided covariance matrix is internally rescaled by multiplying it with a factor (N-1)/N, to ensure that the covariance matrix has been divided by N. This can be turned off by setting the sample.cov.rescale argument to FALSE.

sample.cov.rescale

If TRUE, the sample covariance matrix provided by the user is internally rescaled by multiplying it with a factor (N-1)/N. If "default", the value is set depending on the estimator and the likelihood option: it is set to TRUE if maximum likelihood estimation is used and likelihood="normal", and FALSE otherwise.
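The rescaling itself is simple arithmetic and can be checked in base R without fitting a model. In this sketch (simulated data, hypothetical object names), cov() divides by N-1, and multiplying by (N-1)/N gives the covariance matrix divided by N.

```r
# Simulated data, purely for illustration
set.seed(1)
X <- matrix(rnorm(50 * 3), nrow = 50, ncol = 3)
N <- nrow(X)

# cov() uses the N-1 denominator
S.unbiased <- cov(X)

# Rescale by (N-1)/N, as described for sample.cov.rescale = TRUE
S.rescaled <- S.unbiased * (N - 1) / N

# Direct computation with the N denominator, for comparison
Xc <- scale(X, center = TRUE, scale = FALSE)
S.direct <- crossprod(Xc) / N

same <- isTRUE(all.equal(S.rescaled, S.direct, check.attributes = FALSE))
print(same)
```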

sample.mean

A sample mean vector. For a multiple group analysis, a list with a mean vector for each group.

sample.nobs

Number of observations if the full data frame is missing and only sample moments are given. For a multiple group analysis, a list or a vector with the number of observations for each group.
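When only summary statistics are available, sample.cov and sample.nobs replace the data argument. The sketch below computes the covariance matrix from the bundled HolzingerSwineford1939 data only to have something to pass in; with default settings the fit should match the raw-data fit. Object names are hypothetical.

```r
library(lavaan)

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

# Summary statistics; row and column names carry the variable names
vars <- paste0("x", 1:9)
S <- cov(HolzingerSwineford1939[, vars])
N <- nrow(HolzingerSwineford1939)

# Fit from the covariance matrix and the number of observations
fit.cov <- lavaan(HS.model, sample.cov = S, sample.nobs = N,
                  auto.var = TRUE, auto.fix.first = TRUE,
                  auto.cov.lv.x = TRUE)

# Fit from the raw data, for comparison
fit.raw <- lavaan(HS.model, data = HolzingerSwineford1939,
                  auto.var = TRUE, auto.fix.first = TRUE,
                  auto.cov.lv.x = TRUE)

chisq.cov <- unname(fitMeasures(fit.cov, "chisq"))
chisq.raw <- unname(fitMeasures(fit.raw, "chisq"))
print(c(from.cov = chisq.cov, from.raw = chisq.raw))
```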

ridge

Numeric. Small constant used for ridging. Only used if the sample covariance matrix is not positive definite.

group

A variable name in the data frame defining the groups in a multiple group analysis.

group.label

A character vector. The user can specify which group (or factor) levels need to be selected from the grouping variable, and in which order. If missing, all grouping levels are selected, in the order in which they appear in the data.

group.equal

A vector of character strings. Only used in a multiple group analysis. Can be one or more of the following: "loadings", "intercepts", "means", "thresholds", "regressions", "residuals", "residual.covariances", "lv.variances" or "lv.covariances", specifying the pattern of equality constraints across multiple groups.

group.partial

A vector of character strings containing the labels of the parameters which should be free in all groups (thereby overriding the group.equal argument for some specific parameters).
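The group, group.equal and group.partial arguments are typically combined in a sequence of nested models. The sketch below assumes the school variable of the bundled HolzingerSwineford1939 data as grouping variable, and assumes that the parameter label for a loading takes the form "visual=~x2". The intercepts are freed explicitly because lavaan() fixes them to zero by default.

```r
library(lavaan)

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

# Helper for this example only: shared settings for all three fits
fit_groups <- function(...) {
  lavaan(HS.model, data = HolzingerSwineford1939, group = "school",
         meanstructure = TRUE, int.ov.free = TRUE,
         auto.var = TRUE, auto.fix.first = TRUE, auto.cov.lv.x = TRUE, ...)
}

# 1. No cross-group constraints
fit.config <- fit_groups()

# 2. Factor loadings constrained equal across groups
fit.metric <- fit_groups(group.equal = "loadings")

# 3. As 2, but the loading of x2 is released again in both groups
fit.partial <- fit_groups(group.equal = "loadings",
                          group.partial = "visual=~x2")

# Degrees of freedom reflect the number of equality constraints
df.all <- c(config  = unname(fitMeasures(fit.config, "df")),
            metric  = unname(fitMeasures(fit.metric, "df")),
            partial = unname(fitMeasures(fit.partial, "df")))
print(df.all)
```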

group.w.free

Logical. If TRUE, the group frequencies are considered to be free parameters in the model. In this case, a Poisson model is fitted to estimate the group frequencies. If FALSE (the default), the group frequencies are fixed to their observed values.

cluster

Not used yet.

constraints

Additional (in)equality constraints not yet included in the model syntax. See model.syntax for more information.
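A short sketch of the constraints argument: two loadings are given the (arbitrary) labels a and b in the model syntax, and the equality constraint is supplied separately as a string. The constrained model should gain one degree of freedom relative to the unconstrained one.

```r
library(lavaan)

# Labels 'a' and 'b' are arbitrary names chosen for this example
HS.model.lab <- ' visual  =~ x1 + a*x2 + b*x3
                  textual =~ x4 + x5 + x6
                  speed   =~ x7 + x8 + x9 '

# Unconstrained fit
fit.free <- lavaan(HS.model.lab, data = HolzingerSwineford1939,
                   auto.var = TRUE, auto.fix.first = TRUE,
                   auto.cov.lv.x = TRUE)

# Same model, with the two labeled loadings constrained to be equal
fit.eq <- lavaan(HS.model.lab, data = HolzingerSwineford1939,
                 auto.var = TRUE, auto.fix.first = TRUE,
                 auto.cov.lv.x = TRUE,
                 constraints = "a == b")

est.eq <- coef(fit.eq)[c("a", "b")]
print(est.eq)
print(c(df.free = unname(fitMeasures(fit.free, "df")),
        df.eq   = unname(fitMeasures(fit.eq, "df"))))
```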

estimator

The estimator to be used. Can be one of the following: "ML" for maximum likelihood, "GLS" for generalized least squares, "WLS" for weighted least squares (sometimes called ADF estimation), "ULS" for unweighted least squares and "DWLS" for diagonally weighted least squares. These are the main options that affect the estimation. For convenience, the "ML" option can be extended as "MLM", "MLMV", "MLMVS", "MLF", and "MLR". The estimation will still be plain "ML", but now with robust standard errors and a robust (scaled) test statistic. For "MLM", "MLMV", "MLMVS", classic robust standard errors are used (se="robust.sem"); for "MLF", standard errors are based on first-order derivatives (se="first.order"); for "MLR", 'Huber-White' robust standard errors are used (se="robust.huber.white"). In addition, "MLM" will compute a Satorra-Bentler scaled (mean adjusted) test statistic (test="satorra.bentler"), "MLMVS" will compute a mean and variance adjusted test statistic (Satterthwaite style) (test="mean.var.adjusted"), "MLMV" will compute a mean and variance adjusted test statistic (scaled and shifted) (test="scaled.shifted"), and "MLR" will compute a test statistic which is asymptotically equivalent to the Yuan-Bentler T2-star test statistic. Analogously, the estimators "WLSM" and "WLSMV" imply the "DWLS" estimator (not the "WLS" estimator) with robust standard errors and a mean or mean and variance adjusted test statistic. Estimators "ULSM" and "ULSMV" imply the "ULS" estimator with robust standard errors and a mean or mean and variance adjusted test statistic.
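To see that the robust variants leave the point estimates untouched, the sketch below fits the same model with "ML" and "MLR". The mean structure is requested explicitly in both calls so that the two models are specified identically; object names are hypothetical.

```r
library(lavaan)

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

# Plain maximum likelihood
fit.ml <- lavaan(HS.model, data = HolzingerSwineford1939,
                 estimator = "ML",
                 meanstructure = TRUE, int.ov.free = TRUE,
                 auto.var = TRUE, auto.fix.first = TRUE,
                 auto.cov.lv.x = TRUE)

# ML estimates with robust standard errors and a scaled test statistic
fit.mlr <- lavaan(HS.model, data = HolzingerSwineford1939,
                  estimator = "MLR",
                  meanstructure = TRUE, int.ov.free = TRUE,
                  auto.var = TRUE, auto.fix.first = TRUE,
                  auto.cov.lv.x = TRUE)

# Point estimates agree; only standard errors and the test statistic differ
max.diff <- max(abs(coef(fit.ml) - coef(fit.mlr)))
print(max.diff)
print(fitMeasures(fit.mlr, c("chisq", "chisq.scaled")))
```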

likelihood

Only relevant for ML estimation. If "wishart", the Wishart likelihood approach is used. In this approach, the covariance matrix has been divided by N-1, and both standard errors and test statistics are based on N-1. If "normal", the normal likelihood approach is used. Here, the covariance matrix has been divided by N, and both standard errors and test statistics are based on N. If "default", it depends on the mimic option: if mimic="lavaan" or mimic="Mplus", normal likelihood is used; otherwise, Wishart likelihood is used.

link

Currently only used if estimator is MML. If "logit", a logit link is used for binary and ordered observed variables. If "probit", a probit link is used. If "default", it is currently set to "probit" (but this may change).

information

If "expected", the expected information matrix is used (to compute the standard errors). If "observed", the observed information matrix is used. If "default", the value is set depending on the estimator and the mimic option.

se

If "standard", conventional standard errors are computed based on inverting the (expected or observed) information matrix. If "first.order", standard errors are computed based on first-order derivatives. If "robust.sem", conventional robust standard errors are computed. If "robust.huber.white", standard errors are computed based on the 'mlr' (aka pseudo ML, Huber-White) approach. If "robust", either "robust.sem" or "robust.huber.white" is used depending on the estimator, the mimic option, and whether the data are complete or not. If "boot" or "bootstrap", bootstrap standard errors are computed using standard bootstrapping (unless Bollen-Stine bootstrapping is requested for the test statistic; in this case bootstrap standard errors are computed using model-based bootstrapping). If "none", no standard errors are computed.

test

If "standard", a conventional chi-square test is computed. If "Satorra.Bentler", a Satorra-Bentler scaled test statistic is computed. If "Yuan.Bentler", a Yuan-Bentler scaled test statistic is computed. If "mean.var.adjusted" or "Satterthwaite", a mean and variance adjusted test statistic is computed. If "scaled.shifted", an alternative mean and variance adjusted test statistic is computed (as in Mplus version 6 or higher). If "boot" or "bootstrap" or "Bollen.Stine", the Bollen-Stine bootstrap is used to compute the bootstrap probability value of the test statistic. If "default", the value depends on the values of other arguments.

bootstrap

Number of bootstrap draws, if bootstrapping is used.

mimic

If "Mplus", an attempt is made to mimic the Mplus program. If "EQS", an attempt is made to mimic the EQS program. If "default", the value is (currently) set to "lavaan", which is very close to "Mplus".

representation

If "LISREL" the classical LISREL matrix representation is used to represent the model (using the all-y variant).

do.fit

If FALSE, the model is not fit, and the current starting values of the model parameters are preserved.
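Setting do.fit = FALSE is a convenient way to check how a model has been set up before estimating it. The sketch below, which assumes the parTable() extractor, builds the model without fitting it and counts the free parameters (6 loadings, 9 residual variances, 3 latent variances and 3 latent covariances).

```r
library(lavaan)

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

# Build the model, but skip the optimization step
fit.nofit <- lavaan(HS.model, data = HolzingerSwineford1939,
                    auto.var = TRUE, auto.fix.first = TRUE,
                    auto.cov.lv.x = TRUE,
                    do.fit = FALSE)

# Inspect the parameter table: rows with free > 0 are free parameters
pt <- parTable(fit.nofit)
n.free <- sum(pt$free > 0)
print(pt[pt$free > 0, c("lhs", "op", "rhs", "free")])
print(n.free)
```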

control

A list containing control parameters passed to the optimizer. By default, lavaan uses "nlminb". See the manpage of nlminb for an overview of the control parameters. A different optimizer can be chosen by setting the value of optim.method. For unconstrained optimization (the model syntax does not include any "==", ">" or "<" operators), the available options are "nlminb" (the default), "BFGS" and "L-BFGS-B". See the manpage of the optim function for the control parameters of the latter two options. For constrained optimization, the only available option is "nlminb.constr".
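A minimal sketch of passing control parameters to the default optimizer. The names iter.max, eval.max and rel.tol are documented control parameters of nlminb; the particular values are arbitrary and chosen only for illustration. lavInspect() is assumed to be available, as it is in recent lavaan versions.

```r
library(lavaan)

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

# Allow more iterations and a tighter tolerance than the nlminb defaults
fit.ctrl <- lavaan(HS.model, data = HolzingerSwineford1939,
                   auto.var = TRUE, auto.fix.first = TRUE,
                   auto.cov.lv.x = TRUE,
                   control = list(iter.max = 500,
                                  eval.max = 1000,
                                  rel.tol  = 1e-10))

# Check whether the optimizer reported convergence
converged <- lavInspect(fit.ctrl, "converged")
print(converged)
```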

WLS.V

A user provided weight matrix to be used by estimator "WLS"; if the estimator is "DWLS", only the diagonal of this matrix will be used. For a multiple group analysis, a list with a weight matrix for each group. The elements of the weight matrix should be in the following order (if all data is continuous): first the means (if a meanstructure is involved), then the lower triangular elements of the covariance matrix including the diagonal, ordered column by column. In the categorical case: first the thresholds (including the means for continuous variables), then the slopes (if any), the variances of continuous variables (if any), and finally the lower triangular elements of the correlation/covariance matrix excluding the diagonal, ordered column by column.
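The ordering for the continuous case can be reproduced in base R. In the sketch below (made-up numbers, hypothetical names), indexing with lower.tri() extracts the lower triangle in column-major order, which is the "column by column" ordering described above; the means come first. A weight matrix for three variables with a mean structure would then have 9 rows and 9 columns, one per element of this vector.

```r
# Made-up moments for three observed variables
v  <- c("x1", "x2", "x3")
mu <- c(x1 = 0.1, x2 = 0.2, x3 = 0.3)
S  <- matrix(c(4, 2, 1,
               2, 5, 3,
               1, 3, 6), nrow = 3, ncol = 3, dimnames = list(v, v))

# Lower triangle including the diagonal, stacked column by column
keep <- lower.tri(S, diag = TRUE)
vech <- S[keep]

# Label each element so the ordering is visible
idx <- which(keep, arr.ind = TRUE)
names(vech) <- paste0(v[idx[, "row"]], "~~", v[idx[, "col"]])

# Means first, then the (co)variances
stats <- c(mu, vech)
print(stats)
```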

NACOV

A user provided matrix containing the elements of (N times) the asymptotic variance-covariance matrix of the sample statistics. For a multiple group analysis, a list with an asymptotic variance-covariance matrix for each group. See the WLS.V argument for information about the order of the elements.

zero.add

A numeric vector containing two values. These values affect the calculation of polychoric correlations when some frequencies in the bivariate table are zero. The first value only applies to 2x2 tables. The second value applies to larger tables. This value is added to the zero frequency in the bivariate table. If "default", the value is set depending on the "mimic" option. By default, lavaan uses zero.add = c(0.5, 0.0).

zero.keep.margins

Logical. This argument only affects the computation of polychoric correlations for 2x2 tables with an empty cell, and where a value is added to the empty cell. If TRUE, the other values of the frequency table are adjusted so that all margins are unaffected. If "default", the value is set depending on the "mimic" option. The default is TRUE.

zero.cell.warn

Logical. Only used if some observed endogenous variables are categorical. If TRUE, give a warning if one or more cells of a bivariate frequency table are empty.

start

If it is a character string, the two options are currently "simple" and "Mplus". In the first case, all parameter values are set to zero, except the factor loadings (set to one), the variances of latent variables (set to 0.05), and the residual variances of observed variables (set to half the observed variance). If "Mplus", we use a similar scheme, but the factor loadings are estimated using the fabin3 estimator (tsls) per factor. If start is a fitted object of class lavaan, the estimated values of the corresponding parameters will be extracted. If it is a model list, for example the output of the parameterEstimates() function, the values of the est or start or ustart column (whichever is found first) will be extracted.
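A brief sketch of reusing a fitted object as starting values: the second fit starts from the estimates of the first and should arrive at the same solution. Object names are hypothetical.

```r
library(lavaan)

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

# First fit, using the "simple" starting value scheme
fit1 <- lavaan(HS.model, data = HolzingerSwineford1939,
               auto.var = TRUE, auto.fix.first = TRUE,
               auto.cov.lv.x = TRUE,
               start = "simple")

# Second fit, starting from the estimates of the first
fit2 <- lavaan(HS.model, data = HolzingerSwineford1939,
               auto.var = TRUE, auto.fix.first = TRUE,
               auto.cov.lv.x = TRUE,
               start = fit1)

# Both fits should agree on the parameter estimates
max.diff <- max(abs(coef(fit1) - coef(fit2)))
print(max.diff)
```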

slotOptions

Options slot from a fitted lavaan object. If provided, no new Options slot will be created by this call.

slotParTable

ParTable slot from a fitted lavaan object. If provided, no new ParTable slot will be created by this call.

slotSampleStats

SampleStats slot from a fitted lavaan object. If provided, no new SampleStats slot will be created by this call.

slotData

Data slot from a fitted lavaan object. If provided, no new Data slot will be created by this call.

slotModel

Model slot from a fitted lavaan object. If provided, no new Model slot will be created by this call.

slotCache

Cache slot from a fitted lavaan object. If provided, no new Cache slot will be created by this call.

check

Character vector. If check includes "start", the starting values are checked for possibly inconsistent values (for example values implying correlations larger than one); if check includes "post", a check is performed after (post) fitting, to check if the solution is admissible.

verbose

If TRUE, the function value is printed out during each iteration.

warn

If TRUE, some (possibly harmless) warnings are printed out during the iterations.

debug

If TRUE, debugging information is printed out.

Value

An object of class lavaan, for which several methods are available, including a summary method.
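A few of the commonly used extractor functions are sketched below; this is a small selection for illustration, not a complete list of the available methods.

```r
library(lavaan)

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

fit <- lavaan(HS.model, data = HolzingerSwineford1939,
              auto.var = TRUE, auto.fix.first = TRUE,
              auto.cov.lv.x = TRUE)

# The returned object is an S4 object of class "lavaan"
is.lavaan <- is(fit, "lavaan")

# Estimates of the free parameters as a named vector
est <- coef(fit)

# Selected fit measures
fm <- fitMeasures(fit, c("chisq", "df", "cfi", "rmsea"))

# Full table of estimates with standard errors
pe <- parameterEstimates(fit)

print(is.lavaan)
print(fm)
print(head(pe))
```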

References

Yves Rosseel (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2), 1-36. URL http://www.jstatsoft.org/v48/i02/.

Examples


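library(lavaan)

# The Holzinger and Swineford (1939) example: a three-factor CFA
HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

# The auto.* arguments of lavaan() default to FALSE, so the (residual)
# variances, the marker-variable scaling and the factor covariances
# are requested explicitly
fit <- lavaan(HS.model, data = HolzingerSwineford1939,
              auto.var = TRUE, auto.fix.first = TRUE,
              auto.cov.lv.x = TRUE)
summary(fit, fit.measures = TRUE)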
