Learn R Programming

Latent Variable Models: lava

A general implementation of Structural Equation Models with latent variables (MLE, 2SLS, and composite likelihood estimators) with both continuous, censored, and ordinal outcomes (Holst and Budtz-Joergensen (2013) <10.1007/s00180-012-0344-y>). Mixture latent variable models and non-linear latent variable models (Holst and Budtz-Joergensen (2020) <10.1093/biostatistics/kxy082>). The package also provides methods for graph exploration (d-separation, back-door criterion), simulation of general non-linear latent variable models, and estimation of influence functions for a broad range of statistical models.

Installation

install.packages("lava", dependencies=TRUE)
library("lava")
demo("lava")

For graphical capabilities the Rgraphviz package is needed (first install the BiocManager package)

# install.packages("BiocManager")
BiocManager::install("Rgraphviz")

or the igraph or visNetwork packages

install.packages("igraph")
install.packages("visNetwork")

The development version of lava may also be installed directly from github:

# install.packages("remotes")
remotes::install_github("kkholst/lava")

Citation

To cite that lava package please use one of the following references

Klaus K. Holst and Esben Budtz-Joergensen (2013). Linear Latent Variable Models: The lava-package. Computational Statistics 28 (4), pp 1385-1453. http://dx.doi.org/10.1007/s00180-012-0344-y

@article{lava,
  title = {Linear Latent Variable Models: The lava-package},
  author = {Klaus Kähler Holst and Esben Budtz-Jørgensen},
  year = {2013},
  volume = {28},
  number = {4},
  pages = {1385-1452},
  journal = {Computational Statistics},
  doi = {10.1007/s00180-012-0344-y}
}

Klaus K. Holst and Esben Budtz-Jørgensen (2020). A two-stage estimation procedure for non-linear structural equation models. Biostatistics 21 (4), pp 676-691. http://dx.doi.org/10.1093/biostatistics/kxy082

@article{lava_nlin,
  title = {A two-stage estimation procedure for non-linear structural equation models},
  author = {Klaus Kähler Holst and Esben Budtz-Jørgensen},
  journal = {Biostatistics},
  year = {2020},
  volume = {21},
  number = {4},
  pages = {676-691},
  doi = {10.1093/biostatistics/kxy082},
}

Examples

Structural Equation Model

Specify structural equation models with two factors

m <- lvm()
regression(m) <- y1 + y2 + y3 ~ eta1
regression(m) <- z1 + z2 + z3 ~ eta2
latent(m) <- ~ eta1 + eta2
regression(m) <- eta2 ~ eta1 + x
regression(m) <- eta1 ~ x

labels(m) <- c(eta1=expression(eta[1]), eta2=expression(eta[2]))
plot(m)

Simulation

d <- sim(m, 100, seed=1)

Estimation

e <- estimate(m, d)
e
#>                     Estimate Std. Error  Z-value   P-value
#> Measurements:                                             
#>    y2~eta1           0.95462    0.08083 11.80993    <1e-12
#>    y3~eta1           0.98476    0.08922 11.03722    <1e-12
#>     z2~eta2          0.97038    0.05368 18.07714    <1e-12
#>     z3~eta2          0.95608    0.05643 16.94182    <1e-12
#> Regressions:                                              
#>    eta1~x            1.24587    0.11486 10.84694    <1e-12
#>     eta2~eta1        0.95608    0.18008  5.30910 1.102e-07
#>     eta2~x           1.11495    0.25228  4.41951 9.893e-06
#> Intercepts:                                               
#>    y2               -0.13896    0.12458 -1.11537    0.2647
#>    y3               -0.07661    0.13869 -0.55241    0.5807
#>    eta1              0.15801    0.12780  1.23644    0.2163
#>    z2               -0.00441    0.14858 -0.02969    0.9763
#>    z3               -0.15900    0.15731 -1.01076    0.3121
#>    eta2             -0.14143    0.18380 -0.76949    0.4416
#> Residual Variances:                                       
#>    y1                0.69684    0.14858  4.69004          
#>    y2                0.89804    0.16630  5.40026          
#>    y3                1.22456    0.21182  5.78109          
#>    eta1              0.93620    0.19623  4.77084          
#>    z1                1.41422    0.26259  5.38570          
#>    z2                0.87569    0.19463  4.49934          
#>    z3                1.18155    0.22640  5.21883          
#>    eta2              1.24430    0.28992  4.29195

Model assessment

Assessing goodness-of-fit, here the linearity between eta2 and eta1 (requires the gof package)

# install.packages("gof", repos="https://kkholst.github.io/r_repo/")
library("gof")
set.seed(1)
g <- cumres(e, eta2 ~ eta1)
plot(g)

Non-linear measurement error model

Simulate non-linear model

m <- lvm(y1 + y2 + y3 ~ u, u ~ x)
transform(m,u2 ~ u) <- function(x) x^2
regression(m) <- z~u2+u

d <- sim(m,200,p=c("z"=-1, "z~u2"=-0.5), seed=1)

Stage 1:

m1 <- lvm(c(y1[0:s], y2[0:s], y3[0:s]) ~ 1*u, u ~ x)
latent(m1) <- ~ u
(e1 <- estimate(m1, d))
#>                     Estimate Std. Error  Z-value  P-value
#> Regressions:                                             
#>    u~x               1.06998    0.08208 13.03542   <1e-12
#> Intercepts:                                              
#>    u                -0.08871    0.08753 -1.01344   0.3108
#> Residual Variances:                                      
#>    y1                1.00054    0.07075 14.14214         
#>    u                 1.19873    0.15503  7.73233

Stage 2

pp <- function(mu,var,data,...) cbind(u=mu[,"u"], u2=mu[,"u"]^2+var["u","u"])
(e <- measurement.error(e1, z~1+x, data=d, predictfun=pp))
#>             Estimate Std.Err    2.5%   97.5%   P-value
#> (Intercept)  -1.1181 0.13795 -1.3885 -0.8477 5.273e-16
#> x            -0.0537 0.13213 -0.3127  0.2053 6.844e-01
#> u             1.0039 0.11504  0.7785  1.2294 2.609e-18
#> u2           -0.4718 0.05213 -0.5740 -0.3697 1.410e-19
f <- function(p) p[1]+p["u"]*u+p["u2"]*u^2
u <- seq(-1, 1, length.out=100)
plot(e, f, data=data.frame(u))

Simulation

Studying the small-sample properties of a mediation analysis

m <- lvm(y~x, c~1)
regression(m) <- y+x ~ z
eventTime(m) <- t~min(y=1, c=0)
transform(m,S~t+status) <- function(x) survival::Surv(x[,1],x[,2])
plot(m)

Simulate from model and estimate indirect effects

onerun <- function(...) {
    d <- sim(m, 100)
    m0 <- lvm(S~x+z, x~z)
    e <- estimate(m0, d, estimator="glm")
    vec(summary(effects(e, S~z))$coef[,1:2])
}
val <- sim(onerun, 100)
summary(val, estimate=1:4, se=5:8, short=TRUE)
#> 100 replications					Time: 3.667s
#> 
#>         Total.Estimate Direct.Estimate Indirect.Estimate S~x~z.Estimate
#> Mean           1.97292         0.96537           1.00755        1.00755
#> SD             0.16900         0.18782           0.15924        0.15924
#> SE             0.18665         0.18090           0.16431        0.16431
#> SE/SD          1.10446         0.96315           1.03183        1.03183
#>                                                                        
#> Min            1.47243         0.54497           0.54554        0.54554
#> 2.5%           1.63496         0.61228           0.64914        0.64914
#> 50%            1.95574         0.97154           0.99120        0.99120
#> 97.5%          2.27887         1.32443           1.27807        1.27807
#> Max            2.45746         1.49491           1.33446        1.33446
#>                                                                        
#> Missing        0.00000         0.00000           0.00000        0.00000

Add additional simulations and visualize results

val <- sim(val,500) ## Add 500 simulations
plot(val, estimate=c("Total.Estimate", "Indirect.Estimate"),
     true=c(2, 1), se=c("Total.Std.Err", "Indirect.Std.Err"),
     scatter.plot=TRUE)

Copy Link

Version

Install

install.packages('lava')

Monthly Downloads

144,733

Version

1.8.2

License

GPL-3

Maintainer

Klaus Holst

Last Published

October 30th, 2025

Functions in lava (1.8.2)

calcium

Longitudinal Bone Mineral Density Data
backdoor

Backdoor criterion
addvar

Add variable to (model) object
brisa

Simulated data
correlation

Generic method for extracting correlation coefficients of model object
bmidata

Data
Range.lvm

Define range constraints of parameters
Print

Generic print method
bootstrap

Generic bootstrap method
PD

Dose response calculation for binomial regression models
NR

Newton-Raphson method
colorbar

Add color-bar to plot
closed_testing

Closed testing procedure
covariance

Add covariance structure to Latent Variable Model
confpred

Conformal prediction
commutation

Finds the unique commutation matrix
equivalence

Identify candidates of equivalent models
compare

Statistical tests
cancel

Generic cancel method
confint.lvmfit

Calculate confidence limits for parameters
estimate.default

Estimation of functional of parameters
diagtest

Calculate diagnostic tests for 2x2 table
getMplus

Read Mplus output
dsep.lvm

Check d-separation criterion
estimate.array

Estimate parameters and influence function.
images

Organize several image calls (for visualizing categorical data)
curly

Adds curly brackets to plot
hubble

Hubble data
csplit

Split data into folds
children

Extract children or parent elements of object
click

Identify points on plot
intercept

Fix mean parameters in 'lvm'-object
constrain<-

Add non-linear constraints to latent variable model
contr

Create contrast matrix
gof

Extract model summaries and GOF statistics for model object
devcoords

Returns device-coordinates and plot-region
deprdiag

50 patients from Monash Medical Centre, Melbourne
ordinal<-

Define variables as ordinal
getSAS

Read SAS output
estimate.lvm

Estimation of parameters in a Latent Variable Model (lvm)
ksmooth2

Plot/estimate surface
%ni%

Matching operator (x not in y) oposed to the %in%-operator (x in y)
labels<-

Define labels of graph
lava-package

lava: Latent Variable Models
intervention.lvm

Define intervention
iid

Extract i.i.d. decomposition from model object
indoorenv

Data
hubble2

Hubble data
startvalues

For internal use
multinomial

Estimate probabilities in contingency table
parpos

Generic method for finding indeces of model parameters
ordreg

Univariate cumulative link regression models
partialcor

Calculate partial correlations
missingdata

Missing data example
modelsearch

Model searching
lava.options

Set global options for lava
pcor

Polychoric correlation
lvm

Initialize new latent variable model
mixture

Estimate mixture latent variable model.
predict.lvm

Prediction in structural equation models
predictlvm

Predict function for latent variable models
pdfconvert

Convert pdf to raster format
nsem

Example SEM data (nonlinear)
revdiag

Create/extract 'reverse'-diagonal matrix or off-diagonal elements
fplot

fplot
plot.estimate

Plot method for 'estimate' objects
%++%

Concatenation operator
confband

Add Confidence limits bar to plot
makemissing

Create random missing data
complik

Composite Likelihood for probit latent variable models
measurement.error

Two-stage (non-linear) measurement error
eventTime

Add an observed event time outcome to a latent variable model.
summary.sim

Summary method for 'sim' objects
trim

Trim string of (leading/trailing/all) white spaces
plot.lvm

Plot path diagram
rbind.Surv

Appending Surv objects
semdata

Example SEM data
scheffe

Calculate simultaneous confidence limits by Scheffe's method
regression<-

Add regression association to latent variable model
path

Extract pathways in model graph
twostageCV

Cross-validated two-stage estimator
twostage.lvmfit

Two-stage estimator (non-linear SEM)
twindata

Twin menarche data
plotConf

Plot regression lines
plot.sim

Plot method for simulation 'sim' objects
mvnmix

Estimate mixture latent variable model
sim.default

Monte Carlo simulation
spaghetti

Spaghetti plot
timedep

Time-dependent parameters
subset.lvm

Extract subset of latent variable model
tr

Trace operator
rotate2

Performs a rotation in the plane
nldata

Example data (nonlinear model)
serotonin

Serotonin data
twostage

Two-stage estimator
rmvar

Remove variables from (model) object.
sim.lvm

Simulate model
wkm

Weighted K-means
stack.estimate

Stack estimating equations
vec

vec operator
wait

Wait for user input (keyboard or mouse)
vars

Extract variable names from latent variable model
toformula

Converts strings to formula
wrapvec

Wrap vector
zibreg

Regression model for binomial data with unkown group of immortals
IC

Extract i.i.d. decomposition (influence function) from model object
Combine

Report estimates across different models
Graph

Extract graph
By

Apply a Function to a Data Frame Split by Factors
binomial.rd

Define constant risk difference or relative risk association for binary exposure
baptize

Label elements of object
Col

Generate a transparent RGB color
Model

Extract model
NA2x

Convert to/from NA
blockdiag

Combine matrices to block diagonal structure
Grep

Finds elements in vector or column-names in data.frame/matrix
bmd

Longitudinal Bone Mineral Density Data (Wide format)
bootstrap.lvm

Calculate bootstrap estimates of a lvm object
Expand

Create a Data Frame from All Combinations of Factors
Missing

Missing value generator