The blatent model syntax provides the specifications for a Bayesian latent variable model.
The model syntax, encapsulated in quotation marks, consists of up to three components:
Model Formulae: R model-like formulae specifying the model for all observed and latent variables in the model. See formula for R formula specifics. Blatent model formulae differ only in that more than one variable can be provided to the left of the ~.
In this section of syntax, there are no differences between latent and observed variables. Model statements are formed using the linear predictor for each variable. This means that, to specify a measurement model, the latent variables appear on the right-hand side of the ~.
Examples:
Measurement model where one latent variable (LV) predicts ten items (item1-item10, implying item1, item2, ..., item10):
item1-item10 ~ LV
One observed variable (X) predicting another observed variable (Y):
Y ~ X
Two items (itemA and itemB) measuring two latent variables (LV1, LV2) with a latent variable interaction:
itemA itemB ~ LV1 + LV2 + LV1:LV2
Two items (itemA and itemB) measuring two latent variables (LV1, LV2) with a latent variable interaction (R formula shorthand):
itemA itemB ~ LV1*LV2
Measurement model with seven items (item1-item7) measuring three latent variables (A1, A2, A3) from Chapter 9 of Rupp, Templin, and Henson (2010):
item1 ~ A1
item2 ~ A2
item3 ~ A3
item4 ~ A1 + A2 + A1:A2
item5 ~ A1 + A3 + A1:A3
item6 ~ A2 + A3 + A2:A3
item7 ~ A1 + A2 + A3 + A1:A2 + A1:A3 + A2:A3 + A1:A2:A3
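The interaction shorthand used in the examples above follows ordinary R formula rules, so base R can show what a `*` expression expands to. The sketch below uses only `terms()` from the stats package; it does not call blatent, and the helper name `expand_terms` is hypothetical, introduced here for illustration.

```r
# Base R only: terms() expands formula shorthand into its individual terms.
# This illustrates R's formula rules; it is not part of blatent.
expand_terms <- function(f) attr(terms(f), "term.labels")  # hypothetical helper

# Two latent variables with their interaction
two_way <- expand_terms(~ LV1 * LV2)

# Three latent variables with all two-way and three-way interactions
three_way <- expand_terms(~ A1 * A2 * A3)

print(two_way)
print(three_way)
```

The terms in `three_way` are the same seven terms written out for item7 above, so `A1*A2*A3` is the R formula shorthand for that right-hand side.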
Latent Variable Specifications: Latent variables are declared using an unevaluated function call to the latent function. Here, only the latent variables are declared, along with options for their estimation. See latent for more information.
A1 A2 A3 <- latent(unit = 'rows', distribution = 'mvbernoulli', structure = 'joint', type = 'ordinal', jointName = 'class')
Additionally, blatent currently uses a Bayesian Inference Network style of specifying the distributional associations between latent variables: model statements must be given to specify any associations between latent variables. By default, all latent variables are independent, which is rarely a tenable assumption. For instance, as shown in Hu and Templin (2020), the following syntax gives a model that is equivalent to the saturated model for a DCM:
# Structural Model
A1 ~ 1
A2 ~ A1
A3 ~ A1 + A2 + A1:A2
Observed Variable Specifications: Observed variables are declared using an unevaluated function call to the observed function. Here, only the observed variables are declared, along with options for their estimation. See observed for more information.
item1-item7 <- observed(distribution = 'bernoulli', link = 'probit')
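To make the range shorthand concrete, the following base R sketch lists the variable names that a range such as item1-item7 implies. It is not blatent's own parser: `expand_range` is a hypothetical helper, and it assumes both endpoints share a stem followed by an integer suffix.

```r
# Hypothetical helper, not part of blatent: expands "stem<m>-stem<n>" into
# the individual variable names stem<m>, ..., stem<n>.
expand_range <- function(x) {
  ends <- strsplit(x, "-", fixed = TRUE)[[1]]     # e.g. c("item1", "item7")
  stem <- sub("[0-9]+$", "", ends[1])             # drop trailing digits: "item"
  idx  <- as.integer(sub("^.*[^0-9]", "", ends))  # keep trailing digits: 1, 7
  paste0(stem, seq(idx[1], idx[2]))
}

item_names <- expand_range("item1-item7")
print(item_names)
```

Each name produced this way would need a matching column in the data and a model formula of its own, as in the seven-item measurement model above.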
Continuing with the syntax example from above, the full syntax for the model in Chapter 9 of Rupp, Templin, and Henson (2010) is:
modelText = "
# Measurement Model
item1 ~ A1
item2 ~ A2
item3 ~ A3
item4 ~ A1 + A2 + A1:A2
item5 ~ A1 + A3 + A1:A3
item6 ~ A2 + A3 + A2:A3
item7 ~ A1 + A2 + A3 + A1:A2 + A1:A3 + A2:A3 + A1:A2:A3
# Structural Model
A1 ~ 1
A2 ~ A1
A3 ~ A1 + A2 + A1:A2
# Latent Variable Specifications:
A1 A2 A3 <- latent(unit = 'rows', distribution = 'bernoulli', structure = 'univariate', type = 'ordinal')
# Observed Variable Specifications:
item1-item7 <- observed(distribution = 'bernoulli', link = 'probit')
"
Rupp, A. A., Templin, J., & Henson, R. A. (2010). Diagnostic Measurement: Theory, Methods, and Applications. New York: Guilford.
Hu, B., & Templin, J. (2020). Using diagnostic classification models to validate attribute hierarchies and evaluate model fit in Bayesian networks. Multivariate Behavioral Research. https://doi.org/10.1080/00273171.2019.1632165