This function estimates parameters of a Latent Class Analysis (LCA; Hagenaars & McCutcheon, 2002) model using either the Expectation-Maximization (EM) algorithm or Neural Network Estimation (NNE). It supports flexible initialization strategies and provides comprehensive model diagnostics.
LCA(
response,
L = 2,
par.ini = "random",
method = "EM",
is.sort = TRUE,
nrep = 20,
starts = 100,
maxiter.wa = 20,
vis = TRUE,
control.EM = NULL,
control.Mplus = NULL,
control.NNE = NULL
)

An object of class "LCA" containing:
params: List with estimated parameters:
par: \(L \times I \times K_{\max}\) array of conditional response probabilities per latent class.
P.Z: Vector of length \(L\) with latent class prior probabilities.
npar: Number of free parameters in the model. See get.npar.LCA.
Log.Lik: Log-likelihood of the final model. See get.Log.Lik.LCA.
AIC: Akaike Information Criterion value.
BIC: Bayesian Information Criterion value.
best_BIC: Best BIC value across nrep runs (if applicable).
P.Z.Xn: \(N \times L\) matrix of posterior class probabilities for each observation.
P.Z: Vector of length \(L\) containing the prior probabilities/structural parameters/proportions for each latent class.
Z: Vector of length \(N\) with MAP-classified latent class memberships.
probability: Same as params$par (redundant storage for convenience).
Log.Lik.history: Vector tracking the log-likelihood at each EM iteration.
Log.Lik.nrep: Vector of log-likelihoods from each replication run.
model: The optimal neural network model object (only for method = "NNE"). Contains the trained transformer architecture corresponding to best_loss. This object can be used for further predictions or model inspection.
arguments: A list containing all input arguments.
response: A numeric matrix of dimension \(N \times I\), where \(N\) is the number of individuals/participants/observations and \(I\) is the number of observed categorical items/variables. Each column must contain nominal-scale discrete responses (e.g., integers representing categories).
L: Integer specifying the number of latent classes (default: 2).
par.ini: Specification for parameter initialization. Options include:
"random": Completely random initialization (default).
"kmeans": Initializes parameters via K-means clustering on the observed data (McLachlan & Peel, 2004).
A user-supplied list (see the sketch below) containing:
par: An \(L \times I \times K_{\max}\) array of initial conditional probabilities for each latent class, item, and response category (where \(K_{\max}\) is the maximum number of categories across items).
P.Z: A numeric vector of length \(L\) specifying initial prior probabilities for the latent classes.
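A minimal sketch of such a user-supplied list (uniform starting values here are purely illustrative, not defaults):

# Illustrative starting values for L = 2 classes, I = 4 items,
# each with at most K.max = 3 response categories.
L <- 2; I <- 4; K.max <- 3
my.ini <- list(
  par = array(1 / K.max, dim = c(L, I, K.max)),  # uniform within each item-class pair
  P.Z = rep(1 / L, L)                            # equal initial class priors
)
# fit <- LCA(response, L = 2, par.ini = my.ini)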
method: Character string specifying the estimation algorithm:
"EM": Expectation-Maximization algorithm (default).
"NNE": Neural Network Estimation with a transformer architecture (experimental; uses a transformer plus simulated annealing, and is more reliable than both "EM" and "Mplus"). See install_python_dependencies.
"Mplus": Calls the external Mplus software for estimation. Uses Mplus defaults for optimization unless overridden by control.Mplus.
is.sort: A logical value. If TRUE (default), the latent classes are ordered in descending order of P.Z, and all other parameters are reordered accordingly (see the sketch below).
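Conceptually, the reordering is equivalent to the following sketch (operating on the fitted quantities described under Value):

ord <- order(P.Z, decreasing = TRUE)    # classes sorted by prior probability
P.Z <- P.Z[ord]
par <- par[ord, , , drop = FALSE]       # permute the class dimension of the L x I x K_max array
P.Z.Xn <- P.Z.Xn[, ord]                 # reorder posterior columns to match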
nrep: Integer controlling replication behavior:
If par.ini = "random", the number of random initializations.
If par.ini = "kmeans", the number of K-means runs for initialization.
For method = "Mplus", controls the number of random starts in Mplus via the STARTS option.
The best solution is selected by log-likelihood/BIC across replications.
Ignored when initial parameters are supplied by the user.
starts: Number of random initializations to explore during the warm-up phase (default: 100).
maxiter.wa: Maximum number of training iterations allowed per warm-up run (default: 20). After the warm-up completes, the top nrep solutions (by log-likelihood) are promoted to the final training phase.
vis: Logical. If TRUE (default), displays progress information during estimation.
control.EM: List of control parameters for the EM algorithm (see the usage sketch below):
maxiter: Maximum iterations (default: 2000).
tol: Convergence tolerance for the log-likelihood difference (default: 1e-4).
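For example, a stricter EM run might look like this (values are illustrative):

fit.em.strict <- LCA(response, L = 3, method = "EM",
                     control.EM = list(maxiter = 5000, tol = 1e-6))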
control.Mplus: List of control parameters for Mplus estimation (see the usage sketch below):
maxiter: Maximum iterations for Mplus optimization (default: 2000).
tol: Convergence tolerance for the log-likelihood difference (default: 1e-4).
files.path: A character string specifying the directory where Mplus writes its intermediate files (e.g., the .inp model input, .dat data file, .out output, and saved posterior probabilities). This argument is required; if NULL (the default), the function throws an error. The specified directory must exist and be writable; if it does not exist, the function attempts to create it recursively. A unique timestamped subdirectory (e.g., "Mplus_LCA_YYYY-MM-DD_HH-MM-SS") is created within this path to store all run-specific files and avoid naming conflicts. If it is an empty string (""), the timestamped subdirectory is created directly under R's current working directory (getwd()).
files.clean: Logical. If TRUE (default), all intermediate files and the temporary working directory created for the run are deleted upon successful completion or error exit (via on.exit()). If FALSE, all generated files are retained in files.path (or the auto-generated temporary directory) for inspection or debugging. Note: when files.path = NULL, even if files.clean = FALSE, the temporary directory may still be cleaned up by the system later; for guaranteed persistence, specify a custom files.path.
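For example, to keep all Mplus files in an explicit directory for later inspection (the path is illustrative):

fit.mplus <- LCA(response, L = 2, method = "Mplus",
                 control.Mplus = list(files.path = "Mplus_runs",
                                      files.clean = FALSE))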
control.NNE: List of control parameters for the NNE algorithm (see the usage sketch below):
hidden.layers: Integer vector specifying layer sizes in the fully-connected network (default: c(12, 12)).
activation.function: Activation function (e.g., "tanh"; default: "tanh").
d.model: Dimensionality of the transformer encoder embeddings (default: 8).
nhead: Number of attention heads in the transformer (default: 2).
dim.feedforward: Dimensionality of the transformer feedforward network (default: 16).
eps: Small constant for numerical stability (default: 1e-8).
lambda: Factor for slight regularization of all parameters (default: 1e-5).
initial.temperature: Initial temperature for simulated annealing (default: 1000).
cooling.rate: Cooling rate per iteration in simulated annealing (default: 0.5).
maxiter.sa: Maximum iterations for simulated annealing (default: 1000).
threshold.sa: Minimum temperature threshold for annealing (default: 1e-10).
maxiter: Maximum training epochs (default: 1000).
maxiter.early: Patience parameter for early stopping (default: 50).
maxcycle: Maximum cycles for optimization (default: 10).
lr: Learning rate, controlling the step size of neural network parameter updates (default: 0.025).
scheduler.patience: Patience for learning rate decay; if the loss does not improve for more than this many consecutive epochs, the learning rate is reduced (default: 10).
scheduler.factor: Learning rate decay factor; the new learning rate equals the old learning rate multiplied by scheduler.factor (default: 0.70).
plot.interval: Interval (in epochs) for plotting training diagnostics (default: 100).
device: Hardware device; either "CPU" (default) or "GPU". If a GPU is not available, computation automatically falls back to the CPU.
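For example, a smaller network with a gentler annealing schedule (values are illustrative):

fit.nne <- LCA(response, L = 2, method = "NNE",
               control.NNE = list(hidden.layers = c(8, 8),
                                  lr = 0.01,
                                  cooling.rate = 0.9,
                                  maxiter.sa = 500))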
When method = "EM", parameters are estimated via the Expectation-Maximization algorithm, which iterates between:
E-step: Compute posterior class probabilities given current parameters:
$$P(Z_n = l \mid \mathbf{X}_n) = \frac{\pi_l \prod_{i=1}^I P(X_{ni} = x_{ni} \mid Z_n=l)}{\sum_{k=1}^L \pi_k \prod_{i=1}^I P(X_{ni} = x_{ni} \mid Z_n=k)}$$
where \(x_{ni}\) is the standardized (0-based) response for person \(n\) on item \(i\) (see adjust.response).
M-step: Update parameters by maximizing expected complete-data log-likelihood:
Class probabilities: \(\pi_l^{\text{new}} = \frac{1}{N} \sum_{n=1}^N P(Z_n = l \mid \mathbf{X}_n)\)
Conditional probabilities: \(P(X_i = k \mid Z=l)^{\text{new}} = \frac{\sum_{n:x_{ni}=k} P(Z_n = l \mid \mathbf{X}_n)}{\sum_{n=1}^N P(Z_n = l \mid \mathbf{X}_n)}\)
Convergence: Stops when \(|\log\mathcal{L}^{(t)} - \log\mathcal{L}^{(t-1)}| < \texttt{tol}\) or maximum iterations reached.
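For intuition, the sketch below implements one E-step/M-step update directly from the formulas above. It is a simplified reimplementation, not the package's internal code, and assumes 0-based responses as produced by adjust.response.

# resp: N x I matrix of 0-based responses; par: L x I x K_max array of
# conditional probabilities; p.z: length-L vector of class priors.
em.step <- function(resp, par, p.z) {
  N <- nrow(resp); I <- ncol(resp); L <- length(p.z); K <- dim(par)[3]
  # E-step: log joint density of each observation under each class
  logpost <- matrix(log(p.z), N, L, byrow = TRUE)
  for (l in 1:L)
    for (i in 1:I)
      logpost[, l] <- logpost[, l] + log(par[l, i, resp[, i] + 1])
  post <- exp(logpost - apply(logpost, 1, max))  # stabilized by row maxima
  post <- post / rowSums(post)                   # P(Z_n = l | X_n)
  # M-step: closed-form updates of priors and conditional probabilities
  p.z.new <- colMeans(post)
  par.new <- par
  for (l in 1:L)
    for (i in 1:I)
      for (k in 1:K)
        par.new[l, i, k] <- sum(post[resp[, i] == k - 1, l]) / sum(post[, l])
  list(par = par.new, P.Z = p.z.new, P.Z.Xn = post)
}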
When method = "NNE", parameters are estimated using a hybrid neural network architecture
that combines feedforward layers with transformer-based attention mechanisms. This approach jointly
optimizes profile parameters and posterior probabilities through stochastic optimization enhanced
with simulated annealing. See install_python_dependencies. Key components include:
Architecture:
Observed categorical responses are converted to 0-based integer indices per item (not one-hot encoded). For example, original responses \([1, 2, 4]\) become \([0, 1, 2]\).
A fully-connected neural network with layer sizes specified by hidden.layers and activation
function activation.function processes the integer-indexed responses. This network outputs
unnormalized logits for posterior class membership (\(N \times L\) matrix).
A transformer encoder with nhead attention heads that learns latent class prior probabilities
\(\boldsymbol{\pi} = (\pi_1, \pi_2, \dots, \pi_L)\) directly from observed responses.
Input: response matrix (\(N \times I\)), where \(N\) = observations, \(I\) = categorical items.
Mechanism: Self-attention dynamically weighs variable importance during profile assignment, capturing complex multivariate interactions.
Output: Class prior vector \(\boldsymbol{\pi}\) computed as the mean of the posteriors: $$\pi_l = \frac{1}{N}\sum_{n=1}^N \text{attention}(\mathbf{X}_n)_l$$ This ensures probabilistic consistency with the mixture model framework.
Global conditional probability parameters (\(P(X_i = k \mid Z = l)\)) are stored as learnable
parameters par (an \(L \times I \times K_{\max}\) tensor). A masked softmax is applied
along categories to enforce:
Probabilities sum to 1 within each item-class pair
Non-existent categories (beyond item's actual max response) are masked to zero probability
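A minimal R sketch of such a masked softmax (the actual computation happens in the Python backend; this is only illustrative):

# logits: L x I x K_max array; mask[i, k] is TRUE if category k exists for item i.
masked.softmax <- function(logits, mask) {
  probs <- array(0, dim = dim(logits))
  for (l in seq_len(dim(logits)[1]))
    for (i in seq_len(dim(logits)[2])) {
      z <- exp(logits[l, i, ] - max(logits[l, i, mask[i, ]]))
      z[!mask[i, ]] <- 0           # non-existent categories get probability 0
      probs[l, i, ] <- z / sum(z)  # existing categories sum to 1
    }
  probs
}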
Optimization Strategy:
Hybrid Training Protocol: Alternates between:
Gradient-based phase: AdamW optimizer minimizes negative log-likelihood with weight decay regularization:
$$-\log \mathcal{L} + \lambda \|\boldsymbol{\theta}\|_2^2$$
where \(\lambda\) is controlled by lambda (default: 1e-5). Learning rate decays adaptively when
loss plateaus (controlled by scheduler.patience and scheduler.factor).
Simulated annealing phase: After gradient-based early stopping (maxiter.early), parameters
are perturbed with noise scaled by temperature:
$$\theta_{\text{new}} = \theta_{\text{current}} + \varepsilon, \qquad \varepsilon \sim \mathcal{N}\!\left(0,\; \theta_{\text{current}} \cdot \frac{T}{T_0}\right)$$
Temperature \(T\) decays geometrically (\(T \leftarrow T \times \text{cooling.rate}\)) from
initial.temperature until threshold.sa is reached. This escapes poor local minima.
Each full cycle (gradient descent + annealing) repeats up to maxcycle times.
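The sketch below illustrates the perturbation rule and geometric cooling schedule with a generic Metropolis acceptance step; loss.fn and theta are hypothetical stand-ins, and the package's actual acceptance criterion may differ.

Temp <- initial.temperature; iter <- 0
while (Temp > threshold.sa && iter < maxiter.sa) {
  # perturbation scaled by the current temperature, as in the formula above
  cand <- theta + rnorm(length(theta), mean = 0,
                        sd = abs(theta) * Temp / initial.temperature)
  delta <- loss.fn(cand) - loss.fn(theta)
  if (delta < 0 || runif(1) < exp(-delta / Temp)) theta <- cand
  Temp <- Temp * cooling.rate   # geometric cooling toward threshold.sa
  iter <- iter + 1
}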
Model Selection: Across nrep random restarts (using Dirichlet-distributed initializations
or K-means), the solution with lowest BIC is retained.
Diagnostics: Training loss, annealing points, and global best solution are plotted when vis=TRUE.
When method = "Mplus", estimation is delegated to external Mplus software.
The function automates the entire workflow:
Workflow:
Creates a timestamped working directory under control.Mplus$files.path (see above) to store:
Mplus input syntax (.inp)
Data file in Mplus format (.dat)
Posterior probabilities output (.dat)
Files are automatically deleted after estimation unless control.Mplus$files.clean = FALSE.
Constructs Mplus syntax with:
CLASSES = c1(L) specification for \(L\) latent classes
CATEGORICAL declaration for all indicator variables
ANALYSIS block with optimization controls:
TYPE = mixture: Standard mixture modeling setup
STARTS = starts nrep: Random starts and final-stage optimizations
STITERATIONS = maxiter.wa: Maximum iterations during the initial-stage starts
MITERATIONS = maxiter: Maximum EM iterations
CONVERGENCE = tol: Log-likelihood convergence tolerance
MODEL block with %OVERALL%
Calls Mplus via MplusAutomation::mplusModeler(), which:
Converts R data to Mplus-compatible format with automatic recoding
Invokes Mplus executable (requires valid license and system PATH configuration)
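For illustration, the kind of model object assembled internally can be sketched with MplusAutomation directly. The syntax below is a hand-written approximation of what the function generates, not its exact code; variable names V1-V4 assume a 4-column response matrix without column names.

library(MplusAutomation)
lca.syntax <- mplusObject(
  TITLE    = "LCA with 2 classes;",
  VARIABLE = "CATEGORICAL = V1-V4; CLASSES = c1(2);",
  ANALYSIS = "TYPE = MIXTURE; STARTS = 100 10;
              STITERATIONS = 20; MITERATIONS = 2000; CONVERGENCE = 0.0001;",
  MODEL    = "%OVERALL%",
  rdata    = as.data.frame(response)
)
# fit <- mplusModeler(lca.syntax, modelout = "lca.inp", run = 1)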
Hagenaars, J. A., & McCutcheon, A. L. (2002). Applied Latent Class Analysis. Cambridge University Press.
McLachlan, G. J., & Peel, D. (2004). Finite Mixture Models. Wiley. https://books.google.com.sg/books?id=c2_fAox0DQoC
# Example with simulated data
set.seed(123)
data.obj <- sim.LCA(N = 500, I = 4, L = 2, IQ = 0.9)
response <- data.obj$response
# Fit 2-class model with EM algorithm
fit.em <- LCA(response, L = 2, method = "EM", nrep = 10)
# Fit 2-class model using Mplus
# requires Mplus to be installed (valid license, on the system PATH)
# NOTE: 'files.path' in control.Mplus is REQUIRED; the function errors if it is not provided.
# Example creates a timestamped subfolder (e.g., "Mplus_LCA_YYYY-MM-DD_HH-MM-SS") under './'
# to store all temporary Mplus files (.inp, .dat, .out, etc.).
if (FALSE) {
fit.mplus <- LCA(response, L = 2, method = "Mplus", nrep = 3,
control.Mplus = list(files.path = ""))
}
# Fit 2-class model with neural network estimation
# need Python
if (FALSE) {
fit.nne <- LCA(response, L = 2, method = "NNE", nrep = 3)
}