This function estimates parameters of a Latent Profile Analysis (LPA) model for continuous observed variables using one of three methods: Expectation-Maximization (EM) algorithm, Neural Network Estimation (NNE), or external Mplus software. It supports flexible covariance structures and initialization strategies.
LPA(
response,
L = 2,
par.ini = "random",
constraint = "VV",
method = "EM",
is.sort = TRUE,
nrep = 20,
starts = 100,
maxiter.wa = 20,
vis = TRUE,
control.EM = NULL,
control.Mplus = NULL,
control.NNE = NULL
)
An object of class "LPA" containing:
params: List with estimated profile parameters:
means: \(L \times I\) matrix of estimated mean vectors for each profile.
covs: \(I \times I \times L\) array of estimated covariance matrices for each profile.
P.Z: Vector of length \(L\) with profile prior probabilities.
npar: Number of free parameters in the model (depends on constraint).
Log.Lik: Log-likelihood of the final model.
AIC: Akaike Information Criterion value.
BIC: Bayesian Information Criterion value.
best_BIC: Best BIC value across nrep runs (if applicable).
P.Z.Xn: \(N \times L\) matrix of posterior profile probabilities for each observation.
P.Z: Vector of length \(L\) containing the prior probabilities/structural parameters/proportions for each latent class.
Z: Vector of length \(N\) with MAP-classified profile memberships.
Log.Lik.history: Vector tracking the log-likelihood at each EM iteration (only for method = "EM").
Log.Lik.nrep: Vector of log-likelihoods from each replication run.
model: The optimal model object:
For method = "NNE": the trained neural network model.
For method = "Mplus": the estimated Mplus model.
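For example, assuming the returned object behaves like a named list with the components above (the fit object here is hypothetical), the estimates and classifications can be inspected as follows:
fit <- LPA(response, L = 2)   # hypothetical fitted object
fit$params$means              # L x I matrix of profile means
fit$params$covs               # I x I x L array of profile covariance matrices
fit$params$BIC                # BIC of the selected model
fit$P.Z                       # prior profile probabilities
head(fit$P.Z.Xn)              # posterior probabilities for the first observations
table(fit$Z)                  # MAP profile memberships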
response: A numeric matrix of dimension \(N \times I\), where \(N\) is the number of individuals/participants/observations and \(I\) is the number of continuous observed items/variables. Missing values are not allowed.
Note that response must be standardized using scale or normalize before input.
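A minimal example of this standardization step using base R's scale() (normalize refers to the helper used in the Examples below):
response <- scale(response)   # center each column to mean 0 and rescale to unit variance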
L: Integer specifying the number of latent profiles (default: 2).
par.ini: Specification for parameter initialization. Options include:
"random": Random initialization of means and covariances (default).
"kmeans": Initializes parameters via K-means clustering on observed data (McLachlan & Peel, 2004).
A list containing exactly three elements:
means: An \(L \times I\) matrix of initial mean vectors for each profile.
covs: An \(I \times I \times L\) array of initial covariance matrices for each profile.
P.Z: A numeric vector of length \(L\) specifying initial prior probabilities for profiles.
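A minimal sketch of a user-supplied par.ini list for \(L = 2\) profiles and \(I = 3\) variables (the numeric values are purely illustrative):
ini <- list(
  means = matrix(c(-1, -1, -1,    # profile 1 initial means
                    1,  1,  1),   # profile 2 initial means
                nrow = 2, byrow = TRUE),
  covs  = array(diag(3), dim = c(3, 3, 2)),  # identity start for both profiles
  P.Z   = c(0.5, 0.5)                        # equal initial prior probabilities
)
fit <- LPA(response, L = 2, par.ini = ini)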
constraint: Character string specifying covariance structure constraints:
"VV"Varying variances and varying covariances across profiles (heterogeneous full covariance; Default).
"VE"Varying variances but equal correlations across profiles.
"EV"Equal variances but varying covariances across profiles.
"EE"Equal variances and equal covariances across profiles (homogeneous full covariance).
"E0"Equal variances across profiles, zero covariances (diagonal with shared variances).
"V0"Varying variances across profiles, zero covariances (diagonal with free variances).
list: Custom constraints. Each element is a 2-element integer vector specifying variables whose covariance parameters are constrained equal across all classes. The constraint applies to:
Variances: When both indices are identical (e.g., c(3,3) forces variance of variable 3 to be equal across classes).
Covariances: When indices differ (e.g., c(1,2) forces covariance between variables 1 and 2 to be equal across classes).
Constraints are symmetric (e.g., c(1,2) automatically constrains c(2,1)). All unconstrained parameters
vary freely across classes while maintaining positive definiteness.
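For example, the following custom constraint list (indices illustrative) forces the variance of variable 3 and the covariance between variables 1 and 2 to be equal across all profiles, while everything else varies freely:
cst <- list(c(3, 3), c(1, 2))
fit <- LPA(response, L = 2, constraint = cst)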
method: Character string specifying the estimation algorithm:
"EM": Expectation-Maximization algorithm (Default).
"NNE": Neural Network Estimation with transformer architecture (experimental; uses transformer + simulated
annealing, more reliable than both "EM" and "Mplus"). See install_python_dependencies.
"Mplus": Calls external Mplus software for estimation.
Uses Mplus defaults for optimization unless overridden by control.Mplus.
is.sort: A logical value. If TRUE (default), the latent classes are ordered in descending order of P.Z, and all other parameters are adjusted accordingly to match the reordered classes.
nrep: Integer controlling replication behavior:
If par.ini = "random", number of random initializations.
If par.ini = "kmeans", number of K-means runs for initialization.
For method="Mplus", controls number of random starts in Mplus via STARTS option.
Best solution is selected by log-likelihood/BIC across replications.
Ignored for user-provided initial parameters.
starts: Number of random initializations to explore during the warm-up phase (default: 100).
maxiter.wa: Maximum number of training iterations allowed per warm-up run (default: 20). After the warm-up completes, the top nrep solutions (by log-likelihood) are promoted to the final training phase.
vis: Logical. If TRUE, displays progress information during estimation (default: TRUE).
control.EM: List of control parameters for the EM algorithm:
maxiter: Maximum iterations (default: 2000).
tol: Convergence tolerance for the log-likelihood difference (default: 1e-4).
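For instance, a stricter EM run could be requested as follows (values illustrative):
fit <- LPA(response, L = 3, control.EM = list(maxiter = 5000, tol = 1e-6))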
control.Mplus: List of control parameters for Mplus estimation:
maxiter: Maximum iterations for Mplus optimization (default: 2000).
tol: Convergence tolerance for the log-likelihood difference (default: 1e-4).
files.path: Character string specifying the directory path where Mplus will write its intermediate files (e.g., the .inp model input, .dat data file, .out output, and saved posterior probabilities). This argument is required: if NULL (default), the function throws an error. The specified directory must exist and be writable; if it does not exist, the function attempts to create it recursively. A unique timestamped subdirectory (e.g., "Mplus_LPA_YYYY-MM-DD_HH-MM-SS") is created within this path to store all run-specific files and avoid naming conflicts.
files.clean: Logical. If TRUE (default), all intermediate files and the temporary working directory created for the run are deleted upon successful completion or error exit (via on.exit()). If FALSE, all generated files are retained in files.path (or the auto-generated temporary directory) for inspection or debugging. Note: when files.path = NULL, even if files.clean = FALSE, the temporary directory may still be cleaned up by the system later; for guaranteed persistence, specify a custom files.path.
control.NNE: List of control parameters for the NNE algorithm:
hidden.layers: Integer vector specifying the layer sizes of the fully-connected network (default: c(12, 12)).
activation.function: Activation function (e.g., "tanh"; default: "tanh").
d.model: Dimensionality of the transformer encoder embeddings (default: 8).
nhead: Number of attention heads in the transformer (default: 2).
dim.feedforward: Dimensionality of the transformer feedforward network (default: 16).
eps: Small constant for numerical stability (default: 1e-8).
lambda: Factor for slight regularization of all parameters (default: 1e-5).
initial.temperature: Initial temperature for simulated annealing (default: 1000).
cooling.rate: Cooling rate per iteration in simulated annealing (default: 0.5).
maxiter.sa: Maximum iterations for simulated annealing (default: 1000).
threshold.sa: Minimum temperature threshold for annealing (default: 1e-10).
maxiter: Maximum training epochs (default: 1000).
maxiter.early: Patience parameter for early stopping (default: 50).
maxcycle: Maximum cycles for optimization (default: 10).
lr: Learning rate controlling the step size of neural network parameter updates (default: 0.025).
scheduler.patience: Patience for learning rate decay; if the loss does not improve for more than this many consecutive epochs, the learning rate is reduced (default: 10).
scheduler.factor: Learning rate decay factor; the new learning rate equals the current learning rate multiplied by scheduler.factor (default: 0.70).
plot.interval: Interval (in epochs) for plotting training diagnostics (default: 100).
device: Hardware device; either "CPU" (default) or "GPU". If a GPU is not available, the function automatically falls back to the CPU.
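A usage sketch combining several of these controls (values illustrative; method = "NNE" requires the Python dependencies, see install_python_dependencies):
fit_nne <- LPA(response, L = 2, method = "NNE", nrep = 2,
               control.NNE = list(lr = 0.01, maxiter = 500,
                                  initial.temperature = 500, device = "CPU"))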
When method = "EM", parameter estimation uses the Expectation-Maximization (EM) algorithm to maximize the observed-data log-likelihood:
$$ \log \mathcal{L} = \sum_{n=1}^N \log \left[ \sum_{l=1}^L \pi_l \cdot \mathcal{N}(\mathbf{x}_n \mid \boldsymbol{\mu}_l, \boldsymbol{\Sigma}_l) \right] $$
The algorithm iterates between the following two steps until convergence (change in log-likelihood < tol or the maximum number of iterations is reached); a minimal R sketch of one full iteration is given after the edge-handling notes below:
E-step: Compute posterior class probabilities (responsibilities) for observation \(n\) and class \(l\): $$\tau_{nl} = \frac{\pi_l \cdot \mathcal{N}(\mathbf{x}_n \mid \boldsymbol{\mu}_l, \boldsymbol{\Sigma}_l)}{\sum_{k=1}^L \pi_k \cdot \mathcal{N}(\mathbf{x}_n \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)}$$ where \(\mathcal{N}(\cdot)\) is the multivariate normal density, \(\pi_l\) is the prior class probability, and \(\boldsymbol{\mu}_l\), \(\boldsymbol{\Sigma}_l\) are the current parameters. Numerical stability is ensured via the log-sum-exp trick.
M-step: Update parameters using the responsibilities \(\tau_{nl}\):
Class probabilities: \(\pi_l^{\text{new}} = \frac{1}{N}\sum_{n=1}^N \tau_{nl}\)
Class means: \(\boldsymbol{\mu}_l^{\text{new}} = \frac{\sum_{n=1}^N \tau_{nl} \mathbf{x}_n}{\sum_{n=1}^N \tau_{nl}}\)
Class covariances: Updated under constraints:
"VV"\(\boldsymbol{\Sigma}_l^{\text{new}} = \frac{\sum_{n=1}^N \tau_{nl} (\mathbf{x}_n - \boldsymbol{\mu}_l^{\text{new}}) (\mathbf{x}_n - \boldsymbol{\mu}_l^{\text{new}})^\top}{\sum_{n=1}^N \tau_{nl}}\)
"EE"Shared covariance: \(\boldsymbol{\Sigma}^{\text{new}} = \frac{\sum_{l=1}^L \sum_{n=1}^N \tau_{nl} (\mathbf{x}_n - \boldsymbol{\mu}_l^{\text{new}}) (\mathbf{x}_n - \boldsymbol{\mu}_l^{\text{new}})^\top}{\sum_{l=1}^L \sum_{n=1}^N \tau_{nl}}\)
"VE" (default) / "EV"Hybrid constraints (e.g., "VE": varying variances, equal correlations).
Off-diagonal elements use weighted averages across classes; diagonals retain class-specific values.
User-specified variances/covariances (e.g., list(c(1,2), c(2,2)), meaning the covariance between observed variables 1 and 2 is equal across latent classes, and the variance of observed variable 2 is equal across classes) are forced equal across classes via weighted averaging.
Covariance matrices are regularized to ensure positive definiteness:
Eigenvalues < jitter (1e-10) are replaced with jitter
Failed Cholesky decompositions trigger diagonal jittering or perturbation of non-constrained elements
Edge Handling:
Empty classes (\(\sum_n \tau_{nl} < 10^{-5}\)) are reinitialized by redistributing responsibilities.
Non-finite likelihoods trigger fallback to previous valid parameters or covariance perturbation.
Univariate cases (\(I=1\)) bypass Cholesky decomposition for direct variance updates.
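To make the E- and M-steps concrete, here is a minimal R sketch of one EM iteration under the "VV" constraint. It is an illustration only (not the package's internal implementation) and assumes the mvtnorm package for the multivariate normal density:
library(mvtnorm)

em_step_vv <- function(X, means, covs, P.Z) {
  N <- nrow(X); L <- length(P.Z)
  # E-step: per-class log densities combined via the log-sum-exp trick
  log_dens <- sapply(seq_len(L), function(l)
    log(P.Z[l]) + dmvnorm(X, mean = means[l, ], sigma = covs[, , l], log = TRUE))
  m   <- apply(log_dens, 1, max)
  lse <- m + log(rowSums(exp(log_dens - m)))  # log mixture density per observation
  tau <- exp(log_dens - lse)                  # responsibilities (N x L)
  # M-step: update priors, means, and class-specific covariances ("VV")
  P.Z.new   <- colMeans(tau)
  means.new <- t(sapply(seq_len(L), function(l) colSums(tau[, l] * X) / sum(tau[, l])))
  covs.new  <- array(0, dim = dim(covs))
  for (l in seq_len(L)) {
    Xc <- sweep(X, 2, means.new[l, ])         # center on the updated class mean
    covs.new[, , l] <- crossprod(Xc * tau[, l], Xc) / sum(tau[, l])
  }
  list(means = means.new, covs = covs.new, P.Z = P.Z.new, Log.Lik = sum(lse))
}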
When method = "NNE", parameters are estimated using a hybrid neural network architecture
combining fully-connected layers with transformer-based attention mechanisms. This approach jointly
optimizes profile parameters and posterior probabilities through stochastic optimization with
simulated annealing. See install_python_dependencies. Key components include:
Architecture:
Continuous observed variables \(\mathbf{x}_n \in \mathbb{R}^I\) are standardized (mean-centered, unit-variance) internally during training to improve numerical stability. No encoding is needed — raw values are fed directly.
A multi-layer perceptron with architecture defined by hidden.layers and activation.function
maps the continuous input vector into a latent space of dimension d.model. This layer learns non-linear
feature combinations predictive of latent profile membership.
A transformer encoder with nhead attention heads that learns latent class prior probabilities
\(\boldsymbol{\pi} = (\pi_1, \pi_2, \dots, \pi_L)\) directly from observed responses.
Input: response matrix (\(N \times I\)), where \(N\) = observations, \(I\) = continuous variables.
Mechanism: Self-attention dynamically weighs variable importance during profile assignment, capturing complex multivariate interactions.
Output: Class prior vector \(\boldsymbol{\pi}\) computed as the mean of the posteriors: $$\pi_l = \frac{1}{N}\sum_{n=1}^N \mathrm{attention}(\mathbf{x}_n)_l$$ where \(\mathrm{attention}(\mathbf{x}_n)_l\) is the posterior probability of class \(l\) for observation \(n\) produced by the attention network. This ensures probabilistic consistency with the mixture model framework.
Two separate projection heads branch from the transformer output:
Means Head: Linear projection to \(L \times I\) matrix \(\boldsymbol{\mu}_l\).
Covariance Head: Outputs lower-triangular elements of Cholesky factors \(\mathbf{L}_l\)
for each profile. Diagonal elements are passed through softplus to ensure positivity;
off-diagonals use tanh scaled by 1.2 to bound magnitude and promote stability.
The full covariance is reconstructed via \(\boldsymbol{\Sigma}_l = \mathbf{L}_l \mathbf{L}_l^\top\).
After reconstruction, covariance constraints (e.g., "EE", "V0", or custom lists) are applied by
averaging constrained elements across profiles and re-symmetrizing.
Optimization Strategy:
Hybrid Training Protocol: Alternates between:
Gradient-based phase: AdamW optimizer minimizes negative log-likelihood with weight decay regularization:
$$-\log \mathcal{L} + \lambda \|\boldsymbol{\theta}\|_2^2$$
where \(\lambda\) is controlled by lambda (default: 1e-5). Learning rate decays adaptively when
loss plateaus (controlled by scheduler.patience and scheduler.factor).
Simulated annealing phase: After gradient-based early stopping (maxiter.early), parameters
are perturbed with noise scaled by temperature:
$$\theta_{\text{new}} = \theta_{\text{current}} + \mathcal{N}(0, \theta_{\text{current}} \times \frac{T}{T_0})$$
Temperature \(T\) decays geometrically (\(T \leftarrow T \times \text{cooling.rate}\)) from initial.temperature until threshold.sa is reached, which helps escape poor local minima (see the numeric sketch after this list).
Each full cycle (gradient descent + annealing) repeats up to maxcycle times.
Model Selection: Across nrep random restarts (using Dirichlet-distributed initializations
or K-means), the solution with lowest BIC is retained.
Diagnostics: Training loss, annealing points, and global best solution are plotted when vis=TRUE.
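A small numeric sketch of the geometric cooling schedule described above, using the control.NNE defaults (initial.temperature = 1000, cooling.rate = 0.5, threshold.sa = 1e-10):
Temp <- 1000; cooling.rate <- 0.5; threshold.sa <- 1e-10
n.steps <- 0
while (Temp > threshold.sa) {
  Temp <- Temp * cooling.rate   # T <- T * cooling.rate
  n.steps <- n.steps + 1
}
n.steps   # 44 cooling steps; well below maxiter.sa = 1000, so the temperature threshold binds first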
Constraint Handling:
Covariance constraints (constraint) are enforced after activation via:
Shared Parameters: Variances/covariances marked for equality are replaced by their average across profiles.
Positive Definiteness: Non-positive definite matrices are corrected via eigenvalue clamping, diagonal jittering, or adaptive Cholesky decomposition.
Custom constraints (e.g., list(c(1,2), c(3,3))) force equality of specific covariance elements across profiles, with symmetry (\(\sigma_{12} = \sigma_{21}\)) automatically enforced.
When method = "Mplus", estimation is delegated to external Mplus software.
The function automates the entire workflow:
Workflow:
Creates a working directory (a timestamped subdirectory under files.path; see control.Mplus) to store:
Mplus input syntax (.inp)
Data file in Mplus format (.dat)
Posterior probabilities output (.dat)
Files are automatically deleted after estimation unless control.Mplus$files.clean = FALSE.
Constructs Mplus syntax with:
CLASSES = c1(L) specification for \(L\) latent classes
ANALYSIS block with optimization controls:
TYPE = mixture: Standard mixture modeling setup.
STARTS = starts nrep: Number of random starts and final-stage optimizations.
STITERATIONS = maxiter.wa: Maximum iterations during the initial-stage starts.
MITERATIONS = maxiter: Maximum EM iterations.
CONVERGENCE = tol: Log-likelihood convergence tolerance.
MODEL block reflecting the specified constraint structure
Calls Mplus via MplusAutomation::mplusModeler()
Writes the data to disk in Mplus-compatible format, invokes the Mplus executable (requires a valid license), and captures the convergence status and output.
McLachlan, G. J., & Peel, D. (2004). Finite Mixture Models. Wiley. https://books.google.com.sg/books?id=c2_fAox0DQoC
# Simulate continuous data with 3 observed variables for 2 profiles
set.seed(123)
data.obj <- sim.LPA(N = 500, I = 3, L = 2, constraint = "VV")
response <- data.obj$response
## It is strongly recommended to perform the following
## standardization to obtain more stable results.
## Standardization is not performed here in order to
## compare estimated values with true values.
# response <- normalize(response)
# Fit 2-profile model with VV constraint (default)
fit_vv <- LPA(response, L = 2, constraint = "VV")
# Fit 2-profile model with E0 constraint using neural network estimation
# need Python
if (FALSE) {
fit_e0_nne <- LPA(response, L = 2, constraint = "E0", method = "NNE", nrep = 2)
}
# Fit 2-profile model using Mplus
# Requires Mplus to be installed and available in system PATH.
# NOTE: 'files.path' in control.Mplus is REQUIRED — the function will
# throw an error if not provided.
# This example creates a timestamped subdirectory
# (e.g., "Mplus_LPA_YYYY-MM-DD_HH-MM-SS") under './inst'
# to store all temporary Mplus files (.inp, .dat, .out, etc.).
# The 'inst' directory will be created if it does not exist.
# Setting files.clean=FALSE means temporary files will be preserved after execution.
if (FALSE) {
fit_mplus <- LPA(response, L = 2, method = "Mplus", constraint = list(c(1, 2), c(3, 3)),
control.Mplus = list(files.path = "inst", files.clean=FALSE))
}