LRT.test.Bootstrap: Bootstrap Likelihood Ratio Test

Description

Conducts a bootstrap likelihood ratio test to compare the fit of two nested models. This test evaluates whether a model with more parameters provides a significantly better fit than a model with fewer parameters by approximating the null distribution through parametric bootstrapping.

Usage

LRT.test.Bootstrap(object1, object2, n.Bootstrap = 100, vis = TRUE)

Value

An object of class "htest" containing:

statistic: Observed likelihood ratio test statistic
parameter: Degrees of freedom (reported as NA because the p-value is bootstrap-derived rather than based on a reference distribution)
p.value: Bootstrap p-value
method: Name of the test ("Bootstrap Likelihood Ratio Test")
data.name: Model comparison description
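
Because the return value is a standard "htest" list, its components can be read with `$` and it prints with the usual stats method. The sketch below builds a stand-in object by hand with hypothetical numbers (12.34 and 0.03 are made up, as is the data.name string) purely to show how the fields are accessed; it does not call LRT.test.Bootstrap.

```r
# Stand-in "htest" object with HYPOTHETICAL values, for illustration only.
# A real call to LRT.test.Bootstrap would return the same components.
res <- structure(
  list(
    statistic = c(LRT = 12.34),               # observed LRT statistic (made up)
    parameter = c(df = NA_real_),             # degrees of freedom, reported as NA
    p.value   = 0.03,                         # bootstrap p-value (made up)
    method    = "Bootstrap Likelihood Ratio Test",
    data.name = "object1 vs. object2"         # hypothetical description
  ),
  class = "htest"
)

res$statistic   # extract the observed statistic
res$p.value     # extract the bootstrap p-value
print(res)      # formatted summary via the stats print method for "htest"
```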

Arguments

object1: Fitted model object with fewer parameters (i.e., fewer npar, small model).
object2: Fitted model object with more parameters (i.e., more npar, large model).
n.Bootstrap: Number of bootstrap replicates (default = 100). Higher values increase the accuracy of the p-value but also the computation time; n.Bootstrap = 1000 is suggested.
vis: Logical. If TRUE, displays progress information during bootstrap (default: TRUE).

Details

Note that the result of LRT.test.Bootstrap should not be taken as the sole criterion for model selection; fit indices (e.g., get.fit.index) and classification accuracy measures (e.g., get.entropy, get.AvePP) should be considered alongside it. Above all, the preferred model is the one that aligns with theoretical expectations and offers clear interpretability.

The likelihood ratio test (LRT) statistic is calculated as: $$LRT = -2 \times (\text{LogLik}_{1} - \text{LogLik}_{2})$$ where:

$\text{LogLik}_{1}$: Log-likelihood of the smaller model (fewer parameters).
$\text{LogLik}_{2}$: Log-likelihood of the larger model (more parameters).
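
As a quick arithmetic check of the formula, the following minimal sketch computes the statistic from two log-likelihood values. The numbers and the variable names (loglik_small, loglik_large) are hypothetical and chosen only for illustration.

```r
# HYPOTHETICAL log-likelihoods, for illustration only.
loglik_small <- -1520.4   # LogLik_1: smaller model (fewer parameters)
loglik_large <- -1512.1   # LogLik_2: larger model (more parameters)

# LRT = -2 * (LogLik_1 - LogLik_2)
lrt <- -2 * (loglik_small - loglik_large)
lrt   # prints 16.6
```

Because the models are nested, the maximized log-likelihood of the larger model should be at least that of the smaller one, so the statistic is non-negative whenever both fits reach their global maxima.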

The LRT.test.Bootstrap function employs a parametric bootstrap procedure to empirically estimate the distribution of the LRT statistic under the null hypothesis (that the smaller model is sufficient). The specific steps are as follows:

Parameter Extraction: The estimated parameters (params) from the smaller model (object1) are treated as the true population values (ground truth).
Data Simulation: The function invokes sim.LCA or sim.LPA to generate n.Bootstrap independent datasets. Each dataset maintains the same sample size (N) and number of indicators (I) as the original empirical data.
Model Re-fitting: For each simulated dataset, both the small model and the large model are re-fitted. To ensure consistency, the estimation settings (e.g., convergence criteria, iterations) are identical to those used for the original models, with the exception that LRT.test.Bootstrap forces par.ini = "random" to reduce the risk of converging to a local maximum.
Distribution Generation: This process generates n.Bootstrap pairs of $\text{LogLik}_{1, boot}$ and $\text{LogLik}_{2, boot}$, which are used to compute a collection of bootstrap LRT statistics: $LRT_{boot} = -2 \times (\text{LogLik}_{1, boot} - \text{LogLik}_{2, boot})$.
P-value Calculation: The bootstrap $p$-value is calculated as the proportion of simulated $LRT_{boot}$ values that are greater than or equal to the observed $LRT$ statistic from the original data.
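
The steps above can be sketched in base R. To keep the example self-contained, it swaps in two nested linear models fitted with lm in place of latent class or latent profile models, and uses simulate from the stats package in place of sim.LCA or sim.LPA. The toy data and the names lrt_stat, n_boot, lrt_boot, and p_boot are assumptions made for illustration; this is not the internal code of LRT.test.Bootstrap.

```r
set.seed(1)

# Toy data (hypothetical): y depends on x1 but not on x2.
n   <- 100
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 1 + 0.5 * dat$x1 + rnorm(n)

# Small (null) model and large (alternative) model; the small one is nested.
fit_small <- lm(y ~ x1, data = dat)
fit_large <- lm(y ~ x1 + x2, data = dat)

# Helper (hypothetical name): LRT = -2 * (LogLik_1 - LogLik_2)
lrt_stat <- function(small, large) {
  -2 * (as.numeric(logLik(small)) - as.numeric(logLik(large)))
}
lrt_obs <- lrt_stat(fit_small, fit_large)

# Parameter Extraction and Data Simulation: treat the small model's
# estimates as the truth and simulate n_boot response vectors from it,
# each with the same sample size as the original data.
n_boot <- 200
y_sim  <- simulate(fit_small, nsim = n_boot)

# Model Re-fitting and Distribution Generation: refit both models to each
# simulated dataset and collect the bootstrap LRT statistics.
lrt_boot <- vapply(seq_len(n_boot), function(b) {
  dat_b   <- dat
  dat_b$y <- y_sim[[b]]
  lrt_stat(lm(y ~ x1, data = dat_b), lm(y ~ x1 + x2, data = dat_b))
}, numeric(1))

# P-value Calculation: proportion of bootstrap statistics that are
# greater than or equal to the observed statistic.
p_boot <- mean(lrt_boot >= lrt_obs)
p_boot
```

Linear models have a closed-form solution, so this sketch needs no random starting values; with mixture models such as LCA or LPA the re-fitting step is iterative, which is where the par.ini setting described above comes in.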

This method is particularly recommended for Latent Class and Latent Profile Analysis because the usual chi-square approximation to the null distribution of the LRT statistic often fails to hold when parameters lie on the boundary of the parameter space (e.g., probabilities near 0 or 1).