get.CEP: Compute Classification Error Probability (CEP) Matrices

Description

Computes the Classification Error Probability (CEP) matrices (Liang et al., 2023) used in the bias-corrected three-step estimation of Latent Class/Profile Analysis with Covariates.

Usage

get.CEP(P.Z.Xns, time.cross = TRUE)

Value

A named list of length $T$. Each element is an $L \times L$ matrix:

Row $l$: true latent class;
Column $l'$: assigned (modal) class;
Entry $(l, l')$: estimated $P(\text{assigned class} = l' \mid \text{true class} = l)$.

When time.cross = TRUE, all matrices in the list are identical. Names are "t1", "t2", ..., "tT".

Arguments

P.Z.Xns

A list of length $T$ (number of time points). Each element is an $N \times L$ matrix of posterior probabilities $P(Z_{it} = l \mid X_i)$ from the first-step model.

Rows correspond to individuals ($i = 1, \dots, N$);
Columns correspond to latent classes ($l = 1, \dots, L$);
Each row must sum to 1.

The list must be ordered chronologically (e.g., time 1 to $T$).
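The input requirements above can be checked before calling the function. The following is a minimal sketch; `check_posteriors` and its `tol` argument are hypothetical names introduced here for illustration and are not part of the package.

```r
# Hypothetical helper (not part of the package): returns TRUE when the input
# looks like a valid P.Z.Xns list, FALSE otherwise.
check_posteriors <- function(P.Z.Xns, tol = 1e-8) {
  # Must be a non-empty list of matrices
  stopifnot(is.list(P.Z.Xns), length(P.Z.Xns) >= 1)
  stopifnot(all(vapply(P.Z.Xns, is.matrix, logical(1))))

  # Every time point should share the same N x L shape
  dims <- vapply(P.Z.Xns, dim, integer(2))  # 2 x T matrix of dimensions
  same_shape <- all(dims[1, ] == dims[1, 1]) && all(dims[2, ] == dims[2, 1])

  # Each row of posterior probabilities should sum to 1 (up to tolerance)
  rows_ok <- all(vapply(
    P.Z.Xns,
    function(P) all(abs(rowSums(P) - 1) < tol),
    logical(1)
  ))

  same_shape && rows_ok
}

# Two time points, two individuals, two classes
good <- list(
  matrix(c(0.5, 0.5, 0.2, 0.8), 2, 2, byrow = TRUE),
  matrix(c(0.9, 0.1, 0.3, 0.7), 2, 2, byrow = TRUE)
)
# Second row of the first matrix sums to 1.1
bad <- list(
  matrix(c(0.5, 0.5, 0.3, 0.8), 2, 2, byrow = TRUE),
  matrix(c(0.9, 0.1, 0.3, 0.7), 2, 2, byrow = TRUE)
)
check_posteriors(good)
check_posteriors(bad)
```

The chronological ordering of the list cannot be verified from the matrices themselves, so that part remains the caller's responsibility.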

time.cross

Logical. If TRUE (default), returns a list where every element is the same pooled CEP matrix (averaged across all time points). If FALSE, returns time-specific CEP matrices.

Details

The CEP matrix at time $t$ gives, for each true latent class $l$, the probability that an individual in that class is assigned (via modal assignment) to class $l'$ at time $t$.

Formally, for time point $t$: $$ \mathrm{CEP}_t(l, l') = P(\hat{Z}_t = l' \mid Z_t = l) = \frac{ \sum_{i:\,\hat{z}_{it} = l'} P(Z_{it} = l \mid X_i) }{ N \, \hat{\pi}_{tl} } $$

where:

$Z_{it}$ is the true latent class of individual $i$ at time $t$;
$P(Z_{it} = l \mid X_i)$ is the posterior probability from the first-step model;
$\hat{z}_{it} = \arg\max_l P(Z_{it} = l \mid X_i)$ is the modal (most likely) assigned class;
$\hat{\pi}_{tl} = \frac{1}{N} \sum_{i=1}^N I(\hat{z}_{it} = l)$ is the observed proportion assigned to class $l$ at time $t$;
$N$ is the total sample size.
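To make the formula concrete, here is a minimal sketch that evaluates it for a single time point exactly as written above. The function name `cep_one_time` is hypothetical, and the package's internal implementation may differ (for example in how ties or empty classes are handled).

```r
# Hypothetical sketch of the formula for one time point (not the package code).
# P is an N x L matrix of posterior probabilities P(Z_it = l | X_i).
cep_one_time <- function(P) {
  N <- nrow(P)
  L <- ncol(P)

  # Modal assignment: column index of the largest posterior in each row
  z_hat <- max.col(P, ties.method = "first")

  # Proportion of individuals assigned to each class
  pi_hat <- tabulate(z_hat, nbins = L) / N

  # N x L indicator matrix: W[i, l'] = 1 if individual i is assigned to l'
  W <- diag(L)[z_hat, , drop = FALSE]

  # Numerator: num[l, l'] = sum of P[i, l] over individuals assigned to l'
  num <- t(P) %*% W

  # Divide row l by N * pi_hat[l]; a class with no assignments yields NaN
  num / (N * pi_hat)
}

# Four individuals, two classes
P <- rbind(
  c(0.8, 0.2),
  c(0.6, 0.4),
  c(0.3, 0.7),
  c(0.1, 0.9)
)
cep <- cep_one_time(P)
cep
```

With this denominator (the proportion modally assigned to class $l$), the rows of the result are not guaranteed to sum to exactly 1; they do so only when the modal proportion equals the mean posterior probability of that class.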

If time.cross = TRUE (default), a single pooled CEP matrix is computed by aggregating counts across all time points. This assumes the classification error structure is invariant over time (i.e., measurement invariance), as in Liang et al. (2023). The same pooled matrix is then returned for every time point.
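One possible reading of "aggregating counts across all time points" is to sum the numerators and the modal assignment counts over time before dividing. The sketch below follows that reading; it is an assumption for illustration, `pooled_cep` is a hypothetical name, and the package may pool differently.

```r
# Hypothetical sketch of a pooled CEP matrix (assumed pooling rule, not the
# package code): sum numerators and modal counts over time, then divide.
pooled_cep <- function(P.Z.Xns) {
  L <- ncol(P.Z.Xns[[1]])
  num <- matrix(0, L, L)  # accumulated numerators, rows = true class
  cnt <- numeric(L)       # accumulated modal assignment counts per class

  for (P in P.Z.Xns) {
    z_hat <- max.col(P, ties.method = "first")  # modal assignment
    W <- diag(L)[z_hat, , drop = FALSE]         # N x L assignment indicators
    num <- num + t(P) %*% W
    cnt <- cnt + colSums(W)
  }

  cep <- num / cnt  # row l divided by the pooled count for class l

  # Return the same matrix for every time point, named "t1", "t2", ...
  out <- rep(list(cep), length(P.Z.Xns))
  names(out) <- paste0("t", seq_along(P.Z.Xns))
  out
}

# Two time points with identical posteriors, four individuals, two classes
P <- rbind(c(0.8, 0.2), c(0.6, 0.4), c(0.3, 0.7), c(0.1, 0.9))
pooled <- pooled_cep(list(P, P))
names(pooled)
```

Under this rule, pooling weights each time point by its assignment counts rather than taking an unweighted average of the time-specific matrices; the two coincide when the assignment counts are the same at every time point.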

References

Liang, Q., de la Torre, J., & Law, N. (2023). Latent Transition Cognitive Diagnosis Model With Covariates: A Three-Step Approach. Journal of Educational and Behavioral Statistics, 48(6), 690-718. https://doi.org/10.3102/10769986231163320

Examples


# Simulate posterior probabilities for 2 time points, 3 classes, 100 individuals
set.seed(123)
N <- 100; L <- 3; times <- 2
P.Z.Xns <- replicate(times,
  t(apply(matrix(runif(N * L), N, L), 1, function(x) x / sum(x))),
  simplify = FALSE)

# Compute time-specific CEP matrices
cep_time_specific <- get.CEP(P.Z.Xns, time.cross = FALSE)

# Compute time-invariant (pooled) CEP matrix
cep_pooled <- get.CEP(P.Z.Xns, time.cross = TRUE)
