Fits a partially hidden Markov model (pHMM) to multivariate time series
observations \(y\) with partially observed states \(x\), using the
constrained Baum-Welch algorithm. Unlike fit_pHMM, this function
does not require user-specified initial parameters. Instead, it implements
a customized initialization strategy designed for process monitoring with
highly imbalanced classes, as described in the supplementary material of
Capezza, Lepore, and Paynabar (2025).
fit_pHMM_auto(
  y = y,
  xlabeled = xlabeled,
  tol = 0.001,
  max_nstates = 5,
  ntry = 10
)

A list with the same structure as that returned by fit_pHMM:
y, xlabeled: the input data.
log_lik, log_lik_vec: final and trace of log-likelihood.
iter: number of EM iterations performed.
logB, log_alpha, log_beta, log_gamma, log_xi: posterior quantities from the Baum-Welch algorithm.
logAhat, mean_hat, covariance_hat, log_pi_hat: estimated model parameters.
AIC, BIC: information criteria for model selection.
y: A numeric matrix of dimension \(T \times d\), where each row corresponds to a \(d\)-dimensional observation at time \(t\).
xlabeled: An integer vector of length \(T\) with partially observed states. Known states must be integers in \(1, \ldots, N\); unknown states should be coded as NA.
tol: Convergence tolerance for the log-likelihood and parameter changes. Default is 1e-3.
max_nstates: Maximum number of hidden states to consider during the initialization procedure. Default is 5.
ntry: Number of candidate initializations for each new state. Default is 10.
The initialization procedure addresses the multimodality of the likelihood and the sensitivity of the Baum-Welch algorithm to starting values:
1. A one-state model (the in-control process) is first fitted using robust estimators of location and scatter.
2. To introduce an additional state, candidate mean vectors are selected from the observations least well represented by the current model. Moving averages of the data are computed over window lengths \(k = 1, \ldots, 9\), and the Mahalanobis distances of these smoothed points to the existing state means are calculated. The ntry smoothed observations with the largest minimum distances are retained as candidate initializations for the new state's mean.
3. For each candidate, a pHMM is initialized with:
   - existing means fixed to their previous estimates;
   - the new state's mean set to the candidate vector;
   - a shared covariance matrix fixed to the robust estimate from the in-control state;
   - an initial state distribution \(\pi\) concentrated on the in-control state;
   - a transition matrix with diagonal entries \(1 - 0.01 (N - 1)\) and off-diagonal entries \(0.01\).
4. Each initialized model is fitted with the Baum-Welch algorithm, and the one achieving the highest log-likelihood is retained.
5. Steps 2-4 are repeated until up to max_nstates states have been considered.
This strategy leverages prior process knowledge (a dominant in-control regime) and focuses the search on under-represented regions of the data space, which improves convergence and reduces sensitivity to random initialization.
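To make the candidate-selection and transition-matrix steps concrete, the following base-R sketch illustrates them under stated assumptions. The helper names select_candidates and init_transition are hypothetical, not part of the package, and the actual internals of fit_pHMM_auto may differ.

```r
# Hypothetical sketch of the initialization steps described above.
# Candidate means for a new state: moving averages of the data over
# window lengths k = 1, ..., kmax, ranked by their minimum Mahalanobis
# distance to the existing state means.
select_candidates <- function(y, means, Sigma, ntry = 10, kmax = 9) {
  Sinv <- solve(Sigma)
  cand <- NULL
  dist <- NULL
  for (k in seq_len(kmax)) {
    Tk <- nrow(y) - k + 1
    # moving averages of window length k (Tk x d matrix)
    yk <- t(vapply(seq_len(Tk),
                   function(t) colMeans(y[t:(t + k - 1), , drop = FALSE]),
                   numeric(ncol(y))))
    # minimum squared Mahalanobis distance to the existing state means
    dmin <- apply(yk, 1, function(p) {
      min(vapply(means, function(m) {
        v <- p - m
        drop(t(v) %*% Sinv %*% v)
      }, numeric(1)))
    })
    cand <- rbind(cand, yk)
    dist <- c(dist, dmin)
  }
  # keep the ntry smoothed points least well represented by the model
  cand[order(dist, decreasing = TRUE)[seq_len(ntry)], , drop = FALSE]
}

# Transition matrix with diagonal entries 1 - 0.01 (N - 1) and
# off-diagonal entries 0.01
init_transition <- function(N, eps = 0.01) {
  A <- matrix(eps, N, N)
  diag(A) <- 1 - eps * (N - 1)
  A
}
```

Each candidate returned by select_candidates would seed the new state's mean before running Baum-Welch, with the best-scoring fit retained.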
Capezza, C., Lepore, A., & Paynabar, K. (2025). Stream-Based Active Learning for Process Monitoring. Technometrics. <doi:10.1080/00401706.2025.2561744>.
Supplementary Material, Section B: Initialization of the Partially Hidden Markov Model. Available at <https://doi.org/10.1080/00401706.2025.2561744>.
library(ActiveLearning4SPM)
set.seed(123)
dat <- simulate_stream(T0 = 100, TT = 500)
y <- dat$y
xlabeled <- dat$x
d <- ncol(dat$y)
# Hide 300 of the 600 state labels at random
xlabeled[sample(1:600, 300)] <- NA
obj <- fit_pHMM_auto(y = y,
                     xlabeled = xlabeled,
                     tol = 1e-3,
                     max_nstates = 5,
                     ntry = 10)
obj$AIC