ActiveLearning4SPM (version 0.1.0)

fit_pHMM: Fit a Partially Hidden Markov Model (pHMM)

Description

Fits a partially hidden Markov model (pHMM) to multivariate time series observations \(y\) with partially observed process states \(x\), using a constrained Baum-Welch algorithm. The function allows the user to provide custom initial parameters, and supports constraints on known means and/or covariances, as well as equal or diagonal covariance structures.

Usage

fit_pHMM(
  y,
  xlabeled,
  nstates,
  ppi_start = NULL,
  A_start = NULL,
  mean_start,
  covariance_start = NULL,
  known_mean = NULL,
  known_covariance = NULL,
  equal_covariance = FALSE,
  covariance_structure = "full",
  max_iter = 200,
  tol = 0.001,
  verbose = FALSE
)

Value

A list with components:

  • y, xlabeled: the input data.

  • log_lik, log_lik_vec: the final log-likelihood and the trace of log-likelihood values across iterations.

  • iter: number of EM iterations performed.

  • logB, log_alpha, log_beta, log_gamma, log_xi: posterior quantities from the Baum-Welch algorithm.

  • logAhat, mean_hat, covariance_hat, log_pi_hat: estimated model parameters.

  • AIC, BIC: information criteria for model selection.
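For orientation, the textbook definitions of these criteria can be sketched in base R. This is a sketch under assumptions: the helper info_criteria and its arguments n_par and n_obs are hypothetical, and this page does not state how fit_pHMM counts free parameters or which sample size it uses, so the values need not match out$AIC and out$BIC exactly.

```r
# Textbook AIC and BIC from a log-likelihood (hypothetical helper, not part of the package)
info_criteria <- function(log_lik, n_par, n_obs) {
  c(AIC = -2 * log_lik + 2 * n_par,            # penalty of 2 per free parameter
    BIC = -2 * log_lik + log(n_obs) * n_par)   # penalty grows with the sample size
}

# Illustrative numbers only: log-likelihood -100, 5 free parameters, 100 observations
ic <- info_criteria(log_lik = -100, n_par = 5, n_obs = 100)
print(ic)
```

Under both criteria, lower values indicate a better trade-off between fit and complexity, which is how they are typically used to compare fits with different nstates.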

Arguments

y

A numeric matrix of dimension \(T \times d\), where each row corresponds to a \(d\)-dimensional observation at time \(t\).

xlabeled

An integer vector of length \(T\) with partially observed states. Known states must be integers in \(1, \ldots, N\), where \(N\) is nstates; unknown states must be coded as NA.
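A minimal sketch of this encoding in base R, assuming a hypothetical fully labeled state vector x_full (all names and values below are illustrative):

```r
set.seed(1)
n_obs <- 10      # hypothetical series length T
nstates <- 3     # hypothetical number of states N

# Hypothetical true states, stored as integers in 1..nstates
x_full <- sample.int(nstates, n_obs, replace = TRUE)

# Mask the time points whose state is unknown with NA
xlabeled <- x_full
xlabeled[c(2, 5, 6, 9)] <- NA

# The vector stays integer-typed; labeled entries are unchanged
print(xlabeled)
```

Assigning NA into an integer vector keeps its integer type, so no further conversion is needed before passing it on.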

nstates

Integer. The total number of hidden states to fit.

ppi_start

Numeric vector of length nstates giving the initial state distribution. If NULL, defaults to c(1,0,...,0).

A_start

Numeric nstates \(\times\) nstates transition probability matrix. If NULL, defaults to a transition matrix with diagonal entries equal to 1-0.01*(nstates-1) and all off-diagonal entries equal to 0.01.
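The documented defaults for ppi_start and A_start can be reproduced in base R as follows. This is a sketch of the defaults as described here, not the package's internal code; the names ppi_default and A_default are illustrative.

```r
nstates <- 3

# Default initial distribution: all mass on the first state
ppi_default <- c(1, rep(0, nstates - 1))

# Default transition matrix: 0.01 off the diagonal,
# 1 - 0.01 * (nstates - 1) on the diagonal, so each row sums to 1
A_default <- matrix(0.01, nrow = nstates, ncol = nstates)
diag(A_default) <- 1 - 0.01 * (nstates - 1)

print(ppi_default)
print(A_default)
```

Matrices of the same shape can be passed as A_start when a less persistent chain is a better starting point, provided each row still sums to 1.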

mean_start

List of length nstates containing numeric mean vectors for the emission distributions.

covariance_start

List of covariance matrices for the emission distributions. Must be of length nstates, unless equal_covariance = TRUE, in which case it must be of length 1. If NULL, defaults to identity matrices.
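The two accepted list shapes can be sketched in base R, assuming hypothetical values for the dimension d and for nstates (cov_per_state and cov_shared are illustrative names):

```r
d <- 2         # hypothetical observation dimension
nstates <- 3   # hypothetical number of states

# One covariance matrix per state, for equal_covariance = FALSE
cov_per_state <- replicate(nstates, diag(d), simplify = FALSE)

# A single shared covariance matrix, for equal_covariance = TRUE
cov_shared <- list(diag(d))

length(cov_per_state)
length(cov_shared)
```

Identity matrices are used here because they match the documented default; any symmetric positive definite matrices of dimension d by d would follow the same layout.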

known_mean

Optional list of known mean vectors. Use NA for unknown elements.

known_covariance

Optional list of known covariance matrices. Use NA for unknown elements.

equal_covariance

Logical. If TRUE, all states are constrained to share a common covariance matrix.

covariance_structure

Character string specifying the covariance structure. Either "full" (default) or "diagonal".

max_iter

Maximum number of EM iterations. Default is 200.

tol

Convergence tolerance for log-likelihood and parameter change. Default is 1e-3.

verbose

Logical. If TRUE, prints log-likelihood progress at each iteration.

References

Capezza, C., Lepore, A., & Paynabar, K. (2025). Stream-Based Active Learning for Process Monitoring. Technometrics. <doi:10.1080/00401706.2025.2561744>.

Examples

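library(ActiveLearning4SPM)
set.seed(123)

# Simulate a data stream with observations y and process states x
dat <- simulate_stream(T0 = 100, TT = 500)
y <- dat$y
xlabeled <- dat$x
d <- ncol(y)

# Mark 300 randomly chosen time points as unlabeled
xlabeled[sample(length(xlabeled), 300)] <- NA

# Fit a 3-state pHMM with a covariance matrix shared across states
out <- fit_pHMM(y = y,
                xlabeled = xlabeled,
                nstates = 3,
                mean_start = list(rep(0, d), rep(1, d), rep(-1, d)),
                equal_covariance = TRUE)
out$AIC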