run_hmm: Hidden Markov Model (HMM) for Path Dependence (Counts I and C)

Description

Fits a two-series Hidden Markov Model (HMM) with Poisson emissions for the count time series I and C using depmixS4. The most probable state sequence is written to a CSV file and the fitted model object is saved to disk as an RDS file.

Usage

run_hmm(DT, nstates = 3, seed = NULL)

Value

If the optimization succeeds, a list with components:

fit: the fitted model object returned by depmixS4::fit().
states: integer vector of inferred latent states (one per time point).

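If fitting fails (for example, fit() raises an error during optimization), the function returns NULL, so callers should check the result with is.null() before using it.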

Arguments

DT: A data.frame or data.table containing at least the columns I and C, interpreted as non-negative count series observed over time.
nstates: Integer; number of latent Markov states to fit in the HMM (default is 3).
seed: Integer or NULL; optional seed for reproducibility. If NULL (default), no seed is set and results may vary between runs.
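
Because run_hmm() expects I and C to be non-negative counts, it can help to check the input before calling it. The helper below is a hypothetical sketch in base R; check_counts() is not part of the package and is only an illustration of the requirements stated above.

```r
# Hypothetical helper, not part of the package: verifies that DT has
# columns I and C holding non-negative whole numbers with no missing values.
check_counts <- function(DT, cols = c("I", "C")) {
  missing_cols <- setdiff(cols, names(DT))
  if (length(missing_cols) > 0) {
    stop("Missing required column(s): ", paste(missing_cols, collapse = ", "))
  }
  for (col in cols) {
    x <- DT[[col]]  # works for both data.frame and data.table
    ok <- is.numeric(x) && !anyNA(x) && all(x >= 0) && all(x == round(x))
    if (!ok) {
      stop("Column '", col, "' must contain non-negative counts.")
    }
  }
  invisible(TRUE)
}

# Small illustrative input that passes the check
example_df <- data.frame(I = c(0, 2, 5), C = c(1, 1, 3))
check_result <- check_counts(example_df)
```

Calling check_counts(DT) before run_hmm(DT) would stop with an informative error instead of letting a malformed input reach the model fit.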

Details

The model is specified via depmixS4::depmix() as a multivariate Poisson HMM with two observed series:

I ~ 1
C ~ 1

and nstates hidden regimes. The function:

Builds a data frame with columns I and C.
Constructs the HMM with Poisson emission distributions for both series.
Optionally sets a random seed if the seed argument is provided.
Fits the model with fit(mod, verbose = FALSE) wrapped in try() to avoid stopping on optimization failures.
If fitting succeeds, extracts the posterior state sequence via depmixS4::posterior().
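
The steps above can be sketched as follows. This is a minimal illustration, not the function's actual source. It assumes that depmixS4 is installed, that the two Poisson responses are passed to depmix() as a list of formulae with a matching list of families, and that posterior() accepts type = "viterbi" (depmixS4 1.5 or later). The object names df, mod, fm and states are illustrative.

```r
library(depmixS4)

set.seed(1)  # only to make the simulated data reproducible
df <- data.frame(
  I = rpois(50, lambda = 4),
  C = rpois(50, lambda = 3)
)

# Two observed series, each with an intercept-only Poisson emission model
mod <- depmix(
  response = list(I ~ 1, C ~ 1),
  data     = df,
  nstates  = 2,
  family   = list(poisson(), poisson())
)

# try() keeps an error raised during fitting from stopping the script
fm <- try(fit(mod, verbose = FALSE), silent = TRUE)

# Extract the most probable state sequence only if the fit succeeded
states <- NULL
if (!inherits(fm, "try-error")) {
  states <- posterior(fm, type = "viterbi")$state
}
```

Leaving states as NULL when the fit errors mirrors the documented behavior of returning NULL on failure.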

The function relies on two character scalars that must be defined in the global environment before it is called:

dir_csv: directory where the state sequence CSV will be written.
dir_out: directory where the fitted HMM object RDS will be saved.

A CSV file named "hmm_states.csv" is written to dir_csv with columns t (time index) and state (most probable state). The fitted HMM object is saved as "hmm_fit.rds" in dir_out.
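
The file outputs described above can be reproduced with base R as in the sketch below. Whether run_hmm() itself uses write.csv() or another writer is not stated here, so treat the calls as an assumption. The states vector and fit_obj are placeholders standing in for real results.

```r
# Placeholders standing in for the real results of run_hmm()
states  <- c(1L, 1L, 2L, 2L, 1L)
fit_obj <- list(note = "placeholder for the fitted HMM object")

# Output directories under tempdir(); created if they do not exist yet
dir_csv <- file.path(tempdir(), "csv")
dir_out <- file.path(tempdir(), "hmm")
dir.create(dir_csv, showWarnings = FALSE, recursive = TRUE)
dir.create(dir_out, showWarnings = FALSE, recursive = TRUE)

csv_path <- file.path(dir_csv, "hmm_states.csv")
rds_path <- file.path(dir_out, "hmm_fit.rds")

# One row per time point: t is the time index, state the inferred state
write.csv(
  data.frame(t = seq_along(states), state = states),
  csv_path,
  row.names = FALSE
)
saveRDS(fit_obj, rds_path)

# Read both files back to confirm their layout
states_back <- read.csv(csv_path)
fit_back    <- readRDS(rds_path)
```

Reading the files back in a later session gives access to the state sequence and the fitted object without refitting the model.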

Examples


# \donttest{
library(data.table)

# 1. Create dummy data (Only 'I' and 'C' counts are required by this function)
DT <- data.table(
  I = rpois(50, lambda = 4),
  C = rpois(50, lambda = 3)
)

# 2. Define the global paths under tempdir() so the example writes only to
#    the session's temporary directory (required by CRAN policy).
#    run_hmm() expects these variables to exist in the global environment.
tmp_dir <- tempdir()
dir_csv <- file.path(tmp_dir, "csv")
dir_out <- file.path(tmp_dir, "hmm")

dir.create(dir_csv, showWarnings = FALSE, recursive = TRUE)
dir.create(dir_out, showWarnings = FALSE, recursive = TRUE)

# 3. Run the function
# Using nstates=2 for a faster example check
res_hmm <- run_hmm(DT, nstates = 2)

# Inspect result if successful
if (!is.null(res_hmm)) {
  print(table(res_hmm$states))
}
# }
