
HMMHSMM (version 0.1.0)

generateHSMM: Generate Data from a Hidden Semi-Markov Model

Description

Simulates observations and hidden states from a Hidden Semi-Markov Model (HSMM) with specified observation and dwell time distributions.

Usage

generateHSMM(n, J, obsdist, dwelldist, obspar, dwellpar, Pi,
             delta = NULL, simtype = "nobs", shift = FALSE, seed = NULL)

Value

A list containing:

states

Numeric vector of the simulated hidden state sequence.

x

Numeric vector of the simulated observations.

N

Integer. The number of observations generated.

Arguments

n

Integer. The number of observations to generate (if simtype = "nobs") or the number of state sequences (if simtype = "nseq").

J

Integer. The number of hidden states in the model.

obsdist

Character string. The observation distribution. Supported distributions are: "norm", "pois", "weibull", "zip", "nbinom", "zinb", "exp", "gamma", "lnorm", "gev", "ZInormal", "ZIgamma".

dwelldist

Character string. The dwell time distribution. Supported distributions are: "pois", "nbinom", "betabinom".

obspar

List. Parameters for the observation distribution. Required parameters vary by distribution:

  • norm: mean, sd

  • pois: lambda

  • weibull: shape, scale

  • zip: pi, lambda

  • nbinom: size, mu

  • zinb: pi, size, mu

  • exp: rate

  • gamma: shape, rate

  • lnorm: meanlog, sdlog

  • gev: loc, scale, shape

  • ZInormal: mean, sd, pi

  • ZIgamma: shape, rate, pi

Each parameter should be a vector of length J with values for each state.
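
As an illustration of the per-state layout, the sketch below builds an obspar list for the "zip" distribution and draws values for a known state sequence using base R only. The helper rzip_by_state is hypothetical and is not part of HMMHSMM. It also assumes that pi is the probability of a structural zero in each state, which this page does not state explicitly.

```r
# Hypothetical helper (not part of HMMHSMM): draw one zero-inflated Poisson
# value for each entry of a state sequence.
# Assumption: pi[j] is the probability of a structural zero in state j.
rzip_by_state <- function(states, pi, lambda) {
  n <- length(states)
  structural_zero <- runif(n) < pi[states]
  counts <- rpois(n, lambda = lambda[states])
  ifelse(structural_zero, 0L, counts)
}

# One vector per parameter name, each of length J = 3
obspar_zip <- list(pi = c(0.1, 0.5, 0.9), lambda = c(2, 6, 12))

set.seed(1)
states <- rep(1:3, each = 200)
x <- rzip_by_state(states, obspar_zip$pi, obspar_zip$lambda)

# Share of zeros per state; it grows with pi
tapply(x == 0, states, mean)
```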

dwellpar

List. Parameters for the dwell time distribution. Required parameters vary by distribution:

  • pois: lambda, shift

  • nbinom: shift, size, mu

  • betabinom: size, alpha, beta, shift

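Each parameter should be a vector of length J with values for each state. The shift entry may be omitted when shift = FALSE, in which case a shift of 1 is used for every state.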
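
A mismatch between J and the length of a parameter vector is an easy mistake to make. The sketch below builds a dwellpar list for negative binomial dwell times and checks the lengths before simulation. The function check_par_lengths is a hypothetical helper written for this illustration; whether generateHSMM performs a similar check internally is not stated here.

```r
J <- 3

# Negative binomial dwell times with an explicit per-state shift
dwellpar_nb <- list(shift = c(1, 2, 1), size = c(2, 3, 2), mu = c(4, 6, 5))

# Hypothetical check (not part of HMMHSMM): every entry must have length J
check_par_lengths <- function(par, J) {
  bad <- names(par)[lengths(par) != J]
  if (length(bad) > 0) {
    stop("Parameters with length != J: ", paste(bad, collapse = ", "))
  }
  invisible(TRUE)
}

check_par_lengths(dwellpar_nb, J)
```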

Pi

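Matrix. The J x J transition probability matrix between states. Rows must sum to 1. In an HSMM, Pi usually has a zero diagonal (as in the example below), because the time spent in a state is governed by the dwell time distribution rather than by self-transitions.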
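
The row-sum requirement can be checked before calling the simulator. This is a minimal sketch; check_transition_matrix is a hypothetical helper and the tolerance tol is an arbitrary choice.

```r
# Hypothetical validator (not part of HMMHSMM)
check_transition_matrix <- function(Pi, J, tol = 1e-8) {
  stopifnot(is.matrix(Pi), nrow(Pi) == J, ncol(Pi) == J)
  stopifnot(all(Pi >= 0), all(abs(rowSums(Pi) - 1) < tol))
  invisible(TRUE)
}

Pi <- matrix(c(0.0, 0.6, 0.4,
               0.5, 0.0, 0.5,
               0.3, 0.7, 0.0), nrow = 3, byrow = TRUE)

check_transition_matrix(Pi, J = 3)
rowSums(Pi)
```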

delta

Numeric vector of length J. The initial state distribution. If NULL, the stationary distribution is computed from Pi.
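
For reference, one standard way to obtain the stationary distribution of an irreducible transition matrix is to solve delta (I - Pi + U) = 1, where U is a J x J matrix of ones. The sketch below uses that approach; stationary_dist is a hypothetical helper, and the package may compute delta differently.

```r
# Hypothetical helper (not part of HMMHSMM): stationary distribution of an
# irreducible transition matrix.
stationary_dist <- function(Pi) {
  J <- nrow(Pi)
  # Solve delta %*% (I - Pi + U) = (1, ..., 1), with U a J x J matrix of ones
  A <- diag(J) - Pi + matrix(1, J, J)
  solve(t(A), rep(1, J))
}

Pi <- matrix(c(0.0, 0.6, 0.4,
               0.5, 0.0, 0.5,
               0.3, 0.7, 0.0), nrow = 3, byrow = TRUE)

delta <- stationary_dist(Pi)
sum(delta)                   # sums to 1 up to rounding
as.vector(delta %*% Pi)      # equals delta up to rounding
```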

simtype

Character string. Either "nobs" (generate n observations) or "nseq" (generate n state sequences). Default is "nobs".

shift

Logical. If TRUE, uses the shift parameter from dwellpar. If FALSE and no shift is provided in dwellpar, sets shift to 1 for all states. Default is FALSE.
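
The effect of a shift can be illustrated with base R. The sketch assumes that a shifted dwell time is the shift plus a draw from the unshifted distribution, which is the usual definition of a shifted Poisson; this page does not spell out the exact convention used by the package.

```r
set.seed(1)

lambda <- c(3, 5, 4)   # per-state Poisson rates
shift  <- c(1, 1, 1)   # default shift of 1 for all states

# Assumption: dwell time = shift + Poisson draw, so no dwell is shorter
# than the shift.
state <- 2
dwell <- shift[state] + rpois(1000, lambda[state])

min(dwell) >= shift[state]
mean(dwell)            # close to shift[state] + lambda[state]
```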

seed

Integer or NULL. Random seed for reproducibility. Default is NULL.

Author

Aimee Cody

Details

This function simulates data from a Hidden Semi-Markov Model where:

  • Transitions between hidden states follow a Markov chain with transition matrix Pi

  • Each state has an associated dwell time distribution that determines how long the process remains in that state

  • Observations are generated from state-dependent distributions

The function supports multiple observation distributions, including normal, Poisson, Weibull, zero-inflated Poisson (ZIP), negative binomial, zero-inflated negative binomial (ZINB), exponential, gamma, log-normal, generalized extreme value (GEV), zero-inflated normal, and zero-inflated gamma.

Dwell time distributions include Poisson, negative binomial, and beta-binomial, all with optional shift parameters to ensure minimum dwell times.

When simtype = "nobs", the function generates exactly n observations. When simtype = "nseq", it generates n state sequences and the total number of observations depends on the realized dwell times.
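
The generative process described above can be sketched in base R for the special case of normal observations and shifted Poisson dwell times. This is not the package's implementation: simulate_hsmm_sketch is a hypothetical function, and truncating the final dwell to reach exactly n observations is an assumption about how simtype = "nobs" behaves.

```r
# Hypothetical sketch (not the HMMHSMM implementation)
simulate_hsmm_sketch <- function(n, Pi, delta, mean, sd, lambda, shift = 1) {
  J <- nrow(Pi)
  states <- integer(0)
  current <- sample.int(J, 1, prob = delta)        # initial state
  while (length(states) < n) {
    dwell <- shift + rpois(1, lambda[current])     # time spent in this state
    states <- c(states, rep(current, dwell))
    current <- sample.int(J, 1, prob = Pi[current, ])  # next state
  }
  states <- states[seq_len(n)]                     # cut the last dwell at n
  x <- rnorm(n, mean = mean[states], sd = sd[states])
  list(states = states, x = x, N = n)
}

Pi <- matrix(c(0.0, 0.6, 0.4,
               0.5, 0.0, 0.5,
               0.3, 0.7, 0.0), nrow = 3, byrow = TRUE)

set.seed(42)
sim <- simulate_hsmm_sketch(n = 100, Pi = Pi, delta = rep(1 / 3, 3),
                            mean = c(-2, 0, 3), sd = c(1, 1.5, 2),
                            lambda = c(3, 5, 4))

rle(sim$states)$lengths   # realized dwell times (the last one may be cut)
```

Because Pi has a zero diagonal, each run in rle(sim$states) corresponds to a single visit to a state, so the run lengths can be read directly as dwell times.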

See Also

generateHMM for Hidden Markov Models.

Examples

library(HMMHSMM)

# Example with 3 states, normal observations, Poisson dwell times
J <- 3
# HSMM transition matrix
Pi <- matrix(c(0.0, 0.6, 0.4,
               0.5, 0.0, 0.5,
               0.3, 0.7, 0.0), nrow = 3, byrow = TRUE)

# Observation parameters (normal distribution)
obspar <- list(
  mean = c(-2, 0, 3),
  sd = c(1, 1.5, 2)
)

# Dwell time parameters (Poisson distribution)
dwellpar <- list(
  lambda = c(3, 5, 4)
)

# Generate 100 observations
sim_data <- generateHSMM(n = 100, J = J, obsdist = "norm", dwelldist = "pois",
                         obspar = obspar, dwellpar = dwellpar, Pi = Pi)

# View the results
head(sim_data$x)        # observations
head(sim_data$states)   # hidden states
sim_data$N              # total number of observations
