HMMHSMM (version 0.1.0)

findmleHMMnostarting: Multiple Initialization Maximum Likelihood Estimation for Hidden Markov Models

Description

Fits a Hidden Markov Model (HMM) by repeatedly initializing observation and transition parameters and selecting the fit with the highest log-likelihood. This approach helps avoid convergence to poor local optima. For the generalized extreme value (GEV) distribution, starting values are generated from repeated maximum likelihood fits on random data subsets.

Usage

findmleHMMnostarting(J, x, obsdist, no.initials = 50, EM = FALSE,
                     verbose = TRUE, seed = NULL, ...)

Value

A list corresponding to the best fit across all initializations, containing:

estimate

List of estimated HMM parameters, including state-dependent observation parameters and transition probabilities.

loglik

The maximized log-likelihood value.

AIC

The Akaike Information Criterion for the fitted model.

BIC

The Bayesian Information Criterion for the fitted model.

hessian

Optional. The Hessian matrix at the maximum likelihood estimates (returned if EM = FALSE).
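
As a rough illustration of how the AIC and BIC entries relate to loglik, the sketch below applies the standard definitions. The helper ic_from_loglik is hypothetical and not part of HMMHSMM, the log-likelihood value is made up for illustration, and the parameter count k (free transition probabilities plus observation parameters) is an assumption; the package may count parameters differently.

```r
# Hypothetical helper (not part of HMMHSMM): textbook AIC/BIC definitions.
ic_from_loglik <- function(loglik, k, n) {
  c(AIC = -2 * loglik + 2 * k,
    BIC = -2 * loglik + k * log(n))
}

# Assumed parameter count for a J-state normal HMM:
# J * (J - 1) free transition probabilities plus a mean and sd per state.
J <- 3
k <- J * (J - 1) + 2 * J

# Illustrative values only: loglik = -350 is invented, n = 200 observations.
ic <- ic_from_loglik(loglik = -350, k = k, n = 200)
ic
```

Lower values of either criterion indicate a better trade-off between fit and complexity, which is useful when comparing fits with different J.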

Arguments

J

Integer. The number of hidden states in the HMM. Must be strictly greater than 1.

x

Numeric vector. The observed data sequence.

obsdist

Character string. The observation distribution. Supported distributions are: "norm", "pois", "weibull", "zip", "nbinom", "zinb", "exp", "gamma", "lnorm", "gev", "ZInormal", "ZIgamma".

no.initials

Integer. The number of random initializations to attempt. Defaults to 50.

EM

Logical. If TRUE, uses an EM-based semi-Markov approximation for estimation. If FALSE, maximizes the likelihood directly using nlm. Defaults to FALSE.

verbose

Logical. If TRUE, progress messages are printed to the console. Default is TRUE.

seed

Integer or NULL. Random seed for reproducibility. Default is NULL.

...

Further arguments passed to findmleHMM when EM = TRUE.

Author

Aimee Cody

Details

This function automates multiple trials of findmleHMM with randomized starting values, returning the fit that achieves the highest log-likelihood.

  • For most observation distributions, starting values are generated via clusterHMM.

  • For the GEV distribution, starting values are drawn from repeated fits of evd::fgev on random data segments. Up to 20,000 attempts are made, and a warning is issued if fewer than 1,000 valid estimates are obtained.
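
The GEV strategy described above might look roughly like the following sketch. It is not the package's implementation: gev_starts is a hypothetical helper, the use of contiguous segments, the segment length, and the much smaller attempt limits are all assumptions chosen to keep the example fast. It requires the evd package.

```r
# Hypothetical helper (not part of HMMHSMM): collect GEV estimates from
# repeated evd::fgev fits on randomly placed contiguous segments of x.
gev_starts <- function(x, n_wanted = 20, seg_len = 100, max_tries = 200) {
  out <- list()
  tries <- 0
  while (length(out) < n_wanted && tries < max_tries) {
    tries <- tries + 1
    first <- sample.int(length(x) - seg_len + 1, 1)
    seg <- x[first:(first + seg_len - 1)]
    # A failed fit is treated as an invalid attempt and skipped.
    est <- tryCatch(evd::fgev(seg, std.err = FALSE)$estimate,
                    error = function(e) NULL)
    valid <- !is.null(est) && all(is.finite(est)) &&
      isTRUE(unname(est["scale"]) > 0)
    if (valid) out[[length(out) + 1]] <- est
  }
  if (length(out) < n_wanted) {
    warning("Only ", length(out), " valid GEV estimates were obtained.")
  }
  do.call(rbind, out)  # one row per valid fit: loc, scale, shape
}

if (requireNamespace("evd", quietly = TRUE)) {
  set.seed(42)
  x_gev <- evd::rgev(500, loc = 0, scale = 1, shape = 0.1)
  starts <- gev_starts(x_gev)
  head(starts)
}
```

Each row of the result is one candidate set of starting values; a pool like this can then be sampled from when initializing the state-dependent parameters.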

During each iteration:

  1. Observation parameters are perturbed slightly to encourage exploration.

  2. A transition matrix Pi is generated from uniform random draws, with extra weight added to the self-transition (diagonal) entries.

  3. The HMM is estimated via findmleHMM.

  4. If the resulting log-likelihood exceeds the current best, the model is updated.

At the end of all iterations, the best-fitting model is returned. When verbose = TRUE, iteration numbers and error messages are displayed during the fitting process.
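
The per-iteration steps above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's code: jitter_par, random_tpm, and best_of are hypothetical helpers, the size of the perturbation and of the self-transition bias are arbitrary, and toy_fit is a stand-in that returns an invented log-likelihood in place of a real call to findmleHMM.

```r
# Hypothetical helper: small additive perturbation of location parameters.
jitter_par <- function(par, sd = 0.05) par + rnorm(length(par), 0, sd)

# Hypothetical helper: uniform draws plus extra diagonal weight,
# with each row normalised to sum to 1.
random_tpm <- function(J, self_bias = 2) {
  P <- matrix(runif(J * J), nrow = J) + diag(self_bias, J)
  P / rowSums(P)
}

# Hypothetical helper: run fit_once() no.initials times and keep the fit
# with the highest log-likelihood, skipping runs that raise an error.
best_of <- function(no.initials, fit_once) {
  best <- NULL
  for (i in seq_len(no.initials)) {
    fit <- tryCatch(fit_once(i), error = function(e) NULL)
    if (is.null(fit) || !is.finite(fit$loglik)) next
    if (is.null(best) || fit$loglik > best$loglik) best <- fit
  }
  best
}

set.seed(1)
mu0 <- jitter_par(c(-2, 0, 3))
Pi0 <- random_tpm(3)
rowSums(Pi0)  # every row sums to 1

# Stand-in for a real fit: invented log-likelihoods, with a failure on run 2.
toy_fit <- function(i) {
  if (i == 2) stop("simulated convergence failure")
  list(loglik = -100 - (i - 4)^2, start = i)
}
best <- best_of(5, toy_fit)
best$start
```

Wrapping each fit in tryCatch means a single failed initialization does not abort the whole search, which matches the behaviour described for error messages when verbose = TRUE.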

See Also

findmleHMM for fitting an HMM with user-supplied starting values. generateHMM for simulating HMM data. findmleHSMMnostarting for fitting hidden semi-Markov models without user-supplied starting values.

Examples

set.seed(123)
J <- 3
Pi <- matrix(c(0.7, 0.2, 0.1,
               0.1, 0.8, 0.1,
               0.2, 0.3, 0.5), nrow = 3, byrow = TRUE)
obspar <- list(mean = c(-2, 0, 3),
               sd   = c(0.5, 1, 1.5))
x <- generateHMM(n = 200, J = J, Pi = Pi, obsdist = "norm", obspar = obspar)$x

# \donttest{
fit <- findmleHMMnostarting(J = J, x = x, obsdist = "norm",
                            no.initials = 30)

fit$loglik
fit$estimate

fit_silent <- findmleHMMnostarting(J = J, x = x, obsdist = "norm",
                                   no.initials = 30, verbose = FALSE)
# }
