hmm.discnp (version 0.2-0)

hmm: Fit a hidden Markov model to discrete data.

Description

Uses the EM algorithm to perform a maximum likelihood fit of a hidden Markov model to discrete data where the observations come from one of a number of finite discrete distributions, depending on the (hidden) state of the Markov chain. These distributions are specified (non-parametrically) by a matrix $R = [\rho_{ij}]$ where $\rho_{ij} = P(Y = y_i | S = j)$, $Y$ being the observable random variable and $S$ being the hidden state.
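For concreteness, a toy version of such a matrix (hypothetical values, not from any fitted model) with three possible observed values and two hidden states might be set up in R as follows; each column is a probability distribution over the values, so the columns sum to 1.

```r
# Hypothetical emission matrix R = [rho_ij], rho_ij = P(Y = y_i | S = j):
# rows index the possible values y_i, columns index the hidden states j.
Rho <- matrix(c(0.7, 0.2, 0.1,    # column 1: distribution of Y in state 1
                0.1, 0.3, 0.6),   # column 2: distribution of Y in state 2
              nrow = 3, ncol = 2)
rownames(Rho) <- c("y1", "y2", "y3")
colnames(Rho) <- c("state1", "state2")
colSums(Rho)  # each column sums to 1
```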

Usage

hmm(y, yval=NULL, par0=NULL, K=NULL, rand.start=NULL, stationary=cis,
    mixture=FALSE, cis=TRUE, tolerance=1e-4, verbose=FALSE,
    itmax=200, crit='PCLL', keep.y=TRUE, data.name=NULL)

Arguments

y
A vector of discrete data or a list of such vectors; missing values are allowed.
yval
A vector of possible values for the data; it defaults to the sorted unique values of y. If any value of y does not match some value of yval, it will be treated as a MISSING VALUE.
par0
An optional (named) list of starting values for the parameters of the model, with components tpm (the transition probability matrix) and Rho. The matrix Rho specifies the probabilities of the possible observed values, given the state of the hidden Markov chain; its (i,j)-th entry is P(Y = y_i | S = j) (see Description).
K
The number of states in the hidden Markov chain; if par0 is not specified then K MUST be; if par0 is specified, K is ignored.

Note that K=1 is acceptable; if K is 1 then the components tpm, ispd, converged, and nstep of the returned value are all set equal to NA (see Notes).

rand.start
A list consisting of two logical scalars which must be named tpm and Rho. If tpm is TRUE then the function init.all() chooses the entries of the starting value of tpm at random; likewise for Rho.
stationary
Logical scalar. If TRUE then the model is fitted under the stationarity assumption, i.e. that the Markov chain was in steady state at the time that observations commenced. In this case the initial state probability distribution is taken to be the (unique) stationary distribution determined by the transition probability matrix tpm.
mixture
A logical scalar; if TRUE then a mixture model (all rows of the transition probability matrix are identical) is fitted rather than a general hidden Markov model.
cis
A logical scalar specifying whether there should be a constant initial state probability distribution. If stationary is FALSE and cis is FALSE then the initial state probability distribution is estimated separately for each observation sequence, and ispd is returned as a matrix with one column for each sequence.
tolerance
If the value of the quantity used for the stopping criterion is less than tolerance then the EM algorithm is considered to have converged.
verbose
A logical scalar determining whether to print out details of the progress of the EM algorithm.
itmax
If the convergence criterion has not been met by the time itmax EM steps have been performed, a warning message is printed out, and the function stops. A value is returned by the function anyway, with the logical component converged set equal to FALSE.
crit
The name of the stopping criterion, which must be one of "PCLL" (percent change in log-likelihood; the default), "L2" (L-2 norm, i.e. square root of sum of squares of change in coefficients), or "Linf" (L-infinity norm, i.e. maximum absolute value of the change in coefficients).
keep.y
Logical scalar; should the observations y be returned as a component of the value of this function?
data.name
An identifying tag for the fit; if omitted, it defaults to the name of data set y as determined by deparse(substitute(y)).
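As an illustration of the par0 argument, one might supply starting values for a two-state model over three possible observed values as below. The starting values are hypothetical, and the data object y.sim is assumed to exist (e.g. generated via sim.hmm()), so the call itself is shown commented out.

```r
# Hypothetical starting values for a 2-state chain over 3 observed values.
tpm0 <- matrix(c(0.9, 0.1,
                 0.1, 0.9), nrow = 2, byrow = TRUE)   # rows sum to 1
Rho0 <- matrix(c(0.5, 0.25, 0.25,   # column 1: emission probs in state 1
                 0.2, 0.30, 0.50),  # column 2: emission probs in state 2
               nrow = 3, ncol = 2)
# fit <- hmm(y.sim, par0 = list(tpm = tpm0, Rho = Rho0))  # K is ignored here
```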

Value

  • A list with components:
  • Rho: The fitted value of the probability matrix Rho specifying the distributions of the observations (the emission probabilities).
  • tpm: The fitted value of the transition probability matrix tpm.
  • ispd: The fitted initial state probability distribution, or a matrix of (trivial or deterministic) initial state probability distributions, one (column) of ispd for each observation sequence.

    If stationary is TRUE then ispd is assumed to be the (unique) stationary distribution for the chain, and thereby determined by the transition probability matrix tpm. If stationary is FALSE and cis is TRUE then ispd is estimated as the mean of the vectors of conditional probabilities of the states, given the observation sequences, at time t=1.

    If cis is FALSE then ispd is a matrix whose columns are trivial probability vectors, as described above.

  • log.like: The final value of the log likelihood, as calculated through recursion.
  • converged: A logical scalar saying whether the algorithm satisfied the convergence criterion before the maximum of itmax EM steps was exceeded.
  • nstep: The number of EM steps performed by the algorithm.
  • y: The observations (argument y). Present only if keep.y is TRUE.
  • data.name: An identifying tag, specified as an argument, or determined from the name of the argument y by deparse(substitute(y)).
  • stationary: The argument stationary.
  • cis: The argument cis.
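
A sketch of how the returned components might be inspected; here fit is mocked up as a plain list carrying the documented component names, whereas in practice it would come from a call such as fit <- hmm(y, K = 2).

```r
# Mock fitted object with the documented component names (illustrative only).
fit <- list(Rho = matrix(0.5, 2, 2), tpm = matrix(0.5, 2, 2),
            log.like = -123.4, converged = TRUE, nstep = 17)
fit$tpm       # fitted transition probability matrix
fit$log.like  # final log likelihood
if (!fit$converged) warning("increase itmax or loosen the tolerance")
```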

Notes

The package used to require the argument y to be a matrix in the case of multiple observed series. If the series were of unequal length the user was expected to pad them out with NAs to equalize the lengths. In the revision from version 0.0-9 to 0.1-0 this was changed, requiring the argument y to be (more sensibly) a list when there are multiple series. Unfortunately this help page was not correspondingly revised and the old ``matrix'' format was left as being the specified input format.

The old matrix format is still permitted (and the matrix is internally changed into a list) but this is deprecated. In some future version of hmm.discnp this possibility will be removed.

If K=1 then tpm, ispd, converged, and nstep are all set equal to NA in the list returned by this function.

The estimate of ispd in the non-stationary setting is inevitably very poor, unless the number of sequences of observations (the length of the list y) is very large. We have in effect ``less than one'' relevant observation for each such sequence.

Warnings

The ordering of the (hidden) states can be arbitrary. What the estimation procedure decides to call ``state 1'' may not be what you think of as being state number 1. The ordering of the states will be affected by the starting values used.

At some time in the future the (deprecated) option of being able to specify argument y as a matrix (in the setting in which there are multiple data series to which a single model is being fitted) will no longer be permitted.

Details

The hard work is done by a Fortran subroutine "recurse" (actually coded in Ratfor) which is dynamically loaded.

References

Rabiner, L. R., "A tutorial on hidden Markov models and selected applications in speech recognition," Proc. IEEE vol. 77, pp. 257 -- 286, 1989.

Zucchini, W. and Guttorp, P., "A hidden Markov model for space-time precipitation," Water Resources Research vol. 27, pp. 1917-1923, 1991.

MacDonald, I. L., and Zucchini, W., "Hidden Markov and Other Models for Discrete-valued Time Series", Chapman & Hall, London, 1997.

Liu, Limin, "Hidden Markov Models for Precipitation in a Region of Atlantic Canada", Master's Report, University of New Brunswick, 1997.

See Also

sim.hmm(), mps(), viterbi()

Examples

# See the help for sim.hmm() for how to generate y.num.
fit.num     <- hmm(y.num, K=2, verb=TRUE)
fit.num.mix <- hmm(y.num, K=2, verb=TRUE, mixture=TRUE)
print(fit.num[c("tpm","Rho")])
# Note that states 1 and 2 get swapped.
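Because the labelling of the hidden states is arbitrary (see Warnings), it can be useful to relabel the states of a fitted model after the fact. The following sketch (not part of the package) permutes the states so that they are ordered by the probability of the first observed value; fit.num is mocked up here rather than produced by hmm().

```r
# Mock fitted components; in practice these come from hmm().
fit.num <- list(Rho = matrix(c(0.2, 0.8,    # column 1: state 1
                               0.7, 0.3),   # column 2: state 2
                             nrow = 2, ncol = 2),
                tpm = matrix(c(0.9, 0.1,
                               0.3, 0.7), nrow = 2, byrow = TRUE))
perm <- order(fit.num$Rho[1, ], decreasing = TRUE)  # states by P(y1 | state)
fit.num$Rho <- fit.num$Rho[, perm]      # permute emission columns
fit.num$tpm <- fit.num$tpm[perm, perm]  # permute tpm rows and columns
```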
