rhmm: Simulate discrete data from a non-parametric hidden Markov model.

Description

Simulates one or more replicates of discrete data from a model such as is fitted by the function hmm().

Usage

rhmm(model,...,nsim,verbose=FALSE)
# S3 method for default
rhmm(model, ..., nsim=1, verbose=FALSE, ylengths,
                       nafrac=NULL, fep=NULL, tpm, Rho, ispd=NULL, yval=NULL,
                       drop=TRUE, forceNumeric=TRUE)
# S3 method for hmm.discnp
rhmm(model, ..., nsim=1, verbose=FALSE, inMiss=TRUE,
                          fep=NULL, drop=TRUE, forceNumeric=TRUE)

Value

If nsim>1 or drop is FALSE then the value returned is a list of length nsim. Each entry of this list is in turn a list of the same length as ylengths, each component of which is an independent vector or matrix of simulated observations. The length or number of rows of component i of this list is equal to ylengths[i]. The values of the observations are entries of yval, or entries of its components when yval is a list.

If nsim=1 and drop is TRUE then the (“outer”) list described above is replaced by its first and only entry.

If the length of ylengths is 1 and drop is TRUE then each “inner” list described above is replaced by its first and only entry.
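The nesting rules above can be illustrated with a small base R sketch. It does not call rhmm(); drop_singletons() and fake_sims() are hypothetical helpers invented here to mimic the documented shape of the returned value, and the "data" are just random integers.

```r
# Hypothetical helper (not part of hmm.discnp): mimics the documented
# effect of 'drop' on a nested list whose outer level indexes replicates
# and whose inner level indexes observation sequences.
drop_singletons <- function(sims, drop = TRUE) {
    if (!drop) return(sims)
    # Replace each inner list of length 1 by its only entry.
    sims <- lapply(sims, function(inner) {
        if (length(inner) == 1) inner[[1]] else inner
    })
    # Replace the outer list by its only entry when there is one replicate.
    if (length(sims) == 1) sims[[1]] else sims
}

# Made-up stand-in for simulated data: nsim replicates, each holding
# one integer sequence per entry of ylengths.
fake_sims <- function(nsim, ylengths) {
    lapply(seq_len(nsim), function(i) {
        lapply(ylengths, function(n) sample(1:3, n, replace = TRUE))
    })
}

set.seed(1)
a  <- drop_singletons(fake_sims(nsim = 1, ylengths = 10))         # a plain vector
b  <- drop_singletons(fake_sims(nsim = 1, ylengths = c(10, 20)))  # list of 2 vectors
c3 <- drop_singletons(fake_sims(nsim = 3, ylengths = 10))         # list of 3 vectors
d  <- drop_singletons(fake_sims(nsim = 3, ylengths = 10),
                      drop = FALSE)                               # list of 3 lists
```

Calling str() on each of a, b, c3 and d shows how the two levels of nesting collapse independently.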

Arguments

model: An object of class hmm.discnp. This will have the form of a list specifying a hidden Markov model with discrete emissions and emission probabilities specified non-parametrically, i.e. by means of some form of table or tables. Usually this will be an object returned by hmm(). This argument is ignored by the default method.
...: Not used.
nsim: Integer scalar; the number of data sets to be simulated.
verbose: Logical scalar. If TRUE then the overall index of the simulated value that has been reached is printed out every 1000 iterations. Useful for reassurance when very “large” simulations are undertaken.
ylengths: Integer-valued vector specifying the lengths (or numbers of rows in the bivariate setting) of the individual observation sequences constituting a data set.
nafrac: See misstify() for an explanation of this argument. If specified a fraction nafrac[[j]] of column j of the data will be randomly set equal to NA.
fep: “First entry present”. See misstify() for an explanation of this argument.
tpm: The transition probability matrix for the underlying hidden Markov chain(s). Note that the rows of tpm must sum to 1. Ignored if ncol(Rho)==1. Ignored by the hmm.discnp method, which extracts it from model.
Rho: An object specifying the probability distribution of the observations, given the state of the underlying hidden Markov chain (i.e. the “emission” probabilities). See hmm(). Note that Rho can be such that the number of states is 1, in which case the simulated data are i.i.d. from the single distribution specified by Rho. Ignored by the hmm.discnp method, which extracts it from model.
ispd: A vector specifying the initial state probability distribution of the chain. If this is not specified it is taken to be the stationary distribution of the chain, calculated from tpm. Ignored by the hmm.discnp method, which extracts it from model.
yval: Vector of possible values of the observations, or (in the bivariate setting) a list of two such vectors. If not supplied it is formed from the levels of the factor constituting the y column of Rho (univariate case) or from appropriate dimension names associated with Rho (bivariate case). Ignored by the hmm.discnp method.
drop: Logical scalar; if TRUE then lists of length 1 are replaced by their first entry. In particular if nsim is 1 and if drop is TRUE then the list to be returned by this function (see Value) is replaced by its first and only entry. Also if ylengths is of length 1 (so that each entry of the returned value contains only a single sequence of simulated observations) then each list of such sequences is replaced by its first and only entry.
inMiss: Logical scalar; if TRUE then missing values will be randomly inserted into the data in the fraction nafrac determined from model.
forceNumeric: Logical scalar; if TRUE then if all of the possible values of the observations can be interpreted as numeric (by as.numeric()) then they are so interpreted. That is, the value returned will consist of a collection of numeric sequences, rather than a collection of sequences of values of categorical variables.
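
The roles of tpm, Rho, ispd, yval and ylengths in the univariate case can be sketched in base R as follows. This is an illustration of the general mechanism only, not the code used by rhmm(). It assumes that Rho is a plain matrix with one row per possible observation value and one column per state (each column summing to 1), which may differ from the form of Rho that hmm() works with; stationary() and sim_one() are hypothetical helper names.

```r
# Illustrative sketch only; NOT the hmm.discnp implementation.
# Assumption: Rho is a matrix, rows = observation values, columns = states.

# Stationary distribution: solve pi %*% tpm = pi subject to sum(pi) = 1.
stationary <- function(tpm) {
    K <- nrow(tpm)
    A <- rbind(t(diag(K) - tpm), rep(1, K))
    b <- c(rep(0, K), 1)
    # Least-squares solution of the overdetermined but consistent system.
    p <- pmax(as.vector(qr.solve(A, b)), 0)
    p / sum(p)
}

# Simulate one observation sequence of length n (n >= 1).
sim_one <- function(n, tpm, Rho, ispd = NULL, yval = seq_len(nrow(Rho))) {
    K <- ncol(Rho)
    m <- nrow(Rho)
    # A single state means i.i.d. draws from the one emission distribution.
    if (K == 1) {
        return(yval[sample.int(m, n, replace = TRUE, prob = Rho[, 1])])
    }
    # Default initial distribution: the stationary distribution of the chain.
    if (is.null(ispd)) ispd <- stationary(tpm)
    state <- integer(n)
    state[1] <- sample.int(K, 1, prob = ispd)
    for (i in seq_len(n)[-1]) {
        # The next state is drawn from the row of tpm for the current state.
        state[i] <- sample.int(K, 1, prob = tpm[state[i - 1], ])
    }
    # Each observation is drawn from the column of Rho for the current state.
    idx <- vapply(state,
                  function(s) sample.int(m, 1, prob = Rho[, s]),
                  integer(1))
    yval[idx]
}

set.seed(42)
tpm <- matrix(c(0.9, 0.1,
                0.2, 0.8), nrow = 2, byrow = TRUE)   # rows sum to 1
Rho <- matrix(c(0.7, 0.1,
                0.2, 0.3,
                0.1, 0.6), nrow = 3, byrow = TRUE)   # columns sum to 1
ylengths <- c(50, 80)
# One simulated sequence per entry of ylengths.
sim <- lapply(ylengths, sim_one, tpm = tpm, Rho = Rho,
              yval = c("lo", "mid", "hi"))
```

For this tpm, stationary(tpm) is (2/3, 1/3), which is the value that would stand in for ispd when none is supplied.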

Author

Rolf Turner r.turner@auckland.ac.nz

Examples

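# To do: one or more bivariate examples.
if (FALSE) {
    # Two observation sequences, taken from the deciles columns of two data sets.
    y <- list(linLandFlows$deciles, ftLiardFlows$deciles)
    # Fit a three-state hidden Markov model to both sequences.
    fit <- hmm(y, K=3)
    # Simulate a single data set from the fitted model (nsim defaults to 1).
    simX <- rhmm(fit)
}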