msm: Multi-state Markov and hidden Markov models in continuous time

Description

Fit a continuous-time Markov or hidden Markov multi-state model by maximum likelihood. Observations of the process can be made at arbitrary times, or the exact times of transition between states can be known. Covariates can be fitted to the Markov chain transition intensities or to the hidden Markov observation process.

Usage

msm ( formula, subject=NULL, data = list(), qmatrix, gen.inits = FALSE,
      ematrix=NULL, hmodel=NULL, obstype=NULL, obstrue=NULL,
      covariates = NULL, covinits = NULL, constraint = NULL,
      misccovariates = NULL, misccovinits = NULL, miscconstraint = NULL,
      hcovariates = NULL, hcovinits = NULL, hconstraint = NULL, hranges=NULL,
      qconstraint=NULL, econstraint=NULL, initprobs = NULL,
      est.initprobs=FALSE, initcovariates = NULL, initcovinits = NULL,
      death = FALSE, exacttimes = FALSE, censor=NULL,
      censor.states=NULL, pci=NULL, cl = 0.95, fixedpars = NULL, center=TRUE,
      opt.method="optim", hessian=NULL, use.deriv=TRUE,
      use.expm=TRUE, analyticp=TRUE, na.action=na.omit, ... )

Arguments

formula

A formula giving the vectors containing the observed states and the corresponding observation times. For example,

state ~ time

Observed states should be in the set 1, ..., n, where n is the number o

subject

Vector of subject identification numbers for the data specified by formula. If missing, then all observations are assumed to be on the same subject. These must be sorted so that all observations on the same subject are adjacent.
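
If the data are not already ordered, one way to meet this requirement is to sort by subject before fitting; sorting by time within subject as well keeps each subject's observations in chronological order. A minimal base R sketch, in which the data frame dat and its columns id, time and state are hypothetical names used only for illustration:

```r
# Hypothetical unsorted data: observations on subjects 1 and 2 are interleaved
dat <- data.frame(id    = c(2, 1, 2, 1, 1),
                  time  = c(0, 0, 1.5, 2, 1),
                  state = c(1, 1, 2, 3, 2))

# Sort by subject, then by observation time within subject
dat <- dat[order(dat$id, dat$time), ]

# Check that each subject's rows now form a single contiguous block
runs <- rle(dat$id)
stopifnot(!anyDuplicated(runs$values))
```

The sorted data frame could then be passed as data, with subject = id.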

data

Optional data frame in which to interpret the variables supplied in formula, subject, covariates, misccovariates, hcovariates, obstype and obstrue.

qmatrix

Matrix which indicates the allowed transitions in the continuous-time Markov chain, and optionally also the initial values of those transitions. If an instantaneous transition is not allowed from state $r$ to state $s$, then qmatrix

gen.inits

If TRUE, then initial values for the transition intensities are generated automatically using the method in crudeinits.msm. The non-zero entries of the supplied qmatrix

ematrix

If misclassification between states is to be modelled, this should be a matrix of initial values for the misclassification probabilities. The rows represent underlying states, and the columns represent observed states. If an observation of

hmodel

Specification of the hidden Markov model. This should be a list of return values from the constructor functions described in the hmm-dists help page. Each element of the list corresponds to

obstype

A vector specifying the observation scheme for each row of the data. This can be included in the data frame data along with the state, time, subject IDs and covariates. Its elements should be either 1, 2 or 3, meaning as follows

obstrue

A vector of logicals (TRUE or FALSE) or numerics (1 or 0) specifying which observations (TRUE, 1) are observations of the underlying state without error, and which (FALSE, 0) are realisations of a

covariates

A formula or a list of formulae representing the covariates on the transition intensities via a log-linear model. If a single formula is supplied, like

covariates = ~ age + sex + treatment

then these covariates are assumed t

covinits

Initial values for log-linear effects of covariates on the transition intensities. This should be a named list with each element corresponding to a covariate. A single element contains the initial values for that covariate on each transition

constraint

A list of one numeric vector for each named covariate. The vector indicates which covariate effects on intensities are constrained to be equal. Take, for example, a model with five transition intensities and two covariates. Specifying constrai

misccovariates

A formula representing the covariates on the misclassification probabilities, analogously to covariates, via multinomial logistic regression. Only used if the model is specified using ematrix, rather than hmodel

misccovinits

Initial values for the covariates on the misclassification probabilities, defined in the same way as covinits. Only used if the model is specified using ematrix.

miscconstraint

A list of one vector for each named covariate on misclassification probabilities. The vector indicates which covariate effects on misclassification probabilities are constrained to be equal, analogously to constraint. Only use

hcovariates

List of formulae the same length as hmodel, defining any covariates governing the hidden Markov outcome models. The covariates operate on a suitably link-transformed linear scale, for example, log scale for a Poisson outcome mode

hcovinits

Initial values for the hidden Markov model covariate effects. A list of the same length as hcovariates. Each element is a vector with initial values for the effect of each covariate on that state. For example, the above hco

hconstraint

A named list. Each element is a vector of constraints on the named hidden Markov model parameter. The vector has length equal to the number of times that class of parameter appears in the whole model.

For example consider the three-state hidd

hranges

Range constraints for hidden Markov model parameters. Supplied as a named list, with each element corresponding to the named hidden Markov model parameter. This element is itself a list with two elements, vectors named "lower" and "uppe

qconstraint

A vector of indicators specifying which baseline transition intensities are equal. For example,

qconstraint = c(1,2,3,3)

constrains the third and fourth intensities to be equal, in a model with four allowed instantaneous tra

econstraint

A similar vector of indicators specifying which baseline misclassification probabilities are constrained to be equal. Only used if the model is specified using ematrix, rather than hmodel.

initprobs

Only used in hidden Markov models. Underlying state occupancy probabilities at each subject's first observation. Can either be a vector of $nstates$ elements with common probabilities to all subjects, or a $nsubjects$ by $nstates$ matrix of

est.initprobs

Only used in hidden Markov models. If TRUE, then the underlying state occupancy probabilities at the first observation will be estimated, starting from a vector of initial values supplied in the initprobs argument

initcovariates

Formula representing covariates on the initial state occupancy probabilities, via multinomial logistic regression. The linear effects of these covariates, observed at the individual's first observation time, operate on the log ratio of the st

initcovinits

Initial values for the covariate effects initcovariates. A named list with each element corresponding to a covariate, as in covinits. Each element is a vector with (number of states - 1) elements, containing the init

death

Vector of indices of absorbing states whose time of entry is known exactly, but the individual is assumed to be in an unknown transient state ("alive") at the previous instant. This is the usual situation for times of death in chronic disease

censor

A state, or vector of states, which indicates censoring. Censoring means that the observed state is known only to be one of a particular set of states. For example, censor=999 indicates that all observations of 999 i

censor.states

Specifies the underlying states which censored observations can represent. If censor is a single number (the default) this can be a vector, or a list with one element. If censor is a vector with more than one element

pci

Model for piecewise-constant intensities. Vector of cut points defining the times, since the start of the process, at which intensities change for all subjects. For example

pci = c(5, 10)

specifies that the intensity c

exacttimes

By default, the transitions of the Markov process are assumed to take place at unknown occasions in between the observation times. If exacttimes is set to TRUE, then the observation times are assumed to represent the

cl

Width of symmetric confidence intervals for maximum likelihood estimates, by default 0.95.

fixedpars

Vector of indices of parameters whose values will be fixed at their initial values during the optimisation. These are given in the order: transition intensities (reading across rows of the transition matrix), covariates on intensities (ordered

center

If TRUE (the default, unless fixedpars=TRUE) then covariates are centered at their means during the maximum likelihood estimation. This usually improves stability of the numerical optimisation.

opt.method

If "optim", "nlm" or "bobyqa", then the corresponding R function will be used for maximum likelihood estimation. optim is the default. "bobyqa" requires the package minqa to be install

hessian

If TRUE then standard errors and confidence intervals are obtained from a numerical estimate of the Hessian (the observed information matrix). This is the default when maximum likelihood estimation is performed. If all parameter

use.deriv

If TRUE then analytic first derivatives are used in the optimisation of the likelihood, when an appropriate quasi-Newton optimisation method, such as BFGS, is being used. Note that the default for

use.expm

If TRUE then any matrix exponentiation needed to calculate the likelihood is done using the expm package. Otherwise the original routines used in msm 1.2.4 and earlier are used. Set to FALSE for

analyticp

By default, the likelihood for certain simpler 3, 4 and 5 state models is calculated using an analytic expression for the transition probability (P) matrix. For all other models, matrix exponentiation is used to obtain P. To revert to the orig

na.action

What to do with missing data: either na.omit to drop it and carry on, or na.fail to stop with an error. Missing data includes all NAs in the states, times, subject or obstrue, all NAs at the

...

Optional arguments to the general-purpose R optimisation routine, optim by default. Useful options for optim include method="BFGS" for using a qu

Value

To obtain summary information from models fitted by the msm function, it is recommended to use extractor functions such as qmatrix.msm, pmatrix.msm, sojourn.msm and msm.form.qoutput. These provide estimates and confidence intervals for quantities such as transition probabilities for given covariate values.

For advanced use, it may be necessary to directly use information stored in the object returned by msm. This is documented in the help page msm.object.

Printing an msm object by typing the object's name at the command line implicitly invokes print.msm. This formats and prints the important information in the model fit, and also returns that information in an R object. This includes estimates and confidence intervals for the transition intensities and (log) hazard ratios for the corresponding covariates. When there is a hidden Markov model, the chief information in the hmodel component is also formatted and printed. This includes estimates and confidence intervals for each parameter.

Details

For full details about the methodology behind the msm package, refer to the PDF manual msm-manual.pdf in the doc subdirectory of the package. This includes a tutorial in the typical use of msm. The paper by Jackson (2011) in Journal of Statistical Software presents the material in this manual in a more concise form.

msm was designed for fitting continuous-time Markov models, processes where transitions can occur at any time. These models are defined by intensities, which govern both the time spent in the current state and the probabilities of the next state. In discrete-time models, transitions are known in advance to occur only at multiples of some time unit, and the model is governed purely by the probability distribution of the state at the next time point, conditionally on the state at the current time. Discrete-time models can also be fitted in msm, assuming that there is a continuous-time process underlying the data. Then the fitted transition probability matrix over one time period, as returned by pmatrix.msm(...,t=1), is equivalent to the matrix that governs the discrete-time model. However, discrete-time models can be fitted more efficiently using multinomial logistic regression, for example, using multinom from the R package nnet (Venables and Ripley, 2002).
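
As a small illustration of how the intensities govern both quantities, the standard results for a continuous-time Markov chain are that the mean sojourn time in a transient state $r$ is $-1/q_{rr}$, and the probability that the next state is $s$ is $-q_{rs}/q_{rr}$. The base R sketch below applies these to an illustrative intensity matrix. The values are arbitrary starting values, not estimates from any data, and the variable names are chosen here for illustration; for a fitted model, sojourn.msm reports estimated mean sojourn times.

```r
# Illustrative intensity matrix: rows sum to zero, state 4 is absorbing
Q <- rbind(c(-0.5,    0.25,   0,     0.25),
           c( 0.166, -0.498,  0.166, 0.166),
           c( 0,      0.25,  -0.5,   0.25),
           c( 0,      0,      0,     0))

transient <- diag(Q) < 0   # absorbing states have a zero diagonal entry

# Mean time spent in each transient state before leaving it
mean_sojourn <- -1 / diag(Q)[transient]

# Probability that each transient state (rows) is followed by each state (columns)
next_state <- -Q[transient, , drop = FALSE] / diag(Q)[transient]
next_state[cbind(seq_len(sum(transient)), which(transient))] <- 0

print(mean_sojourn)
print(next_state)
```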

For simple continuous-time multi-state Markov models, the likelihood is calculated in terms of the transition intensity matrix $Q$. When the data consist of observations of the Markov process at arbitrary times, the exact transition times are not known. Then the likelihood is calculated using the transition probability matrix $P(t) = \exp(tQ)$, where $\exp$ is the matrix exponential. If state $i$ is observed at time $t$ and state $j$ is observed at time $u$, then the contribution to the likelihood from this pair of observations is the $i,j$ element of $P(u - t)$. See, for example, Kalbfleisch and Lawless (1985), Kay (1986), or Gentleman et al. (1994).
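
The calculation above can be sketched in base R. The helper mat_exp below is a hypothetical, deliberately simple matrix exponential (scaling and squaring with a truncated Taylor series), written only to keep the example self-contained; it is not the routine msm uses, and the intensity matrix and observation times are arbitrary illustrative values.

```r
# Hypothetical helper: matrix exponential by scaling and squaring,
# with a truncated Taylor series for the scaled matrix. Illustration only.
mat_exp <- function(A, squarings = 10, terms = 20) {
  A <- A / 2^squarings
  result <- diag(nrow(A))
  term <- diag(nrow(A))
  for (k in seq_len(terms)) {
    term <- term %*% A / k          # A^k / k!
    result <- result + term
  }
  for (i in seq_len(squarings)) {
    result <- result %*% result     # undo the scaling
  }
  result
}

# Illustrative intensity matrix (not fitted to data)
Q <- rbind(c(-0.5,    0.25,   0,     0.25),
           c( 0.166, -0.498,  0.166, 0.166),
           c( 0,      0.25,  -0.5,   0.25),
           c( 0,      0,      0,     0))

# Transition probability matrix over an interval of length t
P <- function(t) mat_exp(t * Q)

# Likelihood contribution from observing state i = 1 at time t = 2
# and state j = 2 at time u = 3.5: the (1, 2) element of P(u - t)
contrib <- P(3.5 - 2)[1, 2]
print(contrib)
```

Each row of P(t) sums to one, and the full likelihood for a subject is the product of such contributions over consecutive pairs of observations.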

For hidden Markov models, the likelihood for an individual with $k$ observations is calculated directly by summing over the unknown state at each time, producing a product of $k$ matrices. The calculation is a generalisation of the method described by Satten and Longini (1996), and also by Jackson and Sharples (2002), and Jackson et al. (2003).
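
As a sketch of this calculation in the simplest case (no death states, censoring or exactly observed transition times, all of which need modified terms), write $S_j$ for the unknown underlying state and $y_j$ for the observation at time $t_j$, and let $R$ be the number of states. The symbols $f$, $T_j$ and $\mathbf{1}$ are notation introduced here for illustration. The likelihood contribution for the individual can then be written as

$$ L = f \, T_2 T_3 \cdots T_k \, \mathbf{1}, $$

where $f$ is a row vector with $r$th element $\Pr(S_1 = r)\,\Pr(y_1 \mid S_1 = r)$, $T_j$ is the $R \times R$ matrix with $(r,s)$ element $p_{rs}(t_j - t_{j-1})\,\Pr(y_j \mid S_j = s)$, $p_{rs}(t)$ is the $(r,s)$ element of $P(t) = \exp(tQ)$, and $\mathbf{1}$ is a column vector of $R$ ones.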

There must be enough information in the data on each state to estimate each transition rate, otherwise the likelihood will be flat and the maximum will not be found. It may be appropriate to reduce the number of states in the model, the number of allowed transitions, or the number of covariate effects, to ensure convergence. Hidden Markov models, and situations where the value of the process is only known at a series of snapshots, are particularly susceptible to non-identifiability, especially when combined with a complex transition matrix. Choosing an appropriate set of initial values for the optimisation can also be important. For flat likelihoods, 'informative' initial values will often be required. See the PDF manual for other tips.
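
One informal check on sensitivity to initial values is to fit the same model from two different starting points and compare the maximised likelihoods. The sketch below assumes the msm package and its cav data are installed, and reuses the model from the Examples section; the object names q.user, fit.user and fit.auto are chosen here for illustration. Similar values of minus twice the log-likelihood suggest, but do not prove, that both runs reached the same maximum.

```r
if (requireNamespace("msm", quietly = TRUE)) {
  library(msm)

  # User-supplied initial values for a four-state model with absorbing state 4
  q.user <- rbind(c(-0.5,    0.25,   0,     0.25),
                  c( 0.166, -0.498,  0.166, 0.166),
                  c( 0,      0.25,  -0.5,   0.25),
                  c( 0,      0,      0,     0))

  # Fit 1: start from the user-supplied values
  fit.user <- msm(state ~ years, subject = PTNUM, data = cav,
                  qmatrix = q.user, death = 4)

  # Fit 2: start from automatically generated values (gen.inits = TRUE)
  fit.auto <- msm(state ~ years, subject = PTNUM, data = cav,
                  qmatrix = q.user, death = 4, gen.inits = TRUE)

  # Compare minus twice the maximised log-likelihood from the two runs
  print(c(user = fit.user$minus2loglik, auto = fit.auto$minus2loglik))
}
```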

References

Jackson, C.H. (2011). Multi-State Models for Panel Data: The msm Package for R. Journal of Statistical Software, 38(8): 1--29. URL http://www.jstatsoft.org/v38/i08/.

Kalbfleisch, J. and Lawless, J.F. (1985). The analysis of panel data under a Markov assumption. Journal of the American Statistical Association, 80(392): 863--871.

Kay, R. (1986). A Markov model for analysing cancer markers and disease states in survival studies. Biometrics, 42: 855--865.

Gentleman, R.C., Lawless, J.F., Lindsey, J.C. and Yan, P. (1994). Multi-state Markov models for analysing incomplete disease history data with illustrations for HIV disease. Statistics in Medicine, 13(3): 805--821.

Satten, G.A. and Longini, I.M. (1996). Markov chains with measurement error: estimating the 'true' course of a marker of the progression of human immunodeficiency virus disease (with discussion). Applied Statistics, 45(3): 275--309.

Jackson, C.H. and Sharples, L.D. (2002). Hidden Markov models for the onset and progression of bronchiolitis obliterans syndrome in lung transplant recipients. Statistics in Medicine, 21(1): 113--128.

Jackson, C.H., Sharples, L.D., Thompson, S.G., Duffy, S.W. and Couto, E. (2003). Multi-state Markov models for disease progression with classification error. The Statistician, 52(2): 193--209.

Venables, W.N. and Ripley, B.D. (2002). Modern Applied Statistics with S, second edition. Springer.

Examples

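### Heart transplant data
### For further details and background to this example, see
### Jackson (2011) or the PDF manual in the doc directory.
library(msm)
print(cav[1:10,])
### Initial values for the intensities of a four-state model.
### Zeros mark transitions that are not allowed; state 4 is absorbing.
twoway4.q <- rbind(c(-0.5, 0.25, 0, 0.25), c(0.166, -0.498, 0.166, 0.166),
                   c(0, 0.25, -0.5, 0.25), c(0, 0, 0, 0))
### Table of transitions between successive observed states
statetable.msm(state, PTNUM, data=cav)
### Crude initial values for the transition intensities
crudeinits.msm(state ~ years, PTNUM, data=cav, qmatrix=twoway4.q)
### Fit the model; control is passed on to optim to trace the optimisation
cav.msm <- msm( state ~ years, subject=PTNUM, data = cav,
                 qmatrix = twoway4.q, death = 4,
                 control = list ( trace = 2, REPORT = 1 )  )
cav.msm
### Fitted intensities, transition probabilities over t=10, and mean sojourn times
qmatrix.msm(cav.msm)
pmatrix.msm(cav.msm, t=10)
sojourn.msm(cav.msm)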
