msm ( formula, subject=NULL, data = list(), qmatrix, gen.inits = FALSE,
ematrix=NULL, hmodel=NULL, obstype=NULL, obstrue=NULL,
covariates = NULL, covinits = NULL, constraint = NULL,
misccovariates = NULL, misccovinits = NULL, miscconstraint = NULL,
hcovariates = NULL, hcovinits = NULL, hconstraint = NULL,
qconstraint=NULL, econstraint=NULL, initprobs = NULL,
est.initprobs=FALSE, initcovariates = NULL, initcovinits = NULL,
death = FALSE, exacttimes = FALSE, censor=NULL,
censor.states=NULL, cl = 0.95, fixedpars = NULL, center=TRUE,
opt.method=c("optim","nlm"), hessian=TRUE, use.deriv=FALSE,
deriv.test=FALSE, analyticp=TRUE, ... )
state ~ time
Observed states should be in the set 1, ..., n
,
where n
is the numb
formula
. If missing, then all observations
are assumed to be on the same subject. These must be sorted so that
all observations on the same subject are adjacent.formula
, subject
, covariates
,
misccovariates
, hcovariates
, obstype
and
obsmisc
.qmatrix
should have
$(r,s)$ entry 0, otherwise it should be non-zero. Any
diagonal entTRUE
, then initial values for the
transition intensities are estimated by assuming that the data
represent the exact transition times of the process. The non-zero
entries of the supplied qmatrix
are assumed to indihmm-dists
help
page. Each element of the list corresponds data
along with the state, time, subject IDs and covariates. Its
elements should be either 1, 2 or 3, meaning as followsTRUE
or FALSE
) or
numerics (1 or 0) specifying which observations (TRUE
, 1) are
observations of the underlying state without error, and which
(FALSE
, 0) are realisations of a ~ age + sex + treatment
constrai
covariates
,
via multinomial logistic regression. Only used if the model is
specified using ematrix
, rather than hmode
covinits
.
Only used if the model is specified using ematrix
.constraint
.
Only usehmodel
, defining any covariates
governing the hidden Markov outcome models. The covariates operate
on a suitably link-transformed linear scale, for example, log scale
for a Poisson outcome modehcovariates
. Each element is a vector with
initial values for the effect of each covariate on that state. For example, the
above hco
For example consider the three-state hid
qconstraint = c(1,2,3,3)
constrains the third and fourth intensities to be equal, in a model with four allowed instantaneous
ematrix
, rather
than hmodel
.est.initprobs
), then this defaults to equal probability foTRUE
, then the underlying state
occupancy probabilities at the first observation will be estimated,
starting from initial values taken from the initprobs
argument. Structural zeroes are allowed: if any of these ininitcovariates
. A named list with each element corresponding
to a covariate, as in covinits
. Each element is a vector with
(1 - number of states) elements, containing the initcensor=999
indicates that all observations of 999
icensor
is a single number (the default) this
can be a vector, or a list with one element. If censor
is a
vector with more than one elementexacttimes
is
set to TRUE
, then all observation times are assumed to
representTRUE
(the default) then covariates are centered at
their means during the maximum likelihood estimation. This usually
improves convergence.optim
is the default.TRUE
(the default) then the Hessian matrix is
computed at the maximum likelihood estimates, to obtain standard
errors and confidence intervals.TRUE
then analytic first derivatives are
used in the optimisation of the likelihood, when an appropriate
quasi-Newton optimisation method, such as BFGS, is being used. Note
that the default for
TRUE
, then analytic and numeric derivatives are
computed and compared at the initial values, and no optimisation is
performed.analy
msm
, with components:msm
.logbaseline
, is a matrix containing the estimated
transition intensities on the log scale with any covariates fixed at
their means in the data. Each remaining component is a matrix giving the linear
effects of the labelled covariate on the matrix of log
intensities. To extract an estimated intensity matrix on the natural
scale, at an arbitrary combination of covariate values, use the
function qmatrix.msm
.Qmatrices
.cl
argument.logitbaseline
, is the estimated misclassification probability
matrix (expressed as as log odds relative to the probability of the
true state) with any covariates fixed at their means in the data. Each
remaining component is a matrix giving the linear
effects of the labelled covariate on the matrix of logit
misclassification probabilities. To extract an estimated misclassification
probability matrix on the natural scale, at an arbitrary combination
of covariate values, use the function ematrix.msm
.Ematrices
.cl
argument. mean
= estimated mean sojourn times in the transient states,
with covariates fixed at their means.
se
= corresponding standard errors.
optim
. Transition intensities are on
the log scale and misclassification probabilities are given as log
odds relative to the probability of the true state.estimates
which were fixed during
the maximum likelihood estimation.estimates
.estimates.t
optim
or nlm
,
giving information about the results of the optimisation.msm
object by typing the object's name at the
command line implicitly invokes print.msm
. This
formats and prints the important information in the model fit.
This includes the fitted transition intensity matrix, matrices
containing covariate effects on intensities, and mean sojourn times
from a fitted msm
model. When there is a hidden Markov
model, the chief information in the
hmodel
component is also formatted and printed. This includes
estimates and confidence intervals for each
parameter. To extract summary information from the fitted model, it is
recommended to use the more flexible extractor functions, such as
qmatrix.msm
, pmatrix.msm
,
sojourn.msm
, instead of directly reading from list
components of msm
objects.
Note that multinom
from the
R package
For simple continuous-time multi-state Markov models, the likelihood is calculated in terms of the transition intensity matrix $Q$. When the data consist of observations of the Markov process at arbitrary times, the exact transition times are not known. Then the likelihood is calculated using the transition probability matrix $P(t) = \exp(tQ)$, where $\exp$ is the matrix exponential. If state $i$ is observed at time $t$ and state $j$ is observed at time $u$, then the contribution to the likelihood from this pair of observations is the $i,j$ element of $P(u - t)$. See, for example, Kalbfleisch and Lawless (1985), Kay (1986), or Gentleman et al. (1994).
For hidden Markov models, the likelihood for an individual with $k$ observations is calculated directly by summing over the unknown state at each time, producing a product of $k$ matrices. The calculation is a generalisation of the method described by Satten and Longini (1996), and also by Jackson and Sharples (2002), and Jackson et al. (2003).
There must be enough information in the data on each state to estimate each transition rate, otherwise the likelihood will be flat and the maximum will not be found. It may be appropriate to reduce the number of states in the model, the number of allowed transitions, or the number of covariate effects, to ensure convergence. Hidden Markov models, and situations where the value of the process is only known at a series of snapshots, are particularly susceptible to non-identifiability, especially when combined with a complex transition matrix.
Choosing an appropriate set of initial values for the optimisation can
also be important. For flat likelihoods, 'informative' initial values
will often be required.
Users upgrading from versions of qmatrix
and covinits
arguments instead of inits
, and qmatrix
is no longer a
matrix of 0/1
indicators. See the appendix to the PDF manual
or the NEWS file in the top-level installation directory for a full
list of changes.
Jackson, C.H., Sharples, L.D., Thompson, S.G. and Duffy, S.W. and Couto, E. Multi-state Markov models for disease progression with classification error. The Statistician, 52(2): 193--209 (2003)
Venables, W.N. and Ripley, B.D. (2002) Modern Applied Statistics with S, second edition. Springer.
simmulti.msm
, plot.msm
,
summary.msm
, qmatrix.msm
,
pmatrix.msm
, sojourn.msm
.### Heart transplant data
### For further details and background to this example, see
### the PDF manual in the doc directory.
data(heart)
print(heart[1:10,])
twoway4.q <- rbind(c(-0.5, 0.25, 0, 0.25), c(0.166, -0.498, 0.166, 0.166),
c(0, 0.25, -0.5, 0.25), c(0, 0, 0, 0))
statetable.msm(state, PTNUM, data=heart)
crudeinits.msm(state ~ years, PTNUM, data=heart, qmatrix=twoway4.q)
heart.msm <- msm( state ~ years, subject=PTNUM, data = heart,
qmatrix = twoway4.q, death = 4,
control = list ( trace = 2, REPORT = 1 ) )
heart.msm
qmatrix.msm(heart.msm)
pmatrix.msm(heart.msm, t=10)
sojourn.msm(heart.msm)
Run the code above in your browser using DataLab