msm(formula, qmatrix, misc = FALSE, ematrix, inits, subject,
    covariates = NULL, constraint = NULL, misccovariates = NULL,
    miscconstraint = NULL, qconstraint = NULL, econstraint = NULL,
    covmatch = "previous", initprobs = NULL,
    data = list(), fromto = FALSE, fromstate, tostate, timelag,
    death = FALSE, censor = NULL, censor.states = NULL, exacttimes = FALSE,
    fixedpars = NULL, ...)
states ~ times
See fromto
for an alternative way to specify the
data. Observed states should
qmatrix
should have $(r,s)$ entry 1, otherwise
it should have $(r,s)$ entry 0. The diagonal of qmatrix
misc = TRUE
if misclassification between
observed and underlying states is to be modelled.
misc == TRUE
) Matrix of indicators for the allowed
misclassifications.
The rows represent underlying states, and the columns represent
observed states.
If an observation of state $s$ is possible when the subject
- transition intensities (reading across first rows of intensity matrix, then second row ... )
- covariate effects on log transition inte
formula
. If missing, then all observations
are assumed to be on the same subject. These must be sorted so that
all observations on the same subject are adjac ~ age + sex + treatment
constraint = lis
covariates
.constraint
.qconstraint = c(1,2,3,3)
constrains the third and fourth intensities to be equal, in a model with four allowed instantaneous
"previous"
, then time-dependent covariate
values are taken from the observation at the start of the
transition. If "next"
, then the covariate value is taken from
the end of the transition.
c(1, rep(0, nstates-1))
, that is, in state 1 with a
probability oTRUE
, then the data are given as three vectors,
from-state, to-state, time-difference,
representing the set of observed transitions between states, and the
time taken by each one. Otherwise, the data are given by fromto == TRUE
).
fromto == TRUE
).
fromstate
and tostate
(required if fromto == TRUE
).
censor=999
,
indicates that all observations of 999 in thcensor
is a single number (the default) this
can be a vector, or a list with one element. If censor
is a
vector with more than one elemenexacttimes
is
set to TRUE
, then the observation times are assumed to
represent the
inits
vector, whose order is
specified above. This can be useful for building complex moptim
. Useful options include
method="BFGS"
for using a quasi-Newton optimisation
algorithm, which can often be faster
msm
, with components:
logbaseline
, is a matrix containing the estimated
transition intensities on the log scale with any covariates fixed at
their means in the data. Each remaining component is a matrix giving the linear
effects of the labelled covariate on the matrix of log
intensities. To extract an estimated intensity matrix on the natural
scale, at an arbitrary combination of covariate values, use the
function qmatrix.msm
.
Qmatrices
.
logitbaseline
, is the estimated misclassification probability
matrix with any covariates fixed at their means
in the data. Each remaining component is a matrix giving the linear
effects of the labelled covariate on the matrix of logit
misclassification probabilities. To extract an estimated misclassification
probability matrix on the natural scale, at an arbitrary combination
of covariate values, use the function ematrix.msm
.
Ematrices
.
mean
= estimated mean sojourn times in the transient states,
with covariates fixed at their means.
se
= corresponding standard errors.
optim
, with intensities on the log scale
and misclassification probabilities on the logit scale.
estimates
.
For models without misclassification, the likelihood is calculated in terms of the transition intensity matrix $Q$. When the data consist of observations of the Markov process at arbitrary times, the exact transition times are not known. Then the likelihood is calculated using the transition probability matrix $P(t) = \exp(tQ)$. If state $i$ is observed at time $t$ and state $j$ is observed at time $u$, then the contribution to the likelihood from this pair of observations is the $(i,j)$ element of $P(u - t)$. See, for example, Kay (1986), or Gentleman et al. (1994).
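As a rough illustration of the likelihood contribution described above, the following sketch computes $P(u - t) = \exp((u - t)Q)$ in base R and reads off a single entry. The helper `mat_exp` and the intensity values in `Q` are assumptions made for this example; `mat_exp` is a simple scaling-and-squaring Taylor approximation, not necessarily the routine msm uses internally.

```r
# Matrix exponential by scaling and squaring with a truncated Taylor series.
# Hand-rolled for illustration only (an assumption, not msm's own code).
mat_exp <- function(A, squarings = 10, terms = 20) {
  n <- nrow(A)
  B <- A / 2^squarings            # scale down so the series converges quickly
  result <- diag(n)
  term <- diag(n)
  for (k in seq_len(terms)) {
    term <- term %*% B / k        # accumulates B^k / k!
    result <- result + term
  }
  for (i in seq_len(squarings)) {
    result <- result %*% result   # exp(A) = exp(A / 2^s)^(2^s)
  }
  result
}

# Illustrative intensity matrix for a progressive four-state model.
# The numeric values are made up; each row sums to zero.
Q <- rbind(c(-0.001, 0.001, 0,    0),
           c(0,     -0.03,  0.03, 0),
           c(0,      0,    -0.3,  0.3),
           c(0,      0,     0,    0))

# State 1 observed at time t = 2 and state 2 observed at time u = 5:
# the likelihood contribution is the (1,2) element of P(u - t).
P <- mat_exp((5 - 2) * Q)
contribution <- P[1, 2]
```

Each row of `P` sums to one, and the final row stays at `(0, 0, 0, 1)` because state 4 is absorbing in this illustrative `Q`.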
For models with misclassification, the likelihood for an individual with $k$ observations is calculated by summing over the unknown state at each time, producing a product of $k$ matrices. The calculation, adapted from Satten and Longini (1996), is given by Jackson and Sharples (2002).
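The matrix product can be sketched as follows, in notation that is assumed here rather than taken from the cited papers. Let $\pi$ be the row vector of initial state probabilities, let $E$ be the misclassification matrix whose $(r,s)$ entry is the probability of observing state $s$ when the underlying state is $r$, let $D(o)$ be the diagonal matrix holding column $o$ of $E$, and let $o_1, \ldots, o_k$ be the states observed at times $t_1 < \cdots < t_k$. A standard hidden Markov form of the individual's likelihood contribution is then:

```latex
L = \pi \, D(o_1) \prod_{j=2}^{k} \Bigl[ P(t_j - t_{j-1}) \, D(o_j) \Bigr] \mathbf{1},
\qquad P(t) = \exp(tQ),
```

where $\mathbf{1}$ is a column vector of ones. The cited papers may arrange the factors differently.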
There must be enough information in the data on each state to estimate each transition rate; otherwise the likelihood will be flat and the maximum will not be found. It may be appropriate to reduce the number of states in the model, or reduce the number of covariate effects, to ensure convergence.
Choosing an appropriate set of initial values for the optimisation can also be important. For flat likelihoods, 'informative' initial values will often be required.
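One crude way to arrive at starting values, sketched below, is to treat each observed change of state as if it happened exactly at the observation time. The rate for a transition is then the number of observed moves divided by the total follow-up time that began in the originating state. The vectors `from`, `to` and `dt` are made-up toy data in from-state, to-state, time-difference form, and `crude_inits` is a hypothetical name. This is a heuristic under those assumptions, not a documented feature of the function.

```r
# Toy from-state / to-state / time-lag vectors (made-up values for illustration).
from <- c(1, 1, 2, 1, 2, 3, 2, 3)
to   <- c(1, 2, 2, 2, 3, 3, 3, 4)
dt   <- c(2, 1, 3, 2, 1, 2, 2, 1)

nstates <- 4
from_f <- factor(from, levels = 1:nstates)
to_f   <- factor(to,   levels = 1:nstates)

# Number of observed r -> s moves, and total follow-up time starting in state r.
counts <- table(from_f, to_f)
time_in_state <- tapply(dt, from_f, sum)

# Crude rates for the allowed progressive transitions 1->2, 2->3, 3->4,
# in the order the intensities are read across the rows of the matrix.
crude_inits <- as.numeric(c(counts[1, 2] / time_in_state[1],
                            counts[2, 3] / time_in_state[2],
                            counts[3, 4] / time_in_state[3]))
```

A vector like `crude_inits` could then be supplied as `inits`. Because the true transition times are unknown for data observed at arbitrary times, these values are only a rough starting point.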
Satten, G.A. and Longini, I.M. (1996). Markov chains with measurement error: estimating the 'true' course of a marker of the progression of human immunodeficiency virus disease (with discussion). Applied Statistics 45(3): 275-309.
simmulti.msm
, print.msm
, plot.msm
,
summary.msm
, qmatrix.msm
,
pmatrix.msm
, sojourn.msm
,
data(aneur)
print(aneur$from[1:10])
print(aneur$to[1:10])
print(aneur$dt[1:10])
### four states corresponding to increasing disease severity,
### with progressive transitions only
qmat <- rbind(c(0, 1, 0, 0),
              c(0, 0, 1, 0),
              c(0, 0, 0, 1),
              c(0, 0, 0, 0))
aneurysm.msm <- msm(data = aneur, fromto = TRUE, fromstate = from, tostate = to,
                    qmatrix = qmat, timelag = dt, inits = c(0.001, 0.03, 0.3),
                    method = "BFGS", control = list(trace = 2))
print(aneurysm.msm)
qmatrix.msm(aneurysm.msm) # Extract only the transition intensities from the printed results
pmatrix.msm(aneurysm.msm, 10) # Extract the 10 year transition probability matrix
sojourn.msm(aneurysm.msm) # Extract the mean sojourn times