prevalence.msm: Tables of observed and expected prevalences

Description

This provides a rough indication of the goodness of fit of a multi-state model, by estimating the observed numbers of individuals occupying each state at a series of times, and comparing these with forecasts from the fitted model.

Usage

prevalence.msm(x, times=NULL, timezero=NULL, initstates=NULL, covariates="mean",
               misccovariates="mean", piecewise.times=NULL, piecewise.covariates=NULL,
               ci=c("none","normal","bootstrap"), cl=0.95, B=1000,
               interp=c("start","midpoint"), plot=FALSE)

Arguments

x

A fitted multi-state model produced by msm.

times

Series of times at which to compute the observed and expected prevalences of states.

timezero

Initial time of the Markov process. Expected values are forecast from this time. Defaults to the minimum of the observation times given in the data.

initstates

Optional vector of the same length as the number of states. Gives the numbers of individuals occupying each state at the initial time, to be used for forecasting expected prevalences. The default is those observed in the data. These should ad

covariates

Covariate values for which to forecast expected state occupancy. See qmatrix.msm. Defaults to the mean values of the covariates in the data set.

misccovariates

(Misclassification models only) Values of covariates on the misclassification probability matrix for which to forecast expected state occupancy. Defaults to the mean values of the covariates in the data set.

piecewise.times

Times at which piecewise-constant intensities change. See pmatrix.piecewise.msm for how to specify this.

piecewise.covariates

Covariates on which the piecewise-constant intensities depend. See pmatrix.piecewise.msm for how to specify this.

ci

If "normal", then calculate a confidence interval for the expected prevalences by simulating B random vectors from the asymptotic multivariate normal distribution implied by the maximum likelihood estimates (and covar

cl

Width of the symmetric confidence interval, relative to 1.

B

Number of bootstrap replicates.

interp

Suppose an individual was observed in states $S_{r-1}$ and $S_r$ at two consecutive times $t_{r-1}$ and $t_r$, and we want to estimate 'observed' prevalences at a time $t$ between $t_{r-1}$ and $t_r$.

If interp="start", then individu

plot

Generate a plot of observed against expected prevalences. See plot.prevalence.msm.
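
The two interp rules can be sketched in base R. This is an illustration only: state_at is a hypothetical helper, not an msm function, and it assumes that "start" carries the earlier state forward to time t, while "midpoint" switches to the later state once t passes the midpoint of the two observation times. It also ignores any special handling of deaths or censoring.

```r
# Hypothetical helper (not part of msm): the state assigned to one individual
# at time t, given that individual's observation times and observed states.
state_at <- function(t, obs_times, obs_states, interp = c("start", "midpoint")) {
  interp <- match.arg(interp)
  n_obs <- length(obs_times)
  # Outside the observed follow-up period: no state is assigned in this sketch
  if (t < obs_times[1] || t > obs_times[n_obs]) return(NA)
  # Index of the last observation at or before t
  r_prev <- findInterval(t, obs_times)
  # t coincides with the final observation: use the observed state directly
  if (r_prev == n_obs) return(obs_states[r_prev])
  # Assumed "start" rule: keep the state seen at the earlier observation
  if (interp == "start") return(obs_states[r_prev])
  # Assumed "midpoint" rule: earlier state up to the midpoint, later state after
  midpoint <- (obs_times[r_prev] + obs_times[r_prev + 1]) / 2
  if (t <= midpoint) obs_states[r_prev] else obs_states[r_prev + 1]
}

# Made-up observation history for a single individual
obs_times  <- c(0, 2, 5)
obs_states <- c(1, 2, 3)

state_at(3, obs_times, obs_states, "start")
state_at(3, obs_times, obs_states, "midpoint")
state_at(4, obs_times, obs_states, "midpoint")
```

Tabulating such per-individual states over all individuals at each requested time gives the kind of count that an observed prevalence table contains.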

Value

A list of matrices, with components:

Observed

Table of observed numbers of individuals in each state at each time.

Observed percentages

Corresponding percentage of the individuals at risk at each time.

Expected

Table of corresponding expected numbers.

Expected percentages

Corresponding percentage of the individuals at risk at each time.

Or if ci is "normal" or "bootstrap", the component Expected is a list with components estimates and ci. estimates is a matrix of the expected prevalences, and ci is a list of two matrices, containing the confidence limits. The component Expected percentages has a similar format.

concept

Goodness of fit

Details

The fitted transition probability matrix is used to forecast expected prevalences from the state occupancy at the initial time. To produce the expected number in state $j$ at time $t$ after the start, the number of individuals under observation at time $t$ (including those who have died, but not those lost to follow-up) is multiplied by the $j$th element of the product of the vector of proportions of individuals in each state at the initial time and the transition probability matrix for a time interval of length $t$. The proportion of individuals in each state at the "initial" time is estimated, if necessary, in the same way as the observed prevalences.
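
The calculation above can be sketched in base R. All numbers are made up for illustration: Q is a hypothetical three-state intensity matrix, p0 the assumed initial proportions, and n_at_risk the assumed number under observation at time t. mat_exp is a hypothetical helper (scaling and squaring with a truncated Taylor series), used here only to avoid a package dependency. It is not the routine msm uses.

```r
# Hypothetical helper: matrix exponential by scaling and squaring
mat_exp <- function(A, squarings = 10, terms = 20) {
  A <- A / 2^squarings
  result <- diag(nrow(A))
  term <- diag(nrow(A))
  for (k in seq_len(terms)) {
    term <- term %*% A / k          # k-th Taylor term, A^k / k!
    result <- result + term
  }
  for (i in seq_len(squarings)) {
    result <- result %*% result     # undo the scaling by repeated squaring
  }
  result
}

# Made-up intensity matrix: state 3 is absorbing (e.g. death)
Q <- rbind(c(-0.3,  0.2, 0.1),
           c( 0.0, -0.4, 0.4),
           c( 0.0,  0.0, 0.0))

p0        <- c(0.8, 0.2, 0)   # assumed proportions in each state at time zero
t         <- 2                # time since the start
n_at_risk <- 100              # assumed number under observation at time t

P <- mat_exp(Q * t)                           # transition probabilities over t
expected <- n_at_risk * as.vector(p0 %*% P)   # expected number in each state
expected
```

Because each row of the transition probability matrix sums to one, the expected numbers sum to the number under observation.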

For misclassification models (fitted using an ematrix), this aims to assess the fit of the full model for the observed states, that is, the combined Markov progression model for the true states and the misclassification model. Thus, expected prevalences of true states are estimated from the assumed proportion occupying each state at the initial time using the fitted transition probability matrix. The vector of expected prevalences of true states is then multiplied by the fitted misclassification probability matrix to obtain the expected prevalences of observed states.
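
As a small worked illustration of the final step, the base R sketch below multiplies a vector of expected true-state prevalences by a misclassification matrix. The values of expected_true and E are invented for the example, with row r of E giving the assumed probabilities of observing each state when the true state is r.

```r
# Made-up expected numbers in each true state at some time t
expected_true <- c(50, 30, 20)

# Made-up misclassification matrix: E[r, s] = P(observed state s | true state r).
# State 3 is assumed to be observed without error.
E <- rbind(c(0.90, 0.10, 0.00),
           c(0.05, 0.90, 0.05),
           c(0.00, 0.00, 1.00))

# Expected numbers in each observed state
expected_observed <- as.vector(expected_true %*% E)
expected_observed
```

The total is unchanged by this step, since each row of E sums to one. Only the allocation across states moves.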

For general hidden Markov models, the observed state is taken to be the predicted underlying state from the Viterbi algorithm (viterbi.msm). The goodness of fit of these states to the underlying Markov model is tested.

Note that this function assumes intensities are the same for all individuals. By default they are taken from the mean values of the covariates in the population. Piecewise-constant intensities may be assumed, through the arguments piecewise.times and piecewise.covariates. For an example of this approach, see Gentleman et al. (1994).
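
A typical call might look like the sketch below. It assumes that the msm package is installed and uses its bundled cav data set. The initial intensity values in twoway4.q are illustrative starting values for a four-state model in which state 4 is death, and the object names are arbitrary.

```r
library(msm)

# Illustrative initial intensities; zeros mark transitions that are not allowed
twoway4.q <- rbind(c(-0.5,    0.25,   0,     0.25),
                   c( 0.166, -0.498,  0.166, 0.166),
                   c( 0,      0.25,  -0.5,   0.25),
                   c( 0,      0,      0,     0))

# Fit the multi-state model, with exact death times in state 4
cav.msm <- msm(state ~ years, subject = PTNUM, data = cav,
               qmatrix = twoway4.q, deathexact = 4)

# Observed and expected prevalences every two years up to year 20
prev <- prevalence.msm(cav.msm, times = seq(0, 20, 2))
names(prev)
prev$Observed
prev$Expected
```

Large discrepancies between the observed and expected tables at particular times suggest where the fitted model describes the data poorly.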

References

Gentleman, R.C., Lawless, J.F., Lindsey, J.C. and Yan, P. Multi-state Markov models for analysing incomplete disease history data with illustrations for HIV disease. Statistics in Medicine (1994) 13(3): 805–821.