draic.msm(msm.full, msm.coarse, likelihood.only=FALSE,
          information=c("expected","observed"), tl=0.95)
drlcv.msm(msm.full, msm.coarse, tl=0.95, cores=NULL,
          verbose=TRUE, outfile=NULL)
The two models must be fitted to the same datasets, except that the state space of the coarse model must be an aggregated version of the state space of the full model. That is, every state in the full dataset must correspo
drlcv for cross-validation by parallel processing. Requires the
cores is set t
$D_{RAIC}$ (draic.msm) or $D_{RLCV}$ (drlcv.msm), its component terms, and tracking intervals.
$$D_{RAIC} = l(\gamma_n \mid \mathbf{x}'') - l(\theta_n \mid \mathbf{x}'') + \operatorname{trace}\left( J(\theta_n \mid \mathbf{x}'') J(\theta_n \mid \mathbf{x})^{-1} - J(\gamma_n \mid \mathbf{x}'') J(\gamma_n \mid \mathbf{x}')^{-1} \right)$$
where $\gamma_n$ and $\theta_n$ are the maximum likelihood estimates of the smaller and bigger models, fitted to the smaller and bigger data, respectively.
$l(\gamma_n \mid \mathbf{x}'')$ represents the likelihood of the simpler model evaluated on the restricted data.
$l(\theta_n \mid \mathbf{x}'')$ represents the likelihood of the complex model evaluated on the restricted data. This is a hidden Markov model, with a misclassification matrix and initial state occupancy probabilities as described by Thom et al. (2014).
$J(\cdot)$ are the corresponding information matrices (expected or observed, as specified by the user).
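The trace correction in $D_{RAIC}$ can be illustrated with a minimal base R sketch. The information matrices below are made-up numbers, not output from any fitted msm model, and the helper `trace_ratio` is a hypothetical name introduced only for this illustration.

```r
# Illustrative (made-up) information matrices; not taken from a real fit.
J_theta_restricted <- matrix(c(4, 1, 1, 3), nrow = 2)  # J(theta_n | x'')
J_theta_full       <- matrix(c(5, 1, 1, 4), nrow = 2)  # J(theta_n | x)
J_gamma_restricted <- matrix(2, nrow = 1)              # J(gamma_n | x'')
J_gamma_coarse     <- matrix(2, nrow = 1)              # J(gamma_n | x')

# Hypothetical helper: trace of A %*% B^{-1}
trace_ratio <- function(A, B) sum(diag(A %*% solve(B)))

# Trace term of D_RAIC: bigger-model product minus smaller-model product
trace_term <- trace_ratio(J_theta_restricted, J_theta_full) -
  trace_ratio(J_gamma_restricted, J_gamma_coarse)

print(trace_term)
```

In this sketch the two smaller-model matrices are identical, so their product is the identity and its trace is simply the number of parameters in the smaller model (here 1).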
$\mathbf{x}$ is the expanded data, to which the bigger model was originally fitted, and $\mathbf{x}'$ is the data to which the smaller model was originally fitted. $\mathbf{x}''$ is the restricted data which the two models have in common. $\mathbf{x}'' = \mathbf{x}'$ in this implementation, so the models are nested.
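As a rough illustration of how the expanded and restricted data relate, the base R sketch below aggregates a state vector from a finer state space into a coarser one and checks that each full state maps to exactly one coarse state. The state vector and the mapping `full_to_coarse` are hypothetical and chosen only for this example.

```r
# Hypothetical observed states under a 4-state full model
full_state <- c(1, 2, 3, 4, 2, 1)

# Assumed aggregation: full states 1 and 2 merge into coarse state 1,
# full state 3 becomes coarse state 2, full state 4 becomes coarse state 3
full_to_coarse <- c(`1` = 1, `2` = 1, `3` = 2, `4` = 3)

# Look up the coarse state for every observation
coarse_state <- unname(full_to_coarse[as.character(full_state)])

# Every full state must map to a single coarse state
mapping <- unique(data.frame(full = full_state, coarse = coarse_state))
is_aggregation <- !any(duplicated(mapping$full))

print(coarse_state)
print(is_aggregation)
```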
The difference in likelihood cross-validatory criteria (Liquet and Commenges, 2011) is defined as
$$D_{RLCV} = \frac{1}{n} \sum_{i=1}^n \log\left( \frac{h_{X''}(x_i'' \mid \gamma_{-i})}{g_{X''}(x_i'' \mid \theta_{-i})} \right)$$
where $\gamma_{-i}$ and $\theta_{-i}$ are the maximum likelihood estimates from the smaller and bigger models fitted to datasets with subject $i$ left out, $h()$ and $g()$ are the densities of the corresponding models, and $x_i''$ is the restricted data from subject $i$.
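Given per-subject leave-one-out log densities, the average above reduces to a mean of differences. The following base R sketch uses made-up values for four subjects; it does not refit any model and is not how `drlcv.msm` obtains these quantities internally.

```r
# Made-up leave-one-out log densities of each subject's restricted data.
# log h(x_i'' | gamma_{-i}): smaller model
log_h <- c(-3.2, -4.1, -2.8, -3.9)
# log g(x_i'' | theta_{-i}): bigger model
log_g <- c(-3.0, -4.3, -2.9, -3.6)

# log(h / g) = log h - log g, averaged over the n subjects
d_rlcv <- mean(log_h - log_g)

print(d_rlcv)
```

Working on the log scale avoids forming the ratio of two very small densities directly.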
Tracking intervals are analogous to confidence intervals, but not strictly the same, since the quantity that $D_{RAIC}$ aims to estimate, the difference in expected Kullback-Leibler discrepancy for predicting a replicate dataset, depends on the sample size. See the references.
Liquet, B. and Commenges, D. (2011) Choice of estimators based on different observations: Modified AIC and LCV criteria. Scandinavian Journal of Statistics, 38, 268-287.
logLik.msm