bayessurvreg2: Cluster-specific accelerated failure time model for multivariate, possibly doubly-interval-censored data. The error distribution is expressed as a penalized univariate normal mixture with a high number of components (G-spline). The distribution of the vector of random effects is multivariate normal.

Description

A function to estimate a regression model with possibly clustered (possibly right, left, interval or doubly-interval censored) data. In the case of doubly-interval censoring, different regression models can be specified for the onset and event times.

(Multivariate) random effects, normally distributed and acting as in the linear mixed model, can be included to adjust for clusters.

The error density of the regression model is specified as a mixture of Bayesian G-splines (normal densities with equidistant means and constant variances). This function performs MCMC sampling from the posterior distribution of the unknown quantities.

For details, see Komárek (2006), and Komárek, Lesaffre and Legrand (2007).

We first explain in more detail a model without double censoring. Let $T_{i,l},\; i=1,\dots, N,\; l=1,\dots, n_i$ be event times for the $i$th cluster and the units within that cluster. The following regression model is assumed: $$\log(T_{i,l}) = \beta'x_{i,l} + b_i'z_{i,l} + \varepsilon_{i,l},\quad i=1,\dots, N,\;l=1,\dots, n_i$$ where $\beta$ is an unknown regression parameter vector, $x_{i,l}$ is a vector of covariates, $b_i$ is a (multivariate) cluster-specific random effect vector and $z_{i,l}$ is a vector of covariates for random effects.
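The regression model above can be illustrated by simulating from it. The following is only a sketch under stated assumptions: a single covariate in $x_{i,l}$, a random intercept ($z_{i,l} = 1$), and a plain normal error in place of the G-spline density. All numeric values and variable names are hypothetical and do not come from the package.

```r
## Sketch: simulate uncensored event times from
## log(T_il) = beta * x_il + b_i * z_il + eps_il.
## All values below are illustrative assumptions.
set.seed(1)

N   <- 5                                  # number of clusters
n_i <- 4                                  # units per cluster (balanced here)
cluster <- rep(seq_len(N), each = n_i)    # cluster index of each unit

x <- rnorm(N * n_i)                       # covariate with a fixed effect
z <- rep(1, N * n_i)                      # random intercept design

beta   <- 0.5                             # fixed-effect coefficient
beta_b <- 1.0                             # mean of the random effect
D      <- 0.09                            # variance of the random effect

b   <- rnorm(N, mean = beta_b, sd = sqrt(D))  # one random effect per cluster
eps <- rnorm(N * n_i, sd = 0.2)               # normal error (simplification)

log_time   <- beta * x + b[cluster] * z + eps
event_time <- exp(log_time)
```

Because the random intercept carries its own mean `beta_b`, no separate intercept appears among the fixed effects in this sketch.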

The random effect vectors $b_i,\;i=1,\dots, N$ are assumed to be i.i.d. with a (multivariate) normal distribution with the mean $\beta_b$ and a covariance matrix $D$. Hierarchical centring (see Gelfand, Sahu, Carlin, 1995) is used, i.e. $\beta_b$ expresses the average effect of the covariates included in $z_{i,l}$. Note that covariates included in $z_{i,l}$ must not also be included in the covariate vector $x_{i,l}$. The covariance matrix $D$ is assigned an inverse Wishart prior distribution in the next level of hierarchy. The error terms $\varepsilon_{i,l},\;i=1,\dots, N,\; l=1,\dots, n_i$ are assumed to be i.i.d. with a univariate density $g_{\varepsilon}(e)$. This density is expressed as a mixture of Bayesian G-splines (normal densities with equidistant means and constant variances). We distinguish two, theoretically equivalent, specifications.
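To make the phrase "normal densities with equidistant means and constant variances" concrete, the sketch below evaluates such a mixture density on a grid. The number of components, the knot distance `delta`, the basis standard deviation `sigma` and the weights `w` are hypothetical values chosen for illustration; they are not the parametrization or the defaults used by the package.

```r
## Sketch: a normal mixture with equidistant means and a common variance.
## K, delta, sigma and w are illustrative assumptions.
K     <- 15
delta <- 0.3                       # distance between neighbouring means
mu    <- (-K:K) * delta            # 2K + 1 equidistant means
sigma <- 0.2                       # common standard deviation

w <- dnorm(mu)                     # arbitrary positive values ...
w <- w / sum(w)                    # ... normalised to sum to one

## Mixture density g(e) = sum_j w_j * dnorm(e, mu_j, sigma)
g_eps <- function(e) {
  as.vector(dnorm(outer(e, mu, "-"), sd = sigma) %*% w)
}

## Numerical check that g integrates to (approximately) one
h    <- 0.01
grid <- seq(-10, 10, by = h)
area <- sum(g_eps(grid)) * h
```

With many narrow components on a fine grid, the shape of the density is driven almost entirely by the weights, which is what makes the mixture flexible.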

Personally, I found Specification 2 to perform better. In the paper by Komárek, Lesaffre and Legrand (2007), only Specification 2 is described.

The mixture weights $w_{j},\;j=-K,\dots, K$ are not estimated directly. To avoid the constraints $0 < w_{j} < 1$ and $\sum_{j=-K}^{K}\,w_j = 1$, transformed weights $a_{j},\;j=-K,\dots, K$, related to the original weights by the logistic transformation $$w_{j} = \frac{\exp(a_{j})}{\sum_{m=-K}^{K}\exp(a_{m})},$$ are estimated instead.
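A short sketch of the reparametrization: unconstrained values $a_j$ are mapped to weights $w_j = \exp(a_j) / \sum_m \exp(a_m)$, which automatically satisfy both constraints. The vector `a` below is an arbitrary illustration, and subtracting `max(a)` is a standard numerical safeguard rather than something taken from the package.

```r
## Sketch: map unconstrained a_j to mixture weights w_j.
a <- c(-1, 0, 2, 0.5, -0.3)        # arbitrary unconstrained values

## Subtracting max(a) leaves the ratio unchanged but avoids overflow
ea <- exp(a - max(a))
w  <- ea / sum(ea)

## Adding a constant to every a_j gives the same weights
ea_shift <- exp((a + 10) - max(a + 10))
w_shift  <- ea_shift / sum(ea_shift)
```

The last two lines show that the $a_j$ are identified only up to an additive constant, so one of them is typically fixed or otherwise constrained in practice.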

A Bayesian model is set up for all unknown parameters. For more details I refer to Komárek (2006) and to Komárek, Lesaffre, and Legrand (2007). If there are doubly-censored data, a model of the same type as above can be specified for both the onset time and the time-to-event.

Usage

bayessurvreg2(formula, random, formula2, random2,
   data = parent.frame(),
   na.action = na.fail, onlyX = FALSE,
   nsimul = list(niter = 10, nthin = 1, nburn = 0, nwrite = 10),
   prior, prior.beta, prior.b, init = list(iter = 0),
   mcmc.par = list(type.update.a = "slice", k.overrelax.a = 1,
                   k.overrelax.sigma = 1, k.overrelax.scale = 1),
   prior2, prior.beta2, prior.b2, init2,
   mcmc.par2 = list(type.update.a = "slice", k.overrelax.a = 1,
                    k.overrelax.sigma = 1, k.overrelax.scale = 1),
   store = list(a = FALSE, a2 = FALSE, y = FALSE, y2 = FALSE,
                r = FALSE, r2 = FALSE, b = FALSE, b2 = FALSE), 
   dir = getwd())

Arguments

formula

model formula for the regression. In the case of doubly-censored data, this is the model formula for the onset time.

The left-hand side of the formula must be an object created using

random

formula for the 'random' part of the model, i.e. the part that specifies the covariates $z_{i,l}$. In the case of doubly-censored data, this is the random formula for the onset time.

If omitted, no random part is included in

formula2

model formula for the regression of the time-to-event in the case of doubly-censored data. Ignored otherwise. The same structure as for formula applies here.

random2

specification of the 'random' part of the model for time-to-event in the case of doubly-censored data. Ignored otherwise. The same structure as for random applies here.

data

optional data frame in which to interpret the variables occurring in the formula, formula2, random, random2 statements.

na.action

the user is discouraged from changing the default value na.fail.

onlyX

if TRUE, no MCMC sampling is performed and only the design matrix (matrices) is returned. This can be useful for correctly setting up priors for regression parameters in the presence of factor covariates.

nsimul

a list giving the number of iterations of the MCMC and other parameters of the simulation. The list has the components niter, nthin, nburn and nwrite (see Usage for the default values).

prior

a list specifying the prior distribution of the G-spline defining the distribution of the error term in the regression model given by formula and random. See prior argument of

prior.b

a list defining the way in which the random effects involved in formula and random are to be updated and the specification of priors for parameters related to these random effects. The list is assumed to have the

prior.beta

prior specification for the regression parameters, in the case of doubly-censored data for the regression parameters of the onset time, i.e. it is related to formula and random. Note that the beta vec

init

an optional list with initial values for the MCMC related to the model given by formula and random. The default is list(iter = 0).

mcmc.par

a list specifying how some of the G-spline parameters related to the distribution of the error term from formula are to be updated. See bayesBisurvreg for more details.

prior2

a list specifying the prior distribution of the G-spline defining the distribution of the error term in the regression model given by formula2 and random2. See prior argument of

prior.b2

prior specification for the parameters related to the random effects from formula2 and random2. This should be a list with the same structure as prior.b.

prior.beta2

prior specification for the regression parameters of time-to-event in the case of doubly-censored data (related to formula2 and random2). This should be a list with the same structure as prior.beta.

init2

an optional list with initial values for the MCMC related to the model given by formula2 and random2. The list has the same structure as init.

mcmc.par2

a list specifying how some of the G-spline parameters related to formula2 are to be updated. The list has the same structure as mcmc.par.

store

a list of logical values specifying which chains that are not stored by default are to be stored. The list can have the components a, a2, y, y2, r, r2, b and b2, each of which is FALSE by default (see Usage).

dir

a string that specifies a directory where all sampled values are to be stored.

Value

A list of class bayessurvreg2 containing information concerning the initial values and prior choices.

References

Gelfand, A. E., Sahu, S. K., and Carlin, B. P. (1995). Efficient parametrisations for normal linear mixed models. Biometrika, 82, 479-488.

Komárek, A. (2006). Accelerated Failure Time Models for Multivariate Interval-Censored Data with Flexible Distributional Assumptions. PhD thesis, Katholieke Universiteit Leuven, Faculteit Wetenschappen.

Komárek, A., Lesaffre, E., and Legrand, C. (2007). Baseline and treatment effect heterogeneity for survival times between centers using a random effects accelerated failure time model with flexible error distribution. Statistics in Medicine, 26, 5457-5472.

Examples


## See the description of R commands for
## the model with EORTC data,
## analysis described in Komarek, Lesaffre and Legrand (2007).
##
## R commands available in the documentation
## directory of this package
## as ex-eortc.R and
## http://www.karlin.mff.cuni.cz/~komarek/software/bayesSurv/ex-eortc.pdf
##
