Bayesian survival model for right-censored data, using a sum of two hazard functions, each with a power-law dependence on time, corresponding to a Weibull distribution for the event density. (Note that the event density function of the mixture model is NOT itself a Weibull distribution.) Each component has its own shape and scale parameter, with each scale parameter being the exponential of a linear function of the covariates specified in formula1 and formula2. Stratification is implemented using a common set of intercepts shared by the two components. Lasso shrinkage, using a Laplace prior on the coefficients (Park and Casella 2008), allows for variable selection when the observation-to-variable ratio is low. The mixture model allows for time-dependent (and context-dependent) hazard ratios. Credible intervals for coefficient estimates and predictions are generated using the full Bayesian paradigm, i.e. by keeping all samples rather than summarizing them into a mean and standard deviation. The posterior distribution is estimated via MCMC sampling, using a univariate slice sampler with stepout and shrinkage (Neal 2003).
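As a rough illustration of the model structure described above, the following base-R sketch evaluates a sum of two power-law (Weibull-type) hazards and the corresponding survival function. The parameterization used here (hazard alpha * lambda * t^(alpha - 1), with lambda the exponential of a linear predictor), the helper names, and all numeric values are assumptions chosen for illustration; they are not taken from the package source.

```r
# Hypothetical helper: hazard of one Weibull-type component,
# h(t) = alpha * lambda * t^(alpha - 1), with lambda = exp(linear predictor).
weibull_hazard <- function(t, alpha, lp) alpha * exp(lp) * t^(alpha - 1)

# Cumulative hazard of one component: H(t) = lambda * t^alpha.
weibull_cumhaz <- function(t, alpha, lp) exp(lp) * t^alpha

# Mixture model: the hazards (and hence the cumulative hazards) add up.
mix_hazard <- function(t, alpha1, lp1, alpha2, lp2) {
  weibull_hazard(t, alpha1, lp1) + weibull_hazard(t, alpha2, lp2)
}
mix_survival <- function(t, alpha1, lp1, alpha2, lp2) {
  exp(-(weibull_cumhaz(t, alpha1, lp1) + weibull_cumhaz(t, alpha2, lp2)))
}

# Illustrative values: component 1 has shape < 1, component 2 has shape > 1.
t <- c(0.5, 1, 2, 4)

# Two hypothetical subjects that differ only in the component-2 linear predictor.
h_a <- mix_hazard(t, alpha1 = 0.7, lp1 = -1, alpha2 = 1.8, lp2 = -3)
h_b <- mix_hazard(t, alpha1 = 0.7, lp1 = -1, alpha2 = 1.8, lp2 = -2)

# The hazard ratio between the two subjects changes with time,
# unlike in a single-component Weibull model.
hazard_ratio <- h_b / h_a
print(round(hazard_ratio, 3))
```

In this sketch the ratio moves with t because the relative weight of the two components changes over time, which is the sense in which a two-component model can produce time-dependent hazard ratios.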
bayesmixsurv(formula1, data, formula2=formula1, stratCol=NULL, weights, subset
, na.action=na.fail, control=bayesmixsurv.control(), print.level=2)
bayesmixsurv.control(single=FALSE, alpha2.fixed=NULL, alpha.boundary=1.0, lambda1=1.0
, lambda2=lambda1, iter=1000, burnin=round(iter/2), sd.thresh=1e-4, scalex=TRUE
, nskip=round(iter/10))
# S3 method for bayesmixsurv
print(x, ...)
The function bayesmixsurv.control returns a list with the same elements as its input parameters. The function bayesmixsurv returns an object of class bayesmixsurv, with the following components:
The matched call
Same as input.
Same as input.
Same as input. *Not supported yet*
Same as input. *Not supported yet*
Same as input. *Not supported yet* (current behavior is na.fail)
Same as input.
Model matrix used for component 1, after potential centering and scaling.
Model matrix used for component 2, after potential centering and scaling.
Survival response variable (time and status) used in the model.
The contrasts used for component 1 (where relevant).
The contrasts used for component 2 (where relevant).
A record of the levels of the factors used in fitting for component 1 (where relevant).
A record of the levels of the factors used in fitting for component 2 (where relevant).
The terms object used for component 1.
The terms object used for component 2.
Names of columns for X1, also names of scale coefficients for component 1.
Names of columns for X2, also names of scale coefficients for component 2.
Index of columns of X1 where scaling has been applied.
Index of columns of X2 where scaling has been applied.
Vector of centering parameters for columns of X1 indicated by apply.scale.X1.
Vector of centering parameters for columns of X2 indicated by apply.scale.X2.
Vector of scaling parameters for columns of X1 indicated by apply.scale.X1.
Vector of scaling parameters for columns of X2 indicated by apply.scale.X2.
Model matrix associated with stratification (if any).
The contrasts used for stratification model matrix, if any.
A record of the levels of the factors used in stratification (if any).
The terms object used for stratification.
Names of columns for Xg.
Vector of indexes into X1 for which sampling occurred. All columns of X1 whose standard deviation falls below sd.thresh are excluded from sampling and their corresponding coefficients are clamped to 0.
Vector of indexes into X2 for which sampling occurred. All columns of X2 whose standard deviation falls below sd.thresh are excluded from sampling and their corresponding coefficients are clamped to 0.
List of median values, with elements including alpha1, alpha2 (shape parameters of components 1 and 2), beta1, beta2 (coefficients of the scale parameter for components 1 and 2), gamma (stratification intercept adjustments, shared by the two components), and sigma.gamma (standard deviation of the zero-mean Gaussian distribution that is the prior for the gamma's).
Currently, a list with one element, loglike, containing the maximum sampled log-likelihood of the model.
List of coefficient samples, with elements alpha1, alpha2 (shape parameters for components 1 and 2), beta1, beta2 (scale parameter coefficients for components 1 and 2), loglike (model log-likelihood), gamma (stratification intercept adjustments, shared by the two components), and sigma.gamma (standard deviation of the zero-mean Gaussian distribution that is the prior for the gamma's). Each parameter has iter samples. For vector parameters, the first dimension is the number of samples (iter), while the second dimension is the length of the vector.
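Because all samples are kept, summaries can be computed after the fact. The sketch below uses a simulated matrix as a stand-in for a matrix of coefficient samples (rows are samples, columns are coefficients, as described above). The object name beta_smp, the column names, and the simulated values are illustrative assumptions, not output of bayesmixsurv.

```r
set.seed(1)
iter <- 1000
burnin <- round(iter / 2)

# Stand-in for a sample matrix: iter rows (samples) by 3 columns (coefficients).
beta_smp <- cbind(age   = rnorm(iter, 0.3, 0.1),
                  prior = rnorm(iter, 0.0, 0.2),
                  trt   = rnorm(iter, -0.5, 0.15))

# Discard the burn-in samples before summarizing.
kept <- beta_smp[(burnin + 1):iter, , drop = FALSE]

# Posterior median and 95% equal-tailed interval for each coefficient.
post_summary <- t(apply(kept, 2, quantile, probs = c(0.025, 0.5, 0.975)))
colnames(post_summary) <- c("lower", "median", "upper")
print(round(post_summary, 3))
```

The same pattern applies to any per-sample quantity: compute it for every retained sample first, then take quantiles, rather than plugging point estimates into the formula.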
formula1: Survival formula expressing the time/status variables as well as the covariates used in the first component.
data: Data frame containing the covariates and response variable, as well as the stratification column.
formula2: Survival formula expressing the covariates used in the second component. No left-hand side is necessary since the response variable information is extracted from formula1. Defaults to formula1.
stratCol: Name of the column in data used for stratification. Must be a factor or coercible to one. Default is no stratification (stratCol=NULL).
weights: Optional vector of case weights. *Not supported yet*
subset: Subset of the observations to be used in the fit. *Not supported yet*
na.action: Missing-data filter function. *Not supported yet (only na.fail behavior works)*
control: See bayesmixsurv.control for a description of the parameters inside the control list.
print.level: Controls the verbosity level.
single: If TRUE, a single-component model, equivalent to Bayesian Weibull survival regression with Lasso shrinkage, is implemented. Default is FALSE, i.e. a two-component mixture-of-Weibull model.
alpha2.fixed: If provided, it specifies the shape parameter of the second component. Default is NULL, which allows the MCMC sampling to estimate both shape parameters.
alpha.boundary: When single=FALSE and alpha2.fixed=NULL, this parameter specifies an upper bound for the shape parameter of the first component, and a lower bound for the shape parameter of the second component. These boundary conditions are enforced in the univariate slice sampler function calls.
lambda1: Lasso shrinkage parameter used in the Laplace prior on the coefficients of the covariates in the first component.
lambda2: Lasso shrinkage parameter used in the Laplace prior on the coefficients of the covariates in the second component. Defaults to lambda1.
iter: Number of posterior MCMC samples to generate.
burnin: Number of initial MCMC samples to discard before calculating summary statistics.
sd.thresh: Threshold for the standard deviation of a covariate (after possible centering/scaling). If the standard deviation is below the threshold, the corresponding coefficient is removed from sampling, i.e. its value is clamped to zero.
scalex: If TRUE, each covariate vector is centered and scaled before model estimation. The scaling parameters are saved in the returned object and used in subsequent calls to the predict function. Users are strongly advised against turning this feature off, since the quality of Gibbs sampling MCMC is greatly enhanced by covariate centering and scaling.
nskip: Controls how often a progress report is printed during the MCMC run. For example, if nskip=10, progress will be reported after 10,20,30,... samples.
x: Object of class 'bayesmixsurv', usually the result of a call to bayesmixsurv.
...: Arguments to be passed to/from other methods.
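The interaction between scalex and sd.thresh described above can be pictured with the following base-R sketch. It is an illustration under assumptions, not the package's internal code: the helper name prepare_design, the use of scale(), and the handling of constant columns are all hypothetical.

```r
# Hypothetical helper: center/scale covariates and flag columns to sample.
prepare_design <- function(X, scalex = TRUE, sd.thresh = 1e-4) {
  sds <- apply(X, 2, sd)
  scalable <- which(sds > 0)           # constant columns cannot be scaled
  center <- colMeans(X)[scalable]
  scl <- sds[scalable]
  if (scalex) {
    X[, scalable] <- scale(X[, scalable, drop = FALSE],
                           center = center, scale = scl)
  }
  # Columns whose standard deviation (after any scaling) falls below the
  # threshold are excluded; their coefficients would stay fixed at zero.
  idx <- which(apply(X, 2, sd) >= sd.thresh)
  list(X = X, idx = idx, center = center, scale = scl)
}

# Toy design matrix with one constant column.
X <- cbind(age   = c(50, 60, 70, 80),
           const = c(1, 1, 1, 1),
           dose  = c(0.1, 0.4, 0.2, 0.3))
prep <- prepare_design(X)
print(prep$idx)
```

Keeping the centering and scaling vectors alongside the transformed matrix mirrors the idea that the same transformation has to be reapplied to new data at prediction time.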
Alireza S. Mahani, Mansour T.A. Sharabiani
Neal R.M. (2003). Slice Sampling. Annals of Statistics, 31, 705-767.
Park T. and Casella G. (2008). The Bayesian Lasso. Journal of the American Statistical Association, 103, 681-686.
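library(survival)      # provides Surv() and the veteran data set
library(BayesMixSurv)  # provides bayesmixsurv() and bayesmixsurv.control()
# NOTE: to ensure convergence, typically more than 100 samples are needed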
# fit the most general model, with two Weibull components and unspecified shape parameters
ret <- bayesmixsurv(Surv(time, status)~as.factor(trt)+age+as.factor(celltype)+prior, veteran
, control=bayesmixsurv.control(iter=100))
# fix one of the two shape parameters
ret2 <- bayesmixsurv(Surv(time, status)~as.factor(trt)+age+as.factor(celltype)+prior, veteran
, control=bayesmixsurv.control(iter=100, alpha2.fixed=1.0))