This function generates survival data according to the simulation scenarios considered in Section 4 of Wu, J., and Witten, D. (2019), Flexible and interpretable models for survival data. The Cox model has the form
$$ \lambda(t \mid x) = \lambda_0(t) \exp\left(\sum_{j=1}^p f_j(x_j)\right). $$
Failure time is generated from a Weibull distribution with baseline hazard $$\lambda_0(t) = \text{scale} \cdot \text{shape} \cdot t^{\text{shape}-1}$$. In the paper, however, failure time is generated from a simplified Weibull distribution: an exponential(1) baseline hazard, corresponding to shape=1 and scale=1. Censoring time is generated independently from an exponential distribution with intensity censoring.rate. The observed time is the minimum of the failure time and the censoring time. Each scenario has four covariates that have some non-linear association with the outcome. There is also the option to generate a user-specified number of covariates that have no association with the outcome.
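The mechanism described above can be sketched in base R by inverse-transform sampling. With the stated baseline hazard, the cumulative baseline hazard is scale * t^shape, so solving exp(-scale * t^shape * exp(eta)) = U for t gives t = (-log(U) / (scale * exp(eta)))^(1/shape). This is an illustrative sketch, not the package's implementation: the helper name sim_cox_weibull, the argument eta (standing in for the sum of the f_j terms), the toy function f(x) = x^2, and the output column names are all assumptions made for the example.

```r
# Sketch only: NOT the code used by sim_dat().
# Assumption: eta plays the role of sum_j f_j(x_j) in the Cox model.
sim_cox_weibull <- function(eta, scale = 1, shape = 1, censoring.rate = 0.01) {
  n <- length(eta)
  # Cumulative baseline hazard: Lambda_0(t) = scale * t^shape.
  # Invert S(t | x) = exp(-Lambda_0(t) * exp(eta)) at a uniform draw.
  u <- runif(n)
  failure <- (-log(u) / (scale * exp(eta)))^(1 / shape)
  # Independent exponential censoring with rate censoring.rate.
  censor <- rexp(n, rate = censoring.rate)
  data.frame(
    time = pmin(failure, censor),           # observed time
    status = as.integer(censor < failure)   # 1 = censored, 0 = failure
  )
}

set.seed(123)
x <- runif(200, -1, 1)              # one toy covariate
toy <- sim_cox_weibull(eta = x^2)   # hypothetical f(x) = x^2
head(toy)
mean(toy$status)                    # empirical censoring proportion
```

Raising censoring.rate increases the share of censored observations, since censoring times become shorter relative to failure times.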
sim_dat(n, zerof=0, scenario=1, scale=1, shape=1, censoring.rate=0.01, n.discrete=0)
n: Number of observations.
scenario: Simulation scenario. Options are 1, 2, 3, 4. Scenario 1 corresponds to piecewise constant functions, scenario 2 corresponds to smooth functions, scenario 3 corresponds to piecewise linear functions, and scenario 4 corresponds to functions that have varying degrees of smoothness. Each scenario has four covariates that have some non-linear association with the outcome.
zerof: Number of additional covariates that have no association with the outcome. The total number of covariates is 4+zerof.
scale: Scale parameter, as in rweibull.
shape: Shape parameter, as in rweibull.
censoring.rate: Censoring intensity. Censoring time is generated from an exponential distribution with intensity censoring.rate.
n.discrete: Number of binary covariates. The default is 0.
Failure or censoring time, whichever comes first.
Censoring indicator: 1 denotes censoring and 0 denotes failure.
n x p covariate matrix.
n x p matrix.
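Because the indicator codes 1 as censoring, it is the reverse of the convention used by survival::Surv(), where 1 marks an observed event. A minimal sketch of the conversion follows; the vectors time and cens are hypothetical stand-ins for the returned time and censoring indicator, not actual output of sim_dat.

```r
# Hypothetical stand-ins for the returned time and censoring indicator.
time <- c(2.3, 0.7, 5.1, 1.4)
cens <- c(0, 1, 0, 0)   # 1 = censored, 0 = failure (coding documented above)

# survival::Surv() expects 1 = event, so flip the indicator.
event <- 1 - cens

# Build the survival response only if the survival package is installed.
if (requireNamespace("survival", quietly = TRUE)) {
  y <- survival::Surv(time, event)
  print(y)
}
```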
Jiacheng Wu & Daniela Witten (2019) Flexible and Interpretable Models for Survival Data, Journal of Computational and Graphical Statistics, DOI: 10.1080/10618600.2019.1592758
# NOT RUN {
# Generate data
set.seed(123)
dat <- sim_dat(n=100, zerof=0, scenario=1)
# Plot X versus the true theta
plot.sim_dat(dat)
# }