sim_dat: Simulate Data from a Variety of Functional Scenarios

Description

This function generates survival data according to the simulation scenarios considered in Section 4 of Wu, J., and Witten, D. (2019) Flexible and interpretable models for survival data. The Cox model has the form $$ \lambda(t|x) = \lambda_0(t) \exp\left(\sum_{j=1}^p f_j(x_j)\right) $$. The failure time is generated from a Weibull distribution with baseline hazard $$ \lambda_0(t) = \text{scale} \cdot \text{shape} \cdot t^{\text{shape}-1} $$. In the paper, however, the failure time is generated from a simplified Weibull distribution: an exponential(1) baseline hazard, corresponding to shape=1 and scale=1. The censoring time is generated independently from an exponential distribution with intensity censoring.rate. The observed time is the minimum of the failure time and the censoring time. Each scenario has four covariates that have some non-linear association with the outcome. There is also an option to generate a user-specified number of covariates that have no association with the outcome.
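The generating mechanism described above can be sketched with inverse-transform sampling. Under the stated baseline hazard, the survival function is $$ S(t|x) = \exp(-\text{scale} \cdot t^{\text{shape}} \cdot e^{\eta}) $$ with $$ \eta = \sum_{j=1}^p f_j(x_j) $$, so setting S(T|x) equal to a uniform draw and solving for T gives a failure time. The code below is a minimal illustration under those assumptions and is not taken from the package source; the covariate `x`, the stand-in function `sin(pi * x)`, and the variable names are hypothetical.

```r
# Illustrative sketch only; not the package's actual implementation.
set.seed(1)
n <- 1000
scale <- 1
shape <- 1
censoring.rate <- 0.01

# Hypothetical single covariate and stand-in for sum_j f_j(x_j)
x <- runif(n, -1, 1)
eta <- sin(pi * x)

# Inverse transform: solve exp(-scale * T^shape * exp(eta)) = U for T
u <- runif(n)
failure_time <- (-log(u) / (scale * exp(eta)))^(1 / shape)

# Independent exponential censoring with intensity censoring.rate
censoring_time <- rexp(n, rate = censoring.rate)

# Observed time is whichever comes first
observed_time <- pmin(failure_time, censoring_time)
is_failure <- failure_time <= censoring_time

# Empirical fraction of censored observations
mean(!is_failure)
```

With a small censoring.rate such as the default 0.01, most observations in this sketch end in failure rather than censoring; raising the rate increases the censored fraction.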

Usage

sim_dat(n, zerof=0, scenario=1, scale=1, shape=1, censoring.rate=0.01, n.discrete=0)

Arguments

n

Number of observations.

scenario

Simulation scenario. Options are 1, 2, 3, 4. Scenario 1 corresponds to piecewise constant functions, scenario 2 corresponds to smooth functions, scenario 3 corresponds to piecewise linear functions, and scenario 4 corresponds to functions that have varying degrees of smoothness. Each scenario has four covariates that have some non-linear association with the outcome.

zerof

Number of additional covariates that have no association with the outcome. The total number of covariates is 4+zerof.

scale

Scale parameter, as in rweibull.

shape

Shape parameter, as in rweibull.

censoring.rate

Censoring intensity. The censoring time is generated from an exponential distribution with intensity censoring.rate.

n.discrete

Number of binary covariates. Defaults to 0 (no binary covariates).
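To make the function classes named under the scenario argument concrete, the sketch below defines one hypothetical function of each type on a grid. These are illustrative shapes only; they are not the functions used in the paper or in sim_dat, and all names here are assumptions.

```r
# Hypothetical examples of each function class; not the functions used by sim_dat.
f_piecewise_constant <- function(x) ifelse(x < 0, -1, 1)   # scenario 1 style
f_smooth <- function(x) sin(pi * x)                         # scenario 2 style
f_piecewise_linear <- function(x) abs(x) - 0.5              # scenario 3 style

x <- seq(-1, 1, length.out = 201)

# A scenario 4 style mix: columns with varying degrees of smoothness
theta_mixed <- cbind(
  f_piecewise_constant(x),
  f_smooth(x),
  f_piecewise_linear(x)
)

# Plot each column against x to compare the shapes
matplot(x, theta_mixed, type = "l", lty = 1, ylab = "f(x)")
```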

Value

time

Observed time: the failure time or the censoring time, whichever comes first.

status

Censoring indicator. 1 denotes censoring and 0 denotes failure.

X

n x p covariate matrix.

true_theta

n x p matrix.

References

Jiacheng Wu & Daniela Witten (2019) Flexible and Interpretable Models for Survival Data, Journal of Computational and Graphical Statistics, DOI: 10.1080/10618600.2019.1592758

Examples

# NOT RUN {
# Generate data
set.seed(123)
dat <- sim_dat(n = 100, zerof = 0, scenario = 1)
# Plot X versus the true theta
plot.sim_dat(dat)
# }