This function is used to generate simulated data under various settings. Let \(Z\) be a \(p\)-dimensional vector of possibly time-dependent covariates and \(\beta\) be the vector of regression coefficients. The survival times (\(T\)) are generated from the hazard functions specified as follows:
Scenario 1.1. Proportional hazards model: $$\lambda(t|Z) = \lambda_0(t) e^{-0.5 Z_1 + 0.5 Z_2 - 0.5 Z_3 + \cdots + 0.5 Z_{10}},$$
Scenario 1.2. Proportional hazards model with noise variables: $$\lambda(t|Z) = \lambda_0(t) e^{2Z_1 + 2Z_2 + 0Z_3 + \cdots + 0Z_{10}},$$
Scenario 1.3. Proportional hazards model with nonlinear covariate effects: $$\lambda(t|Z) = \lambda_0(t) e^{[2\sin(2\pi Z_1) + 2|Z_2 - 0.5|]},$$
Scenario 1.4. Accelerated failure time model: $$\log(T) = -2 + 2Z_1 + 2Z_2 + \epsilon,$$ where \(\epsilon\) follows \(N(0, 0.5^2)\).
Scenario 1.5. Generalized gamma family: $$T = e^{\sigma\omega},$$ where \(\omega = \log(Q^2 g) / Q\), \(g\) follows Gamma(\(Q^{-2}, 1\)), \(\sigma = 2Z_1\), and \(Q = 2Z_2\).
Scenario 2.1. Dichotomous time-dependent covariate with at most one change in value: $$\lambda(t|Z(t)) = \lambda_0(t)e^{2Z_1(t) + 2Z_2},$$ where \(Z_1(t)\) is the time-dependent covariate: \(Z_1(t) = \theta I(t \ge U_0) + (1 - \theta) I(t < U_0)\), \(\theta\) is a Bernoulli variable with equal probability, and \(U_0\) follows a uniform distribution over \([0, 1]\).
Scenario 2.2. Dichotomous time-dependent covariate with multiple changes: $$\lambda(t|Z(t)) = e^{2Z_1(t) + 2Z_2},$$ where \(Z_1(t) = \theta[I(U_1\le t < U_2) + I(U_3 \le t)] + (1 - \theta)[I(t < U_1) + I(U_2\le t < U_3)]\), \(\theta\) is a Bernoulli variable with equal probability, and \(U_1\le U_2\le U_3\) are the first three arrival times of a stationary Poisson process with rate 10.
Scenario 2.3. Proportional hazards model with a continuous time-dependent covariate: $$\lambda(t|Z(t)) = 0.1 e^{Z_1(t) + Z_2},$$ where \(Z_1(t) = kt + b\), and \(k\) and \(b\) follow independent uniform distributions over \([1, 2]\).
Scenario 2.4. Non-proportional hazards model with a continuous time-dependent covariate: $$\lambda(t|Z(t)) = 0.1 \cdot[1 + \sin\{Z_1(t) + Z_2\}],$$ where \(Z_1(t) = kt + b\), and \(k\) and \(b\) follow independent uniform distributions over \([1, 2]\).
Scenario 2.5. Non-proportional hazards model with a nonlinear time-dependent covariate: $$\lambda(t|Z(t)) = 0.1 \cdot[1 + \sin\{Z_1(t) + Z_2\}],$$ where \(Z_1(t) = 2kt\cdot \{I(t > 5) - 1\} + b\), and \(k\) and \(b\) follow independent uniform distributions over \([1, 2]\).
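To make the data-generating mechanism concrete, here is a minimal base R sketch of inverse-transform sampling for the proportional hazards model with a continuous time-dependent covariate listed above. With \(Z_1(t) = kt + b\), the cumulative hazard is \(\Lambda(t) = 0.1 e^{b + Z_2}(e^{kt} - 1)/k\), and setting \(\Lambda(T)\) equal to a unit exponential draw yields \(T\) in closed form. The helper names `sim_ph_linear_tdc` and `cum_haz` are hypothetical, the distribution of \(Z_2\) is an assumption (it is not stated above), and no censoring is applied; this is not necessarily how simu is implemented.

```r
# Hypothetical helper, not part of the package: draws n uncensored survival
# times from lambda(t | Z(t)) = 0.1 * exp(Z1(t) + Z2), with Z1(t) = k * t + b.
sim_ph_linear_tdc <- function(n) {
  k  <- runif(n, 1, 2)   # slope of Z1(t), uniform over [1, 2]
  b  <- runif(n, 1, 2)   # intercept of Z1(t), uniform over [1, 2]
  z2 <- runif(n)         # ASSUMPTION: Z2 ~ Uniform(0, 1); not stated in the text
  e  <- rexp(n)          # unit exponential draw, set equal to Lambda(T)
  # Solve 0.1 * exp(b + z2) * (exp(k * t) - 1) / k = e for t
  time <- log1p(k * e / (0.1 * exp(b + z2))) / k
  data.frame(time = time, k = k, b = b, z2 = z2, e = e)
}

# Cumulative hazard implied by the model; used to check the inversion
cum_haz <- function(t, k, b, z2) {
  0.1 * exp(b + z2) * expm1(k * t) / k
}

set.seed(1)
dat <- sim_ph_linear_tdc(1000)

# Plugging the simulated times back in should recover the exponential draws
all.equal(cum_haz(dat$time, dat$k, dat$b, dat$z2), dat$e)
```

The same idea applies to the other hazard-based scenarios whenever the cumulative hazard can be inverted, either in closed form or numerically.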
simu(n, cen, scenario, summary = FALSE)
trueHaz(dat)
trueSurv(dat)
n: an integer value indicating the number of subjects.
cen: a numeric value indicating the censoring proportion; three levels are allowed: 0, 0.25, and 0.5 (0%, 25%, and 50% censoring).
scenario: either a numeric value or a character string indicating the simulation scenario listed above.
summary: a logical value indicating whether a brief data summary will be printed.
dat: a data.frame prepared by simu.
simu returns a data.frame.
The returned data.frame consists of columns:
the subject id.
the observed follow-up time.
the death indicator; death = 0 if censored.
the possibly time-independent covariates.
the latent variables used to generate \(Z_1(t)\) in Scenarios 2.1 to 2.5.
The returned data.frame can be supplied to trueHaz and trueSurv to generate the true cumulative hazard function and the true survival function, respectively.
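For orientation, the two quantities are linked by the usual identities, stated here as standard survival-analysis definitions rather than as a description of how trueHaz and trueSurv are implemented. For a subject with covariate path \(Z(\cdot)\), $$\Lambda(t|Z) = \int_0^t \lambda(u|Z(u))\,du, \qquad S(t|Z) = e^{-\Lambda(t|Z)}.$$ As a worked case, the accelerated failure time model \(\log(T) = -2 + 2Z_1 + 2Z_2 + \epsilon\) with \(\epsilon \sim N(0, 0.5^2)\) gives $$S(t|Z) = 1 - \Phi\left(\frac{\log t + 2 - 2Z_1 - 2Z_2}{0.5}\right), \qquad \Lambda(t|Z) = -\log S(t|Z),$$ where \(\Phi\) is the standard normal distribution function.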
# NOT RUN {
set.seed(1)
simu(10, 0.25, 1.2, TRUE)
set.seed(1)
simu(10, 0.50, 2.2, TRUE)
# }