Generates simulated survival data from a previously created AFT data generating mechanism (DGM). Samples from the super population and generates survival times with specified censoring.
simulate_from_dgm(
  dgm,
  n = NULL,
  rand_ratio = 1,
  entry_var = NULL,
  max_entry = 24,
  analysis_time = 48,
  cens_adjust = 0,
  draw_treatment = TRUE,
  seed = NULL,
  strata_rand = NULL,
  hrz_crit = NULL,
  keep_rand = FALSE,
  time_eos = NULL
)
A data.frame with columns:
id: Subject identifier.
treat: Original treatment from super population.
treat_sim: Simulated treatment assignment.
flag_harm: Subgroup indicator (1 = all subgroup conditions met).
z_*: Covariate values.
lin_pred_1, lin_pred_0: Counterfactual log-time linear predictors.
y_sim: Observed survival time (min(T, C)).
event_sim: Event indicator (1 = event, 0 = censored).
t_true: Latent true survival time (pre-censoring).
c_time: Effective censoring time (post admin-censoring).
hrz_flag: (Optional) Individual harm-zone indicator.
rand_order: (Optional) Randomisation sequence index.
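The relationship between t_true, c_time, y_sim and event_sim can be sketched in base R. This is an illustration only, not the package's internal code: the log-normal distributions, their parameters, and the helper names c_rand, entry and follow_up are assumptions made for the example.

```r
set.seed(1)
n <- 5

# Hypothetical latent event times and random censoring times (log-normal)
t_true <- exp(rnorm(n, mean = 3.5, sd = 0.8))
c_rand <- exp(rnorm(n, mean = 4.0, sd = 0.8))

# Staggered entry and administrative follow-up, mirroring the documented
# defaults max_entry = 24 and analysis_time = 48
entry     <- runif(n, min = 0, max = 24)
follow_up <- 48 - entry

# Effective censoring time after administrative censoring
c_time <- pmin(c_rand, follow_up)

# Observed time is the earlier of event and censoring; the indicator
# records which one occurred first
y_sim     <- pmin(t_true, c_time)
event_sim <- as.integer(t_true <= c_time)

data.frame(t_true, c_time, y_sim, event_sim)
```

Rows with event_sim = 1 have y_sim equal to t_true; rows with event_sim = 0 have y_sim equal to c_time.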
dgm: An object of class "aft_dgm_flex" created by
generate_aft_dgm_flex.
n: Integer specifying the sample size. If NULL (default),
uses the entire super population without sampling.
rand_ratio: Numeric randomisation ratio (treatment:control).
Default 1 (1:1 allocation).
entry_var: Character string naming an entry-time variable in the
super population. If NULL, entry times are drawn as
Uniform(0, max_entry). Default NULL.
max_entry: Numeric maximum entry time for staggered entry simulation.
Only used when entry_var = NULL. Default 24.
analysis_time: Numeric calendar time of analysis. Follow-up is
analysis_time - entry_time. Must be on the same time scale as
the DGM (i.e. the same units as outcome_var passed to
generate_aft_dgm_flex). Default 48.
cens_adjust: Numeric log-scale adjustment to censoring distribution.
Positive values increase censoring times; negative values decrease them.
Default 0 (no adjustment).
draw_treatment: Logical. If TRUE (default), reassigns
treatment according to rand_ratio. If FALSE, retains
original treatment assignments from the super population.
seed: Integer random seed. Default NULL.
strata_rand: Character string naming a column in the sampled data
for within-stratum balanced treatment allocation. If NULL,
marginal allocation is used. Default NULL.
hrz_crit: Numeric log-HR threshold. If supplied, a column
hrz_flag is added marking subjects with
lin_pred_1 - lin_pred_0 >= hrz_crit. Default NULL.
keep_rand: Logical. If TRUE, appends a rand_order
column preserving the randomisation sequence. Default FALSE.
time_eos: Numeric secondary administrative censoring cutoff
(end-of-study time on the DGM scale). Applied after follow_up
censoring. Default NULL.
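One way to picture within-stratum balanced allocation under a treatment:control ratio is the base R sketch below. It is a hypothetical illustration and not the package's actual algorithm: the data frame df, the column stratum, and the rounding rule are all assumptions.

```r
set.seed(2)

# Hypothetical sampled data with a single stratification column
df <- data.frame(id = 1:12, stratum = rep(c("A", "B"), times = c(8, 4)))
rand_ratio <- 1  # treatment:control

# Share of each stratum allocated to treatment under the ratio
p_treat <- rand_ratio / (1 + rand_ratio)

# Within each stratum, assign round(n_s * p_treat) subjects to treatment
# and shuffle the labels (assumed rounding rule)
df$treat_sim <- ave(
  seq_len(nrow(df)), df$stratum,
  FUN = function(idx) {
    n_s   <- length(idx)
    n_trt <- round(n_s * p_treat)
    sample(rep(c(1L, 0L), times = c(n_trt, n_s - n_trt)))
  }
)

# Cross-tabulate stratum by arm to check the balance
tab <- table(df$stratum, df$treat_sim)
tab
```

With marginal allocation the same counts would be fixed only for the whole sample, so individual strata could end up unbalanced by chance.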
All time parameters (analysis_time, max_entry,
time_eos) must be expressed in the same units as
outcome_var supplied to generate_aft_dgm_flex(). A common
error is building the DGM on days (e.g. rfstime) and then passing
analysis_time in months, which causes follow-up windows far shorter
than the DGM event-time scale and produces universal administrative
censoring (event_sim = 0 for all subjects).
To check, compute exp(dgm$model_params$mu): the implied median event
time should be plausible given your analysis_time.
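The effect of a units mismatch can be reproduced without the package. The base R sketch below assumes a log-normal event-time model with a median of about 1800 days; mu, sigma, the conversion factor and the helper draw_entry are hypothetical values chosen for illustration.

```r
set.seed(3)
n <- 1000

# Hypothetical DGM on the day scale: median event time exp(mu) = 1800 days
mu     <- log(1800)
sigma  <- 0.7
t_true <- exp(mu + sigma * rnorm(n))

# Uniform staggered entry on (0, max_entry)
draw_entry <- function(max_entry) runif(n, min = 0, max = max_entry)

# Mistake: analysis_time = 48 and max_entry = 24 interpreted as months
# while event times are in days
fu_wrong    <- 48 - draw_entry(24)
event_wrong <- mean(t_true <= fu_wrong)

# Consistent units: the same windows converted to days
days_per_month <- 30.4375
fu_right    <- 48 * days_per_month - draw_entry(24 * days_per_month)
event_right <- mean(t_true <= fu_right)

c(median_event_time = exp(mu),
  event_rate_wrong  = event_wrong,
  event_rate_right  = event_right)
```

Comparing the implied median with the follow-up window in this way makes the mismatch visible before running a full simulation.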
When n = NULL the entire super population is used as-is, with no
staggered entry and no administrative censoring (follow_up = Inf).
Treatment assignments and linear predictors already stored in
dgm$df_super are retained unchanged.
cens_adjust shifts the log-scale location parameter of the
censoring distribution:
cens_adjust = log(2) doubles expected censoring times.
cens_adjust = log(0.5) halves expected censoring times.
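Because the shift is applied on the log scale, it acts as a multiplicative factor exp(cens_adjust) on the censoring times. A minimal base R sketch, assuming a log-normal censoring distribution with hypothetical location mu_c and scale sigma_c (the package's actual censoring model may differ):

```r
set.seed(4)

# Hypothetical log-scale location and scale of the censoring distribution
mu_c    <- log(36)
sigma_c <- 0.6
eps     <- rnorm(10000)

# Censoring times with the location shifted by cens_adjust
cens_time <- function(cens_adjust) exp(mu_c + cens_adjust + sigma_c * eps)

base    <- cens_time(0)
doubled <- cens_time(log(2))
halved  <- cens_time(log(0.5))

# Each censoring time is scaled by exp(cens_adjust)
range(doubled / base)
range(halved / base)
```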
See also: generate_aft_dgm_flex, check_censoring_dgm
# Build a null-effect DGM from the GBSG data, then simulate one trial
dgm <- setup_gbsg_dgm(model = "null", verbose = FALSE)
sim_data <- simulate_from_dgm(dgm, n = 200, seed = 42)
dim(sim_data)
head(sim_data[, c("y_sim", "event_sim", "treat_sim")])