Simulates multi-regional clinical trials and evaluates ForestSearch subgroup identification. Splits data by region into training and testing populations, identifies subgroups using ForestSearch on training data, and evaluates performance on the testing region.
mrct_region_sims(
dgm,
n_sims,
n_sample = NULL,
region_var = "z_regA",
sg_focus = "minSG",
maxk = 1,
hr.threshold = 0.9,
hr.consistency = 0.8,
pconsistency.threshold = 0.9,
confounders.name = NULL,
conf_force = NULL,
fs_args = list(),
sim_args = list(rand_ratio = 1, draw_treatment = TRUE),
analysis_time = 60,
cens_adjust = 0,
parallel_args = list(plan = "multisession", workers = NULL, show_message = TRUE),
details = FALSE,
verbose_n_sims = 2L,
seed = NULL
)
A data.table with simulation results containing:
Simulation index
ITT sample size
ITT hazard ratio (stratified if strat variable present)
ITT hazard ratio stratified by region
Training (non-region A) sample size
Training population hazard ratio
Testing (region A) sample size
Testing population hazard ratio
Indicator: 1 if subgroup identified, 0 otherwise
Character description of identified subgroup
Subgroup sample size
Subgroup hazard ratio in testing population
Potential outcome hazard ratio in subgroup (testing)
Subgroup prevalence (proportion of testing population)
Subgroup sample size in training population
Subgroup hazard ratio in training population
Potential outcome hazard ratio in subgroup (training)
Subgroup HR when found, NA otherwise
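The indicator and the NA-when-not-found hazard ratio lend themselves to simple per-run summaries. The sketch below is illustrative only: it uses a small base R data frame as a stand-in for the returned table, and the column names sim, found, and hr_sg are hypothetical placeholders, not the names the function actually returns.

```r
# Toy stand-in for the returned table; column names are hypothetical.
res <- data.frame(
  sim   = 1:5,
  found = c(1, 0, 1, 1, 0),             # 1 if a subgroup was identified
  hr_sg = c(0.62, NA, 0.71, 0.55, NA)   # subgroup HR when found, NA otherwise
)

# Proportion of simulations in which a subgroup was identified
find_rate <- mean(res$found)

# Average the HRs on the log scale, skipping replicates with no subgroup
geo_mean_hr <- exp(mean(log(res$hr_sg), na.rm = TRUE))
```

Averaging on the log scale is one common choice for hazard ratios; summaryout_mrct may summarize the results differently.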
dgm: Data generating mechanism object from generate_aft_dgm_flex
n_sims: Integer. Number of simulations to run
n_sample: Integer. Sample size per simulation. If NULL (default), uses the entire super-population from dgm
region_var: Character. Name of the region indicator variable used to split data into training (region_var == 0) and testing (region_var == 1) populations. Default: "z_regA"
sg_focus: Character. Subgroup selection criterion passed to forestsearch: "minSG", "hr", or "maxSG". Default: "minSG"
maxk: Integer. Maximum number of factors in subgroup combinations (1 or 2). Default: 1
hr.threshold: Numeric. Hazard ratio threshold for subgroup identification. Default: 0.90
hr.consistency: Numeric. Consistency threshold for hazard ratio. Default: 0.80
pconsistency.threshold: Numeric. Probability threshold for consistency. Default: 0.90
confounders.name: Character vector. Confounder variable names for ForestSearch. If NULL, automatically extracted from dgm
conf_force: Character vector. Forced cuts to consider in ForestSearch. Default: NULL, in which case c("z_age <= 65", "z_bm <= 0", "z_bm <= 1", "z_bm <= 2", "z_bm <= 5") is used
fs_args: Named list. Additional arguments passed directly to forestsearch inside each simulation replicate. Use this to control parameters not exposed by mrct_region_sims (e.g., use_grf, use_lasso, cut_type, d0.min, d1.min, n.min, max_subgroups_search, use_twostage, twostage_args). Parameters already in the mrct_region_sims signature (hr.threshold, hr.consistency, pconsistency.threshold, sg_focus, maxk, confounders.name, conf_force) take precedence over values in fs_args. Default: list() (uses forestsearch defaults)
sim_args: Named list. Additional arguments passed to simulate_from_dgm inside each replicate (e.g., rand_ratio, draw_treatment). Parameters already in the mrct_region_sims signature (analysis_time, cens_adjust) take precedence. Default: list(rand_ratio = 1, draw_treatment = TRUE)
analysis_time: Numeric. Time of analysis for administrative censoring. Default: 60
cens_adjust: Numeric. Adjustment factor for censoring rate on log scale. Default: 0
parallel_args: List. Parallel processing configuration with components:
  plan: "multisession", "multicore", "callr", or "sequential"
  workers: Number of workers (NULL for auto-detect)
  show_message: Logical for progress messages
details: Logical. Print detailed progress information. Default: FALSE
verbose_n_sims: Integer. When details = TRUE, print full ForestSearch diagnostics (including internal output) for only the first verbose_n_sims simulation replicates. Set to 0 to suppress per-sim output, or Inf to print all. Default: 2
seed: Integer. Base random seed for reproducibility. Default: NULL
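The precedence rule for fs_args (signature-level arguments override entries of the same name in the list) can be pictured with base R's modifyList. This is a minimal sketch of the documented behavior, not the package's internal code: merge_fs_args is a hypothetical helper, and the argument values are illustrative.

```r
# Hypothetical helper: explicit (signature-level) arguments override
# same-named entries in fs_args; all other fs_args entries pass through.
merge_fs_args <- function(fs_args = list(), ...) {
  explicit <- list(...)
  modifyList(fs_args, explicit)
}

# fs_args tries to set hr.threshold, but the explicit value wins
fs_args <- list(use_grf = TRUE, n.min = 60, hr.threshold = 1.25)
merged  <- merge_fs_args(fs_args, hr.threshold = 0.9, maxk = 1)

str(merged)
```

Under this reading, setting hr.threshold inside fs_args has no effect; change it through the mrct_region_sims argument instead.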
For each simulation:
Sample from super-population using simulate_from_dgm
Split by region_var into training and testing populations
Estimate HRs in ITT, training, and testing populations
Run forestsearch on training population
Apply identified subgroup to testing population
Calculate subgroup-specific estimates
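The steps above can be sketched on toy data with the survival package. This is an illustration under stated assumptions rather than the function's implementation: the data are simulated directly in base R instead of through simulate_from_dgm, the column names treat, time, and event are assumptions, and the ForestSearch step is replaced by a fixed placeholder rule (z_age <= 65) so that the example runs without the package.

```r
library(survival)

set.seed(1)
n <- 400

# Toy trial data (stand-in for a draw from simulate_from_dgm)
df <- data.frame(
  treat  = rbinom(n, 1, 0.5),
  z_regA = rbinom(n, 1, 0.3),          # region indicator, used only to split
  z_age  = round(rnorm(n, 62, 8))
)
# Exponential event times with a true treatment hazard ratio of 0.7
event_time <- rexp(n, rate = 0.03 * exp(log(0.7) * df$treat))
df$time  <- pmin(event_time, 60)       # administrative censoring at time 60
df$event <- as.integer(event_time <= 60)

# Cox model hazard ratio for treatment in a given data set
cox_hr <- function(d) {
  fit <- coxph(Surv(time, event) ~ treat, data = d)
  unname(exp(coef(fit)))
}

# Split by region into training and testing populations
train <- df[df$z_regA == 0, ]
test  <- df[df$z_regA == 1, ]

# Hazard ratios in the ITT, training, and testing populations
hr_itt   <- cox_hr(df)
hr_train <- cox_hr(train)
hr_test  <- cox_hr(test)

# Placeholder for the subgroup that forestsearch would identify on `train`
sg_rule <- function(d) d$z_age <= 65

# Apply the rule to the testing population and compute subgroup estimates
test_sg    <- test[sg_rule(test), ]
hr_sg_test <- cox_hr(test_sg)
prev_sg    <- nrow(test_sg) / nrow(test)
```

In the real workflow the subgroup rule comes from forestsearch fitted on the training population only, so the testing region gives an out-of-sample check of the identified subgroup.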
The region_var parameter is used ONLY for splitting data into training/testing
populations. It does not imply any prognostic effect. To include prognostic
confounder effects, specify them when creating the DGM using
create_dgm_for_mrct or generate_aft_dgm_flex.
forestsearch for subgroup identification algorithm
generate_aft_dgm_flex for DGM creation
simulate_from_dgm for data simulation
create_dgm_for_mrct for MRCT-specific DGM wrapper
summaryout_mrct for summarizing simulation results