Evaluates candidate subgroups using split-sample consistency validation. For each candidate, repeatedly splits the data and checks whether the treatment effect direction is consistent across splits.
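The split-sample check described above can be sketched in base R as follows. This is a minimal illustration and not the package's implementation. The helpers crude_hr() and split_consistency() are hypothetical, the hazard ratio is approximated by a ratio of event rates instead of a Cox model, a split counts as consistent only when both halves exceed hr.consistency, and splits with an undefined estimate are counted as inconsistent. All of these are assumptions made for the example.

```r
# Crude hazard-ratio estimate: ratio of event rates (events per unit follow-up).
# Assumption: this stands in for a Cox model and is not the package's estimator.
crude_hr <- function(d) {
  rate_trt <- sum(d$Event[d$Treat == 1]) / sum(d$Y[d$Treat == 1])
  rate_ctl <- sum(d$Event[d$Treat == 0]) / sum(d$Y[d$Treat == 0])
  rate_trt / rate_ctl
}

# Proportion of random 50/50 splits in which both halves have HR > hr.consistency.
# Assumption: strict inequality; undefined estimates count as inconsistent.
split_consistency <- function(d, n.splits = 100, hr.consistency = 1,
                              seed = 8316951) {
  if (!is.null(seed)) set.seed(seed)   # seed = NULL gives non-reproducible splits
  n <- nrow(d)
  consistent <- logical(n.splits)
  for (i in seq_len(n.splits)) {
    idx <- sample.int(n, size = floor(n / 2))
    hr1 <- crude_hr(d[idx, , drop = FALSE])
    hr2 <- crude_hr(d[-idx, , drop = FALSE])
    consistent[i] <- is.finite(hr1) && is.finite(hr2) &&
      hr1 > hr.consistency && hr2 > hr.consistency
  }
  mean(consistent)
}

# Simulated subgroup with a true hazard ratio of 3 (treatment is harmful)
set.seed(1)
n <- 400
Treat <- rbinom(n, 1, 0.5)
event_time <- rexp(n, rate = ifelse(Treat == 1, 0.15, 0.05))
cens_time <- runif(n, 0, 60)
df_sim <- data.frame(
  Y = pmin(event_time, cens_time),
  Event = as.integer(event_time <= cens_time),
  Treat = Treat
)

pcons <- split_consistency(df_sim)
passes <- pcons >= 0.9   # compare with pconsistency.threshold
```

Because the seed is set inside the helper, repeated calls with the same data and seed return the same proportion.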
subgroup.consistency(
df,
hr.subgroups,
hr.threshold = 1,
hr.consistency = 1,
pconsistency.threshold = 0.9,
m1.threshold = Inf,
n.splits = 100,
details = FALSE,
by.risk = 12,
plot.sg = FALSE,
maxk = 7,
Lsg,
confs_labels,
sg_focus = "hr",
stop_Kgroups = 10,
stop_threshold = NULL,
showten_subgroups = FALSE,
pconsistency.digits = 2,
seed = 8316951,
checking = FALSE,
use_twostage = FALSE,
twostage_args = list(),
parallel_args = list()
)

A list containing:
Selected subgroup results
Selection criterion used
Data frame with treatment recommendations
Subgroup definition labels
Subgroup membership indicator
"twostage" or "fixed"
Number of candidates actually evaluated
Total candidates available
Number meeting consistency threshold
Logical indicating if early stop occurred
Index of candidate triggering early stop
Threshold used for early stopping
Random seed used for reproducibility (NULL if not set)
df: Data frame containing the analysis dataset. Must include columns for outcome (Y), event indicator (Event), and treatment (Treat).
hr.subgroups: Data.table of candidate subgroups from subgroup search, containing columns: HR, n, E, K, d0, d1, m0, m1, grp, and factor indicators.
hr.threshold: Numeric. Minimum hazard ratio threshold for candidates. Default: 1.0
hr.consistency: Numeric. Minimum HR required in each split for consistency. Default: 1.0
pconsistency.threshold: Numeric. Minimum proportion of splits that must be consistent. Default: 0.9
m1.threshold: Numeric. Maximum m1 threshold for filtering. Default: Inf
n.splits: Integer. Number of splits for consistency evaluation. Default: 100
details: Logical. Print progress details. Default: FALSE
by.risk: Numeric. Risk interval for KM plots. Default: 12
plot.sg: Logical. Generate subgroup plots. Default: FALSE
maxk: Integer. Maximum number of factors in subgroup. Default: 7
Lsg: List of subgroup parameters.
confs_labels: Character vector mapping factor names to labels.
sg_focus: Character. Subgroup selection criterion: "hr", "maxSG", or "minSG"; the stop_threshold entry also refers to the compound criteria "hrMaxSG" and "hrMinSG". Default: "hr"
stop_Kgroups: Integer. Maximum number of candidates to evaluate. Default: 10
stop_threshold: Numeric in [0, 1] or NULL. When a candidate subgroup's consistency probability (Pcons) meets or exceeds this threshold, evaluation stops early and the remaining candidates are skipped. Set to NULL to disable early stopping and evaluate all candidates up to stop_Kgroups. Default: NULL.
Note: Values > 1.0 are not permitted. To disable early stopping, use stop_threshold = NULL, not a value above 1.
Interaction with sg_focus:
"hr", "maxSG", "minSG": Early stopping is valid because candidates are sorted by a single criterion. The first candidate passing the threshold is optimal under that criterion.
"hrMaxSG", "hrMinSG": stop_threshold should generally be NULL, because these compound criteria require comparing HR and size across all candidates. forestsearch() automatically resets it to NULL with a warning for these.
For parallel execution, early stopping is checked after each batch completes, so some additional candidates beyond the first meeting the threshold may be evaluated. Use a smaller batch_size in parallel_args for finer-grained early stopping.
showten_subgroups: Logical. If TRUE, prints up to 10 candidate subgroups after sorting by sg_focus, showing their rank, HR, sample size, events, and factor definitions. Useful for reviewing which candidates will be evaluated for consistency. Default: FALSE
pconsistency.digits: Integer. Decimal places for consistency proportion. Default: 2
seed: Integer. Random seed for reproducible consistency splits. Default: 8316951. Set to NULL for non-reproducible random splits. The seed is used both for sequential execution (via set.seed()) and parallel execution (via future.seed).
checking: Logical. Enable additional validation checks. Default: FALSE
use_twostage: Logical. Use two-stage adaptive algorithm. Default: FALSE
twostage_args: List. Parameters for two-stage algorithm:
Splits for Stage 1 screening. Default: 30
Consistency threshold for Stage 1. Default: auto
Splits per batch in Stage 2. Default: 20
Confidence level for early stopping. Default: 0.95
Minimum valid Stage 1 splits. Default: 10
parallel_args: List. Parallel processing configuration:
Future plan: "multisession", "multicore", or "sequential"
Number of parallel workers
Number of candidates to evaluate per batch (batch_size). Smaller values provide finer-grained early stopping but may increase overhead. The default depends on stop_threshold: when it is set and sg_focus is "hr" or "minSG", the default is 1 (stop as soon as the first candidate passes); when it is set with any other sg_focus value, the default is min(workers, n_candidates/4); when it is NULL, the default is workers*2 for efficiency.
Print parallel config messages
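
The interaction between stop_threshold, stop_Kgroups, and batch_size described above can be sketched as follows. This is an illustration under stated assumptions and not the package's code. default_batch_size() and evaluate_in_batches() are hypothetical helpers, the returned field names are made up for the example, and the rounding of n_candidates/4 (floor, with a minimum of 1) is an assumption because the documentation does not specify it. Consistency proportions are supplied as a precomputed vector in place of real split evaluations.

```r
# Hypothetical helper mirroring the documented batch_size defaults.
# Assumption: n_candidates / 4 is rounded down and never falls below 1.
default_batch_size <- function(stop_threshold, sg_focus, workers, n_candidates) {
  if (is.null(stop_threshold)) return(workers * 2)
  if (sg_focus %in% c("hr", "minSG")) return(1)
  max(1, floor(min(workers, n_candidates / 4)))
}

# Hypothetical helper: walk through sorted candidates in batches and check
# the early-stopping rule only after each batch completes.
# pcons: consistency proportions of the candidates, already sorted by sg_focus.
evaluate_in_batches <- function(pcons, stop_threshold = NULL,
                                stop_Kgroups = 10, batch_size = 1) {
  if (!is.null(stop_threshold)) {
    stopifnot(stop_threshold >= 0, stop_threshold <= 1)  # values > 1 not permitted
  }
  n_max <- min(length(pcons), stop_Kgroups)
  n_eval <- 0
  stopped <- FALSE
  stop_idx <- NA_integer_
  while (n_eval < n_max && !stopped) {
    batch <- seq(n_eval + 1, min(n_eval + batch_size, n_max))
    n_eval <- max(batch)                       # the whole batch is evaluated
    if (!is.null(stop_threshold)) {
      hit <- batch[pcons[batch] >= stop_threshold]
      if (length(hit) > 0) {
        stopped <- TRUE
        stop_idx <- hit[1]                     # first candidate meeting the threshold
      }
    }
  }
  list(n_evaluated = n_eval, stopped_early = stopped, stop_candidate = stop_idx)
}

pcons <- c(0.62, 0.95, 0.80, 0.97, 0.50, 0.40)

res_fine   <- evaluate_in_batches(pcons, stop_threshold = 0.9, batch_size = 1)
res_coarse <- evaluate_in_batches(pcons, stop_threshold = 0.9, batch_size = 4)
res_all    <- evaluate_in_batches(pcons, stop_threshold = NULL, batch_size = 4)
```

With batch_size = 1 the loop stops right after the second candidate, while batch_size = 4 evaluates the full first batch before stopping, which is the trade-off between finer-grained stopping and batching overhead.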