This function assesses the stability and reproducibility of ForestSearch subgroup identification through cross-validation. For each fold:
1. Train ForestSearch on (K-1) folds
2. Apply the identified subgroup to the held-out fold
3. Compare predictions to the original full-data analysis

forestsearch_Kfold(
  fs.est,
  Kfolds = nrow(fs.est$df.est),
  seedit = 8316951L,
  parallel_args = list(plan = "multisession", workers = 6, show_message = TRUE),
  sg0.name = "Not recommend",
  sg1.name = "Recommend",
  details = FALSE
)

List with components:
resCV: Data frame with CV predictions for each observation
Arguments used for CV ForestSearch calls
Execution time in minutes
Percentage of folds where a subgroup was found
Original subgroup definition from full-data analysis
Subgroup labels
Number of folds used
Named vector of sensitivity metrics (sens_H, sens_Hc, ppv_H, ppv_Hc)
Named vector of subgroup-finding metrics (Any, Exact, etc.)

fs.est: List. ForestSearch results object from forestsearch. Must contain df.est (data frame) and args_call_all (list of arguments).
Kfolds: Integer. Number of folds (default: nrow(fs.est$df.est), i.e. LOO).
seedit: Integer. Random seed for fold assignment (default: 8316951).
parallel_args: List. Parallelization configuration with elements:
  plan: Character. One of "multisession", "multicore", "sequential"
  workers: Integer. Number of parallel workers
  show_message: Logical. Show parallel setup messages
sg0.name: Character. Label for subgroup 0 (default: "Not recommend").
sg1.name: Character. Label for subgroup 1 (default: "Recommend").
details: Logical. Print progress details (default: FALSE).

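Leave-One-Out (LOO): When Kfolds = nrow(fs.est$df.est), each observation is held out exactly once. This is the most thorough option but also the most computationally intensive, because ForestSearch is refit once per observation.
K-Fold: When Kfolds < nrow(fs.est$df.est), the data are split into K roughly equal folds. This offers a reasonable bias-variance tradeoff at a lower computational cost than LOO.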
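
The difference between the two modes can be illustrated with a minimal base R sketch of fold assignment. This is a generic illustration, not the package's internal code: the sample size n and the way labels are shuffled are assumptions made here, and only the seed value and the name cvindex are taken from this page.

```r
# Illustrative only: a generic fold assignment, not the package's internals.
set.seed(8316951L)   # same value as the documented default for seedit
n <- 20L             # hypothetical sample size
Kfolds <- 5L         # Kfolds < n gives K-fold CV; Kfolds == n gives LOO

# Shuffle the labels 1..Kfolds so that fold sizes differ by at most one
cvindex <- sample(rep(seq_len(Kfolds), length.out = n))
fold_sizes <- table(cvindex)

# LOO is the special case in which every observation is its own fold
loo_index <- sample(rep(seq_len(n), length.out = n))
```

With Kfolds equal to n, every fold holds a single observation, so the number of model fits grows with the sample size.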
The returned resCV data frame contains:
treat.recommend: Prediction from CV model
treat.recommend.original: Prediction from full-data model
cvindex: Fold assignment
sg1, sg2: Subgroup definitions found in each fold
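
As a rough sketch of how the two prediction columns might be compared, the example below builds a small mock data frame with the column names listed above and computes agreement overall and per fold. The values are made up, and the 0/1 coding of the recommendation columns is an assumption rather than something this page specifies.

```r
# Hypothetical stand-in for resCV; values and 0/1 coding are assumptions.
resCV <- data.frame(
  treat.recommend          = c(1, 1, 0, 0, 1, 0),
  treat.recommend.original = c(1, 0, 0, 0, 1, 1),
  cvindex                  = c(1, 1, 2, 2, 3, 3)
)

# Cross-tabulate CV predictions against full-data predictions
agreement_table <- table(
  cv = resCV$treat.recommend,
  original = resCV$treat.recommend.original
)

# Proportion of observations on which the two predictions agree
agree <- resCV$treat.recommend == resCV$treat.recommend.original
overall_agreement <- mean(agree)

# Agreement within each fold
fold_agreement <- tapply(agree, resCV$cvindex, mean)
```

Low per-fold agreement would suggest that the subgroup found on the full data is sensitive to which observations are held out.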
Performs K-fold cross-validation for ForestSearch, evaluating subgroup identification and agreement between training and test sets.
forestsearch for initial subgroup identification
forestsearch_KfoldOut for summarizing CV results
forestsearch_tenfold for repeated K-fold simulations