process_model: Process and Evaluate a Model Workflow

Description

This function processes a fitted model or a tuning result, finalizes the model if tuning was used, makes predictions on the test set, and computes performance metrics appropriate to the task type (classification, regression, or survival). It supports binary and multiclass classification, and uses class probabilities when the modeling engine supports them.

Usage

process_model(
  model_obj,
  model_id,
  task,
  test_data,
  label,
  event_class,
  start_col = NULL,
  time_col = NULL,
  status_col = NULL,
  engine,
  train_data,
  metric,
  eval_times_user = NULL,
  bootstrap_ci = TRUE,
  bootstrap_samples = 500,
  bootstrap_seed = 1234,
  at_risk_threshold = 0.1
)

Value

A list with two elements:

performance: A tibble with computed performance metrics.
predictions: A tibble with predicted values, the corresponding truth values, and class probabilities (if applicable).

Arguments

model_obj: A fitted model or a tuning result (`tune_results` object).
model_id: A character identifier for the model (used in warnings).
task: Type of task, one of `"classification"`, `"regression"`, or `"survival"`.
test_data: A data frame containing the test data.
label: The name of the outcome variable (as a character string).
event_class: For binary classification, specifies which class is considered the positive class: `"first"` or `"second"`.
start_col: Optional string. The name of the column specifying the start time in counting-process survival data, i.e. data in `(start, stop, event)` format. Only used when `task = "survival"`.
time_col: String. The name of the column specifying the event or censoring time (the "stop" time in counting-process data). Only used when `task = "survival"`.
status_col: String. The name of the column specifying the event status (e.g., 0 for censored, 1 for event). Only used when `task = "survival"`.
engine: A character string indicating the model engine (e.g., `"xgboost"`, `"randomForest"`). Used to determine if class probabilities are supported. If `NULL`, probabilities are skipped.
train_data: A data frame containing the training data, required to refit finalized workflows.
metric: The name of the metric (e.g., `"roc_auc"`, `"accuracy"`, `"rmse"`) used for selecting the best tuning result.
eval_times_user: Optional numeric vector of time horizons at which to evaluate survival Brier scores. When `NULL`, sensible defaults based on the observed follow-up distribution are used.
bootstrap_ci: Logical; if `TRUE`, bootstrap confidence intervals are estimated for survival performance metrics.
bootstrap_samples: Integer giving the number of bootstrap resamples used when computing confidence intervals.
bootstrap_seed: Optional integer seed applied before bootstrap resampling to make interval estimates reproducible.
at_risk_threshold: Numeric value between 0 and 1 defining the minimum proportion of subjects required to remain at risk when determining the maximum follow-up time used in survival metrics.
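To make the role of `at_risk_threshold` concrete, here is a minimal base R sketch of one way such a cutoff could be derived. It assumes a subject counts as "at risk" at time t when their follow-up time is at least t; the helper name `max_followup_time` is hypothetical, and the function's actual rule (for example, how it treats `start_col` or censoring) may differ.

```r
# Hypothetical helper (not part of the documented function): returns the
# latest observed time at which at least `at_risk_threshold` of subjects
# still have follow-up extending to that time.
max_followup_time <- function(time, at_risk_threshold = 0.1) {
  stopifnot(is.numeric(time), length(time) > 0,
            at_risk_threshold >= 0, at_risk_threshold <= 1)
  time <- time[!is.na(time)]
  candidate_times <- sort(unique(time))

  # Proportion of subjects whose follow-up reaches each candidate time
  prop_at_risk <- vapply(candidate_times,
                         function(t) mean(time >= t),
                         numeric(1))

  eligible <- candidate_times[prop_at_risk >= at_risk_threshold]
  if (length(eligible) == 0) return(NA_real_)
  max(eligible)
}

# Illustrative follow-up times (made-up values)
follow_up <- c(2, 5, 5, 8, 12, 15, 21, 30, 44, 60)

cutoff_25 <- max_followup_time(follow_up, at_risk_threshold = 0.25)
cutoff_50 <- max_followup_time(follow_up, at_risk_threshold = 0.5)
print(c(cutoff_25 = cutoff_25, cutoff_50 = cutoff_50))
```

A higher threshold pulls the cutoff earlier, which keeps time-dependent metrics away from the sparse tail of the follow-up distribution where few subjects remain.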

Details

- If the input `model_obj` is a `tune_results` object, the function finalizes the model using the best hyperparameters according to the specified `metric`, and refits the model on the full training data.

- For classification tasks, performance metrics include accuracy, kappa, sensitivity, specificity, precision, F1-score, and ROC AUC (if probabilities are available).

- For regression tasks, RMSE, R-squared, and MAE are returned.

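- If the number of predictions does not match the number of rows in `test_data` (typically because rows with missing values were dropped at prediction time), an error is thrown that points to imputation during preprocessing as the fix.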
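The metrics listed above can be illustrated with a short base R sketch. It is not the function's implementation: the helpers `binary_metrics` and `regression_metrics` are hypothetical, kappa and ROC AUC are left out for brevity, and R-squared is assumed to be the squared correlation between truth and estimate. The sketch also shows how `event_class` changes which factor level is treated as the positive class.

```r
# Hypothetical helper: binary classification metrics from hard class predictions.
binary_metrics <- function(truth, estimate, event_class = "first") {
  stopifnot(is.factor(truth), nlevels(truth) == 2,
            event_class %in% c("first", "second"))
  estimate <- factor(estimate, levels = levels(truth))

  # event_class selects the first or second factor level as the positive class
  positive <- if (event_class == "first") levels(truth)[1] else levels(truth)[2]

  tp <- sum(truth == positive & estimate == positive)
  tn <- sum(truth != positive & estimate != positive)
  fp <- sum(truth != positive & estimate == positive)
  fn <- sum(truth == positive & estimate != positive)

  sensitivity <- tp / (tp + fn)
  specificity <- tn / (tn + fp)
  precision   <- tp / (tp + fp)

  c(accuracy    = (tp + tn) / length(truth),
    sensitivity = sensitivity,
    specificity = specificity,
    precision   = precision,
    f1          = 2 * precision * sensitivity / (precision + sensitivity))
}

# Hypothetical helper: regression metrics.
# Assumption: R-squared is computed as the squared Pearson correlation.
regression_metrics <- function(truth, estimate) {
  err <- truth - estimate
  c(rmse = sqrt(mean(err^2)),
    rsq  = cor(truth, estimate)^2,
    mae  = mean(abs(err)))
}

# Made-up example data
truth <- factor(c("yes", "yes", "yes", "no", "no", "no", "no", "yes"),
                levels = c("yes", "no"))
pred  <- c("yes", "yes", "no", "no", "no", "no", "no", "yes")

cls_first  <- binary_metrics(truth, pred, event_class = "first")
cls_second <- binary_metrics(truth, pred, event_class = "second")
reg        <- regression_metrics(c(3, 5, 7, 9), c(2, 5, 8, 9))

print(round(cls_first, 3))
print(round(cls_second, 3))
print(round(reg, 3))
```

Accuracy is the same under both settings of `event_class`, while sensitivity, specificity, and precision swap roles, which is why the positive class needs to be stated explicitly for binary tasks.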