fastml (version 0.7.7)

process_model: Process and Evaluate a Model Workflow

Description

This function processes a fitted model or a tuning result, finalizes the model if tuning was used, makes predictions on the test set, and computes performance metrics appropriate to the task type (classification, regression, or survival). It supports binary and multiclass classification and handles probabilistic outputs when the modeling engine supports them.

Usage

process_model(
  model_obj,
  model_id,
  task,
  test_data,
  label,
  event_class,
  class_threshold = "auto",
  start_col = NULL,
  time_col = NULL,
  status_col = NULL,
  engine,
  train_data,
  metric,
  eval_times_user = NULL,
  bootstrap_ci = TRUE,
  bootstrap_samples = 500,
  bootstrap_seed = 1234,
  at_risk_threshold = 0.1,
  survival_metric_convention = "fastml",
  metrics = NULL,
  summaryFunction = NULL,
  precomputed_predictions = NULL,
  multiclass_auc = "macro"
)
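
A hedged example call is shown below; `fitted_wf`, `train_df`, and `test_df` are illustrative placeholders for your own workflow and data, not objects created by fastml:

library(fastml)

# Illustrative sketch: `fitted_wf`, `train_df`, and `test_df` are
# placeholder objects standing in for your own workflow and data.
res <- process_model(
  model_obj   = fitted_wf,        # fitted workflow or tune_results
  model_id    = "rf_1",
  task        = "classification",
  test_data   = test_df,
  label       = "outcome",
  event_class = "first",
  engine      = "ranger",
  train_data  = train_df,
  metric      = "roc_auc"
)
res$performance   # tibble of computed metrics
res$predictions   # tibble of predictions (and probabilities, if any)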

Value

A list with two elements:

performance

A tibble with computed performance metrics.

predictions

A tibble with predicted values, the corresponding truth values, and class probabilities (if applicable).

Arguments

model_obj

A fitted model or a tuning result (`tune_results` object).

model_id

A character identifier for the model (used in warnings).

task

Type of task, either `"classification"`, `"regression"`, or `"survival"`.

test_data

A data frame containing the test data.

label

The name of the outcome variable (as a character string).

event_class

For binary classification, specifies which class is considered the positive class: `"first"` or `"second"`.

class_threshold

For binary classification, controls how class probabilities are converted into hard class predictions. Numeric values in (0, 1) set a fixed threshold. The default `"auto"` tunes a threshold on the training data to maximize F1; use `"model"` to keep the model's default threshold.
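
For intuition, a fixed numeric threshold amounts to the conversion sketched below; `prob_pos` and the class labels are illustrative names, not fastml internals:

# Minimal sketch of fixed-threshold conversion; `prob_pos` holds the
# predicted probability of the positive class (illustrative names).
threshold <- 0.3
pred <- factor(
  ifelse(prob_pos >= threshold, "pos", "neg"),
  levels = c("pos", "neg")
)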

start_col

Optional string. The name of the column giving the start time in counting-process survival data (e.g., `(start, stop, event)` format). Only used when `task = "survival"`.

time_col

String. The name of the column specifying the event or censoring time (the "stop" time in counting-process data). Only used when `task = "survival"`.

status_col

String. The name of the column specifying the event status (e.g., 0 for censored, 1 for event). Only used when `task = "survival"`.

engine

A character string indicating the model engine (e.g., `"xgboost"`, `"randomForest"`). Used to determine if class probabilities are supported. If `NULL`, probabilities are skipped.

train_data

A data frame containing the training data, required to refit finalized workflows.

metric

The name of the metric (e.g., `"roc_auc"`, `"accuracy"`, `"rmse"`) used for selecting the best tuning result.

eval_times_user

Optional numeric vector of time horizons at which to evaluate survival Brier scores. When `NULL`, sensible defaults based on the observed follow-up distribution are used.
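
For example, to request Brier scores at fixed horizons (the values below are purely illustrative):

# Illustrative fixed horizons, in the same time units as `time_col`:
eval_times <- c(90, 180, 365)
# then pass eval_times_user = eval_times in the process_model() call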

bootstrap_ci

Logical; if `TRUE`, bootstrap confidence intervals are estimated for performance metrics.

bootstrap_samples

Integer giving the number of bootstrap resamples used when computing confidence intervals.

bootstrap_seed

Optional integer seed applied before bootstrap resampling to make interval estimates reproducible.

at_risk_threshold

Numeric value between 0 and 1 defining the minimum proportion of subjects required to remain at risk when determining the maximum follow-up time used in survival metrics.

survival_metric_convention

Character string specifying which survival metric conventions to follow. `"fastml"` (default) uses fastml's internal defaults for evaluation horizons and `t_max`. `"tidymodels"` uses `eval_times_user` as the explicit evaluation grid and applies yardstick-style Brier/IBS normalization; when `eval_times_user` is `NULL`, time-dependent Brier metrics are omitted.

metrics

Optional yardstick metric set (e.g., `yardstick::metric_set(yardstick::rmse)`) used for computing regression performance.
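
For instance, a custom regression metric set could be built as follows (a sketch using metrics exported by yardstick):

# A possible regression metric set; rmse, mae, and rsq are all
# exported by yardstick.
reg_metrics <- yardstick::metric_set(
  yardstick::rmse,
  yardstick::mae,
  yardstick::rsq
)
# then pass metrics = reg_metrics in the process_model() call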

summaryFunction

Optional custom classification metric function passed to `yardstick::new_class_metric()` and included in holdout evaluation.
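
As a sketch, a custom metric in the yardstick style might look like the following; `f2` is a hypothetical F-beta metric, and exactly how fastml forwards it is an assumption based on the description above:

library(yardstick)

# Hypothetical custom metric: F-beta with beta = 2, delegating to
# yardstick::f_meas(). All names here are illustrative.
f2 <- function(data, ...) UseMethod("f2")
f2 <- new_class_metric(f2, direction = "maximize")
f2.data.frame <- function(data, truth, estimate, ...) {
  f_meas(data, truth = {{ truth }}, estimate = {{ estimate }},
         beta = 2, ...)
}
# then pass summaryFunction = f2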

precomputed_predictions

Optional data frame or nested list of previously generated predictions (per algorithm/engine) to reuse instead of re-predicting; primarily used when combining results across engines.

multiclass_auc

For multiclass ROC AUC, the averaging method to use: `"macro"` (the default, matching tidymodels) or `"macro_weighted"`. `"macro"` weights each class equally, while `"macro_weighted"` weights classes by prevalence, which can change model rankings on imbalanced data.
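
The distinction can be reproduced with yardstick directly, using its bundled `hpc_cv` example data:

library(yardstick)
data(hpc_cv)

# Multiclass ROC AUC under each averaging scheme:
roc_auc(hpc_cv, obs, VF:L, estimator = "macro")
roc_auc(hpc_cv, obs, VF:L, estimator = "macro_weighted")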

Details

- If the input `model_obj` is a `tune_results` object, the function finalizes the model using the best hyperparameters according to the specified `metric` and refits it on the full training data (see the sketch after this list).

- For classification tasks, performance metrics include accuracy, kappa, sensitivity, specificity, precision, F1-score, and ROC AUC (if probabilities are available).

- For multiclass ROC AUC, the estimator is controlled by `multiclass_auc`.

- For regression tasks, RMSE, R-squared, and MAE are returned.

- If the number of predictions does not match the number of test rows (typically because rows with missing values were dropped), an informative error is thrown directing the user to impute missing data during preprocessing.
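
The finalization step described above follows the standard tidymodels pattern, roughly as sketched here; `tuned_res`, `wf`, and `train_df` are illustrative names, not fastml internals:

# Sketch of the finalize-and-refit step for a `tune_results` input;
# object names are illustrative.
best_params <- tune::select_best(tuned_res, metric = "roc_auc")
final_wf    <- tune::finalize_workflow(wf, best_params)
final_fit   <- generics::fit(final_wf, data = train_df)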