process_model: Process Model and Compute Performance Metrics

Description

Finalizes a tuning result, or uses an already fitted workflow, to generate predictions on test data and compute performance metrics.

Usage

process_model(
  model_obj,
  model_id,
  task,
  test_data,
  label,
  event_class,
  engine,
  train_data,
  metric
)

Value

A list with two components:

performance: A data frame of performance metrics. For classification tasks, metrics include accuracy, kappa, sensitivity, specificity, precision, F-measure, and ROC AUC (when applicable). For regression tasks, metrics include RMSE, R-squared, and MAE.
predictions: A data frame containing the test data augmented with the predictions: predicted classes and, when applicable, predicted probabilities for classification tasks, or predicted outcome values for regression tasks.

Arguments

model_obj: A model object, which can be either a tuning result (an object inheriting from "tune_results") or an already fitted workflow.
model_id: A unique identifier for the model, used in warning messages if issues arise during processing.
task: A character string indicating the type of task, either "classification" or "regression".
test_data: A data frame containing the test data on which predictions will be generated.
label: A character string specifying the name of the outcome variable in test_data.
event_class: For classification tasks, a character string specifying which event class to consider as positive (accepted values: "first" or "second").
engine: A character string specifying the modeling engine used. This parameter affects prediction types and metric computations.
train_data: A data frame containing the training data. It is used to fit the finalized workflow when model_obj is a tuning result.
metric: A character string specifying the metric name used to select the best tuning parameters.

Details

The function first checks if model_obj is a tuning result. If so, it attempts to:

1. Select the best tuning parameters with tune::select_best, using the metric named by the metric argument.
2. Extract the model specification and preprocessor from model_obj using workflows::pull_workflow_spec and workflows::pull_workflow_preprocessor, respectively (recent workflows releases deprecate these in favor of extract_spec_parsnip and extract_preprocessor).
3. Finalize the model specification with the selected parameters via tune::finalize_model.
4. Rebuild the workflow using workflows::workflow, workflows::add_recipe, and workflows::add_model, and fit the finalized workflow with parsnip::fit on the supplied train_data.

If model_obj is already a fitted workflow, it is used directly.
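
The finalization path described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the function's actual source: the data set, the decision-tree specification, and the names tree_spec, rec, and wf are hypothetical stand-ins for whatever produced model_obj, and the sketch uses the extract_* helpers instead of the pull_workflow_* functions.

```r
library(tidymodels)  # attaches tune, workflows, parsnip, recipes, rsample, yardstick

set.seed(123)
split <- initial_split(mtcars, prop = 0.75)
train_data <- training(split)

# Hypothetical tuning setup standing in for the caller's model_obj
tree_spec <- decision_tree(cost_complexity = tune()) |>
  set_engine("rpart") |>
  set_mode("regression")
rec <- recipe(mpg ~ ., data = train_data)
wf <- workflow() |> add_recipe(rec) |> add_model(tree_spec)

model_obj <- tune_grid(
  wf,
  resamples = vfold_cv(train_data, v = 3),
  grid = 3,
  metrics = metric_set(rmse),
  control = control_grid(save_workflow = TRUE)  # keep the workflow for extraction
)

metric <- "rmse"

if (inherits(model_obj, "tune_results")) {
  # 1. Pick the best parameter combination for the chosen metric
  best_params <- select_best(model_obj, metric = metric)

  # 2. Recover the unfinalized spec and preprocessor from the saved workflow
  saved_wf <- extract_workflow(model_obj)
  model_spec <- extract_spec_parsnip(saved_wf)
  model_rec <- extract_preprocessor(saved_wf)

  # 3. Plug the selected parameters into the spec
  final_spec <- finalize_model(model_spec, best_params)

  # 4. Rebuild the workflow and fit it on the training data
  final_fit <- workflow() |>
    add_recipe(model_rec) |>
    add_model(final_spec) |>
    fit(data = train_data)
} else {
  # An already fitted workflow is used as is
  final_fit <- model_obj
}
```

Extracting the workflow from a tuning result only works here because control_grid(save_workflow = TRUE) was set when tuning; that requirement belongs to this sketch and is not stated in the text above.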

For classification tasks, the function generates class predictions, plus class-probability predictions unless engine is "LiblineaR", and computes performance metrics using functions from the yardstick package. In binary classification, the positive class is determined by the event_class argument and ROC AUC is computed with respect to that class. For multiclass classification, the class metrics are macro-averaged and ROC AUC is calculated using weighted estimates.
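
As a rough illustration of the binary case, the sketch below computes the listed metrics with yardstick on a toy prediction table. It assumes that event_class is passed through to yardstick's event_level argument, which the text above does not confirm; the data frame and its column names are made up for the example.

```r
library(yardstick)

# Toy predictions standing in for predict() output bound to test_data
preds <- data.frame(
  truth       = factor(c("yes", "yes", "no", "no", "yes", "no"), levels = c("yes", "no")),
  .pred_class = factor(c("yes", "no", "no", "no", "yes", "yes"), levels = c("yes", "no")),
  .pred_yes   = c(0.9, 0.4, 0.2, 0.1, 0.8, 0.6)
)

# Assumption: "first" means the first factor level ("yes") is the positive class
event_class <- "first"

# Hard-class metrics computed from the predicted classes
class_metrics <- metric_set(accuracy, kap, sensitivity, specificity, precision, f_meas)
perf_class <- class_metrics(
  preds,
  truth = truth,
  estimate = .pred_class,
  event_level = event_class
)

# ROC AUC needs the predicted probability of the positive class
perf_auc <- roc_auc(preds, truth = truth, .pred_yes, event_level = event_class)

performance <- rbind(perf_class, perf_auc)
print(performance)
```

For multiclass outcomes, yardstick's class metrics accept estimator = "macro" and roc_auc() accepts estimator = "macro_weighted"; whether process_model uses exactly these settings is an assumption based on the description above.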

For regression tasks, the function predicts outcomes and computes regression metrics (RMSE, R-squared, and MAE).
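
The regression metrics can be reproduced in the same way. The following is a minimal sketch with invented numbers, assuming the predictions are stored in a .pred column next to the observed outcome; it is not taken from the function's source.

```r
library(yardstick)

# Toy observed and predicted values (illustrative only)
preds <- data.frame(
  truth = c(3.0, 5.0, 2.5, 7.0),
  .pred = c(2.5, 5.0, 3.0, 8.0)
)

# Bundle RMSE, R-squared, and MAE into one metric function
reg_metrics <- metric_set(rmse, rsq, mae)
performance <- reg_metrics(preds, truth = truth, estimate = .pred)
print(performance)
```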

If the number of predictions does not match the number of rows in test_data, the function stops with an informative error message regarding missing values and imputation options.
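
A check of this kind might look like the base R sketch below. The helper name check_prediction_rows and the wording of the error message are hypothetical; the actual message produced by process_model may differ.

```r
# Hypothetical helper: stop when predictions and test rows do not line up
check_prediction_rows <- function(predictions, test_data, model_id) {
  if (nrow(predictions) != nrow(test_data)) {
    stop(
      sprintf(
        "Model '%s': %d predictions for %d test rows. Check test_data for missing values or add an imputation step.",
        model_id, nrow(predictions), nrow(test_data)
      ),
      call. = FALSE
    )
  }
  invisible(TRUE)
}

test_data <- data.frame(x = c(1, 2, NA), y = c(10, 20, 30))

# Simulate a model that silently drops rows with missing predictors
predictions <- data.frame(.pred = c(10.5, 19.5))

result <- tryCatch(
  check_prediction_rows(predictions, test_data, model_id = "example_model"),
  error = function(e) conditionMessage(e)
)
print(result)
```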