fastml (version 0.5.0)

process_model: Process Model and Compute Performance Metrics

Description

Finalizes a tuning result, or uses an already fitted workflow, to generate predictions on test data and compute performance metrics.

Usage

process_model(model_obj, model_id, task, test_data, label, event_class, engine)

Value

A list with two components:

performance

A data frame of performance metrics. For classification tasks, metrics include accuracy, kappa, sensitivity, specificity, precision, F-measure, and ROC AUC (when applicable). For regression tasks, metrics include RMSE, R-squared, and MAE.

predictions

A data frame containing the test data augmented with predicted classes and, when applicable, predicted probabilities.

Arguments

model_obj

A model object, which can be either a tuning result (an object inheriting from "tune_results") or an already fitted workflow.

model_id

A unique identifier for the model, used in warning messages if issues arise during processing.

task

A character string indicating the type of task, either "classification" or "regression".

test_data

A data frame containing the test data on which predictions will be generated.

label

A character string specifying the name of the outcome variable in test_data.

event_class

For classification tasks, a character string specifying which event class to consider as positive (accepted values: "first" or "second").
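To illustrate how "first" and "second" relate to factor levels, here is a small sketch using yardstick's event_level argument, which accepts the same two values. That process_model passes event_class through to event_level unchanged is an assumption, and the data are made up for illustration.

```r
library(yardstick)

# Toy binary outcome; the factor levels are ordered c("no", "yes").
truth    <- factor(c("yes", "yes", "no", "no", "yes"), levels = c("no", "yes"))
estimate <- factor(c("yes", "no", "no", "yes", "yes"), levels = c("no", "yes"))

# "first" treats the first level ("no") as the positive class.
sens_first <- sens_vec(truth, estimate, event_level = "first")

# "second" treats the second level ("yes") as the positive class.
sens_second <- sens_vec(truth, estimate, event_level = "second")

sens_first
sens_second
```

The two calls give different sensitivities for the same predictions, so the level order of the outcome factor is worth checking before choosing a value.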

engine

A character string specifying the modeling engine used. This parameter affects prediction types and metric computations.

Details

The function first checks whether model_obj is a tuning result. If so, it attempts to finalize the result before generating predictions.

If model_obj is already a fitted workflow, it is used directly.
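The branching described above can be sketched with base R's inherits(). The helper model_kind and the stand-in objects are hypothetical and not part of fastml; the "tune_results" class name comes from the Arguments section, while checking for class "workflow" is an assumption about how a fitted workflow would be recognized.

```r
# Hypothetical helper (not part of fastml): report which branch an object would take.
model_kind <- function(model_obj) {
  if (inherits(model_obj, "tune_results")) {
    "tuning result"      # would need to be finalized before predicting
  } else if (inherits(model_obj, "workflow")) {
    "fitted workflow"    # assumed to be used directly for prediction
  } else {
    "unsupported"
  }
}

# Stand-in objects that only carry a class attribute, for illustration.
fake_tuned <- structure(list(), class = "tune_results")
fake_wf    <- structure(list(), class = "workflow")

model_kind(fake_tuned)
model_kind(fake_wf)
```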

For classification tasks, the function makes class predictions (and probability predictions if engine is not "LiblineaR") and computes performance metrics using functions from the yardstick package. In binary classification, the positive class is determined based on the event_class argument and ROC AUC is computed accordingly. For multiclass classification, macro-averaged metrics and ROC AUC (using weighted estimates) are calculated.
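As a rough illustration of the multiclass case, the sketch below computes accuracy, macro-averaged sensitivity, and a weighted ROC AUC with yardstick on made-up predictions. The column names follow the usual tidymodels convention (.pred_class, .pred_<level>); that process_model uses exactly these estimator settings is an assumption.

```r
library(yardstick)

# Made-up three-class predictions with one probability column per class.
preds <- data.frame(
  truth       = factor(c("a", "b", "c", "a", "b", "c"), levels = c("a", "b", "c")),
  .pred_class = factor(c("a", "b", "c", "b", "b", "a"), levels = c("a", "b", "c")),
  .pred_a = c(0.7, 0.2, 0.1, 0.4, 0.1, 0.5),
  .pred_b = c(0.2, 0.6, 0.2, 0.5, 0.8, 0.2),
  .pred_c = c(0.1, 0.2, 0.7, 0.1, 0.1, 0.3)
)

# Overall accuracy from the hard class predictions.
acc <- accuracy(preds, truth = truth, estimate = .pred_class)

# Macro averaging: compute sensitivity per class, then take the unweighted mean.
macro_sens <- sens(preds, truth = truth, estimate = .pred_class,
                   estimator = "macro")

# ROC AUC from class probabilities, averaged with class-frequency weights.
auc <- roc_auc(preds, truth = truth, .pred_a, .pred_b, .pred_c,
               estimator = "macro_weighted")
```

Each call returns a one-row tibble with .metric, .estimator, and .estimate columns, which is convenient to row-bind into a single performance table.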

For regression tasks, the function predicts outcomes and computes regression metrics (RMSE, R-squared, and MAE).
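The three regression metrics can be computed together with a yardstick metric set. This is a minimal sketch on made-up numbers; the column names truth and .pred are illustrative, and whether process_model builds its metrics this way is an assumption.

```r
library(yardstick)

# Made-up observed values and predictions.
reg <- data.frame(
  truth = c(3.0, 5.0, 2.5, 7.0),
  .pred = c(2.5, 5.0, 3.0, 8.0)
)

# Bundle RMSE, R-squared, and MAE into one callable.
reg_metrics <- metric_set(rmse, rsq, mae)

res <- reg_metrics(reg, truth = truth, estimate = .pred)
res
```

Note that yardstick's rsq is the squared correlation between truth and estimate, so it always lies between 0 and 1.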

If the number of predictions does not match the number of rows in test_data, the function stops with an informative error message regarding missing values and imputation options.
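A row-count check of this kind might look like the following base R sketch. The helper check_prediction_rows and its message wording are hypothetical and do not reproduce fastml's actual error text.

```r
# Hypothetical check (not fastml's code): stop when predictions and test rows differ.
check_prediction_rows <- function(predictions, test_data) {
  if (nrow(predictions) != nrow(test_data)) {
    stop(
      "Got ", nrow(predictions), " predictions for ", nrow(test_data),
      " test rows; check test_data for missing values or impute them first.",
      call. = FALSE
    )
  }
  invisible(TRUE)
}

test_data <- data.frame(x = c(1, NA, 3), y = c(2, 4, 6))

# Simulate a model that silently drops the incomplete row.
predictions <- data.frame(.pred = c(2.1, 5.9))

# Capture the error message instead of halting the script.
msg <- tryCatch(
  check_prediction_rows(predictions, test_data),
  error = function(e) conditionMessage(e)
)
msg
```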