Following the steps of data preprocessing, model fitting, and performance assessment in the MLwrap pipeline,
sensitivity_analysis() processes the training and test data using the preprocessing recipe stored in the
analysis_object, applies the specified sensitivity analysis (SA) methods, and stores the results within the
analysis_object. It supports different evaluation metrics and handles multi-class classification by producing
class-specific analyses and plots (Iooss & Lemaître, 2015).
As the concluding phase of the MLwrap workflow, after data preparation, model training, and evaluation, this
function enables users to interpret their models by quantifying and visualizing feature importance. It first
validates the input arguments using check_args_sensitivity_analysis(). Then, it preprocesses the training
and test data using the recipe stored in analysis_object$transformer. Depending on the specified methods,
it calculates feature importance using:
PFI (Permutation Feature Importance): Assesses importance by shuffling feature values and measuring
the change in model performance (using the specified or default metric).
SHAP (SHapley Additive exPlanations): Computes SHAP values to explain individual predictions by
attributing contributions to each feature.
Integrated Gradients: Evaluates feature importance by integrating the gradients of the model's output
with respect to the input features along a path from a baseline input to the actual input.
Olden: Calculates sensitivity based on connection weights, typically for neural network models, to
determine feature contributions.
Sobol_Jansen: Performs variance-based global sensitivity analysis by decomposing the model output variance
into contributions from individual features and their interactions, quantifying how much each
feature and combination of features accounts for the variability in predictions. Available only for
continuous outcomes, not categorical ones. Specifically, it estimates first-order and total-order
Sobol' sensitivity indices simultaneously using the Jansen (1999) Monte Carlo estimator.
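The Jansen estimator named in the Sobol_Jansen entry can be illustrated with a minimal sketch. This is not
MLwrap's R implementation: it is a language-agnostic illustration written in Python with NumPy, and the names
jansen_indices and toy_model, the uniform sampling on [0, 1], and the sample size are all assumptions made for
the example. It uses two independent sample matrices A and B and, for each feature i, a hybrid matrix equal
to A with column i taken from B.

```python
import numpy as np

def jansen_indices(model, n_features, n_samples=20000, seed=0):
    """Estimate first-order and total-order Sobol' indices (Jansen estimator).

    Hypothetical helper: inputs are assumed independent and uniform on [0, 1].
    """
    rng = np.random.default_rng(seed)
    A = rng.random((n_samples, n_features))   # first independent sample
    B = rng.random((n_samples, n_features))   # second independent sample
    f_A, f_B = model(A), model(B)
    var_y = np.var(np.concatenate([f_A, f_B]))  # total output variance

    first = np.empty(n_features)
    total = np.empty(n_features)
    for i in range(n_features):
        AB = A.copy()
        AB[:, i] = B[:, i]                    # A with column i taken from B
        f_AB = model(AB)
        # First-order: variance explained by feature i alone.
        first[i] = (var_y - np.mean((f_B - f_AB) ** 2) / 2) / var_y
        # Total-order: feature i including all of its interactions.
        total[i] = np.mean((f_A - f_AB) ** 2) / 2 / var_y
    return first, total

def toy_model(X):
    # Additive toy function; the third feature has no effect on the output.
    return 2.0 * X[:, 0] + 1.0 * X[:, 1]

first, total = jansen_indices(toy_model, n_features=3)
```

For this additive toy function the analytical indices are 0.8, 0.2, and 0 for the three features, and the
first-order and total-order values coincide because there are no interactions. Both sets of indices come from
the same model evaluations, which is why they can be estimated simultaneously.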
For classification tasks with more than two outcome levels, the function generates separate results and plots
for each class. Visualizations include bar plots for importance metrics, box plots for distribution of values,
and beeswarm plots for detailed feature impact across observations. All results are stored in the analysis_object
under the sensitivity_analysis slot, completing the MLwrap pipeline.
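Of the methods listed above, PFI is the simplest to illustrate. The following is a minimal sketch of the
shuffle-and-rescore idea, assuming an error metric where lower is better (here RMSE). It is written in Python
with NumPy for illustration only and is not MLwrap's R implementation; permutation_importance, rmse, and the
toy predictor are hypothetical names introduced for the example.

```python
import numpy as np

def permutation_importance(predict, X, y, metric, n_repeats=10, seed=0):
    """Mean increase in an error metric after shuffling each feature column."""
    rng = np.random.default_rng(seed)
    baseline = metric(y, predict(X))          # error on the intact data
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling column j breaks its relationship with the outcome.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            increases.append(metric(y, predict(X_perm)) - baseline)
        importances[j] = np.mean(increases)
    return importances

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Toy data: the outcome depends strongly on feature 0, weakly on feature 1,
# and not at all on feature 2.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1]

def predict(X):
    # Stand-in for a fitted model that has learned the true relationship.
    return 3.0 * X[:, 0] + 0.5 * X[:, 1]

importances = permutation_importance(predict, X, y, rmse)
```

With a metric where higher is better (for example accuracy), the sign of the difference would be reversed so
that a larger value still means a more important feature.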