
fastml (version 0.7.0)

fastexplain: Explain a fastml model using various techniques

Description

Provides model explainability. When `method = "dalex"`, this function does the following (a minimal DALEX sketch follows the list):

  • Creates a DALEX explainer.

  • Computes permutation-based variable importance with boxplots showing variability, and displays the resulting table and plot.

  • Computes partial dependence-like model profiles if `features` are provided.

  • Computes Shapley values (SHAP) for a sample of the training observations, displays the SHAP table, and plots a summary bar chart of \(\text{mean}(\vert \text{SHAP value} \vert)\) per feature. For classification, it shows separate bars for each class.
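
The bullets above correspond roughly to the following manual DALEX workflow. This is a conceptual sketch only, assuming a fitted model `fit`, predictor data `X`, and target `y`; fastexplain's actual internals may differ in detail.

library(DALEX)

# Assumed placeholders: `fit` is a fitted model, `X` its predictors, `y` the target.
explainer <- DALEX::explain(fit, data = X, y = y, label = "my_model")

# Permutation-based variable importance (B permutations).
vi <- DALEX::model_parts(explainer, B = 10,
                         loss_function = DALEX::loss_root_mean_square)
print(vi)
plot(vi)

# Partial dependence-like model profiles for selected features.
mp <- DALEX::model_profile(explainer, variables = c("x1", "x2"), grid_points = 20)
plot(mp)

# SHAP values for one observation; fastexplain computes these for a small sample
# of rows and summarises mean(|SHAP value|) per feature.
sh <- DALEX::predict_parts(explainer, new_observation = X[1, ], type = "shap")
plot(sh)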

Usage

fastexplain(
  object,
  method = "dalex",
  features = NULL,
  observation = NULL,
  grid_size = 20,
  shap_sample = 5,
  vi_iterations = 10,
  seed = 123,
  loss_function = NULL,
  ...
)
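
A minimal usage sketch. The training call below is an assumption for illustration (see the fastml training documentation for the exact interface); the fastexplain() arguments match the signature above.

# Assumed training step; the exact fastml training interface may differ.
model <- fastml(data = iris, label = "Species")

# DALEX-based explanations: permutation importance, model profiles for two
# features, and SHAP values for 10 sampled observations.
fastexplain(
  model,
  method        = "dalex",
  features      = c("Sepal.Length", "Petal.Width"),
  grid_size     = 30,
  shap_sample   = 10,
  vi_iterations = 25,
  seed          = 123
)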

Value

Prints DALEX explanations: a variable importance table and plot, model profiles (if `features` are supplied), and a SHAP table with a summary plot.

Arguments

object

A fastml object.

method

Character string specifying the explanation method. Supported values are "dalex", "lime", "ice", "ale", "surrogate", "interaction", and "counterfactual". Defaults to "dalex".

features

Character vector of feature names for partial dependence (model profiles). Default NULL.

observation

A single observation for counterfactual explanations. Default NULL.

grid_size

Number of grid points for partial dependence. Default 20.

shap_sample

Integer. Number of observations from the processed training data for which SHAP values are computed. Default 5.

vi_iterations

Integer. Number of permutations for variable importance (B). Default 10.

seed

Integer. Random seed for reproducibility. Default 123.

loss_function

Function. The loss function passed to model_parts; see the example after this argument list.

  • If NULL and task = 'classification', defaults to DALEX::loss_cross_entropy.

  • If NULL and task = 'regression', defaults to DALEX::loss_root_mean_square.

...

Additional arguments passed to the underlying helper functions.
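
For example, to score permutation importance of a binary classifier with one minus AUC instead of the default cross-entropy loss (DALEX::loss_one_minus_auc is exported by DALEX; `model` is an assumed fastml object):

# Use 1 - AUC as the loss for permutation-based variable importance.
fastexplain(
  model,
  method        = "dalex",
  vi_iterations = 50,
  loss_function = DALEX::loss_one_minus_auc
)

# For regression, the default is equivalent to:
# fastexplain(model, loss_function = DALEX::loss_root_mean_square)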

Details

  1. Custom number of permutations for VI (vi_iterations):

    You can now specify how many permutations (B) to use for permutation-based variable importance. More permutations yield more stable estimates but take longer.

  2. Better error messages and checks:

    Improved checks and clearer messages when required packages are missing or other conditions are not met.

  3. Loss Function:

    A loss_function argument has been added to let you pick a different performance measure (e.g., loss_cross_entropy for classification, loss_root_mean_square for regression).

  4. Parallelization Suggestion: