explain_iblm: Explain GLM Model Predictions Using SHAP Values

Description

Creates a list of plotting functions and data frames that explain the beta values, and their SHAP-based corrections, of the ensemble IBLM model.

Usage

explain_iblm(iblm_model, data, migrate_reference_to_bias = TRUE)

Value

A list containing:

beta_corrected_scatter: Function to create scatter plots showing SHAP corrections vs variable values (see beta_corrected_scatter)
beta_corrected_density: Function to create density plots of SHAP corrections for variables (see beta_corrected_density)
bias_density: Function to create density plots of SHAP corrections migrated to bias (see bias_density)
overall_correction: Function to show global correction distributions (see overall_correction)
shap: Data frame of raw SHAP values for each data record
beta_corrections: Data frame of beta corrections (in wide/one-hot format) for each data record
data_beta_coeff: Data frame of beta coefficients for each data record

Arguments

iblm_model

An object of class 'iblm'. This should be the output of `train_iblm_xgb()`.

data

A data frame of records to explain.

If you used `split_into_train_validate_test()`, this will be the "test" portion of your data.

migrate_reference_to_bias

Logical. Should the beta corrections for reference levels be migrated to the bias? This applies to categorical variables only. It is recommended to leave this set to TRUE.
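To see what this setting changes for a given model, one option is to run the explanation under both values and compare the results. The sketch below is illustrative only: `compare_migration` is a hypothetical helper, not part of the package, and it assumes `explain_iblm()` is available on the search path.

```r
# Hypothetical helper, not part of the package. It assumes explain_iblm()
# is available on the search path (i.e. its package is attached).
compare_migration <- function(iblm_model, data) {
  list(
    # Recommended default: reference-level corrections are moved to the bias.
    migrated = explain_iblm(iblm_model, data, migrate_reference_to_bias = TRUE),
    # Alternative: reference-level corrections are left where they are.
    unmigrated = explain_iblm(iblm_model, data, migrate_reference_to_bias = FALSE)
  )
}

# Usage, assuming `iblm_model` and `df_list` were created as in the Examples:
# both <- compare_migration(iblm_model, df_list$test)
# str(both$migrated$beta_corrections)
# str(both$unmigrated$beta_corrections)
```

Comparing the `beta_corrections` data frames side by side should show which columns are affected for the categorical variables in your data.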

Details

The following outputs are functions that can be called to create plots:

beta_corrected_scatter
beta_corrected_density
bias_density
overall_correction

For each of these, the key data arguments (e.g. data, shap, iblm_model) are already populated by `explain_iblm()`. When calling these functions from the output of `explain_iblm()`, only settings such as variable names and colours need to be supplied.
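The pre-populated behaviour described above can be pictured as ordinary R closures. The following is a minimal base R sketch of that pattern, not the package's actual implementation: `make_explainer`, `correction_summary`, and the toy data are all hypothetical names introduced for illustration.

```r
# Hypothetical factory, not part of the package: it mimics how a function
# can capture its data arguments once and expose only settings to the caller.
make_explainer <- function(data, shap) {
  list(
    # Only `varname` is left for the caller; `data` and `shap` are captured
    # from the enclosing environment when make_explainer() is called.
    correction_summary = function(varname) {
      data.frame(
        variable = varname,
        mean_value = mean(data[[varname]]),
        mean_shap = mean(shap[[varname]])
      )
    }
  )
}

# Toy inputs standing in for real records and their SHAP values.
toy_data <- data.frame(DrivAge = c(20, 40, 60))
toy_shap <- data.frame(DrivAge = c(0.3, 0.0, -0.3))

ex_toy <- make_explainer(toy_data, toy_shap)

# The caller supplies only the variable name, as with ex$beta_corrected_scatter("DrivAge").
result <- ex_toy$correction_summary("DrivAge")
print(result)
```

Because the data is captured when the list is built, the returned functions keep referring to the records passed to the factory, even if a variable of the same name is later changed in the calling environment.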

Examples

df_list <- freMTPLmini |> split_into_train_validate_test(seed = 9000)

iblm_model <- train_iblm_xgb(
  df_list,
  response_var = "ClaimRate",
  family = "poisson"
)

ex <- explain_iblm(iblm_model, df_list$test)

# the output contains functions that can be called to visualise iblm
ex$beta_corrected_scatter("DrivAge")
ex$beta_corrected_density("DrivAge")
ex$overall_correction()
ex$bias_density()

# the output also contains data frames
ex$shap |> dplyr::glimpse()
ex$beta_corrections |> dplyr::glimpse()
ex$data_beta_coeff |> dplyr::glimpse()