local_attributions: Model Agnostic Sequential Variable Attributions

Description

This function finds variable attributions via sequential variable conditioning. Its complexity is O(2*p), where p is the number of variables. It works in a similar way to the step-up and step-down greedy approximations in the function break_down. The main difference is that the order of variables is determined in the first step, and the impact is calculated in the second step.
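
The two steps described above can be illustrated with a minimal base-R sketch. It is not the iBreakDown implementation: the helper name sequential_attributions, the choice of lm on mtcars, and the scoring rule used to order variables (absolute change in the mean prediction when a single variable is fixed) are assumptions made for illustration only.

```r
# Hypothetical helper, not part of iBreakDown: a base-R sketch of the two steps.
sequential_attributions <- function(model, data, new_observation,
                                    predict_function = predict) {
  variables <- colnames(data)
  baseline <- mean(predict_function(model, data))

  # Step 1: score each variable on its own (p passes over the data).
  # Assumed scoring rule: absolute shift of the mean prediction.
  single_effect <- vapply(variables, function(v) {
    conditioned <- data
    conditioned[[v]] <- new_observation[[v]]
    abs(mean(predict_function(model, conditioned)) - baseline)
  }, numeric(1))
  ordering <- variables[order(single_effect, decreasing = TRUE)]

  # Step 2: fix variables one by one in that order (another p passes).
  conditioned <- data
  previous <- baseline
  contribution <- setNames(numeric(length(ordering)), ordering)
  for (v in ordering) {
    conditioned[[v]] <- new_observation[[v]]
    current <- mean(predict_function(model, conditioned))
    contribution[[v]] <- current - previous
    previous <- current
  }
  c(intercept = baseline, contribution, prediction = previous)
}

model <- lm(mpg ~ wt + hp + qsec, data = mtcars)
data <- mtcars[, c("wt", "hp", "qsec")]
attributions <- sequential_attributions(model, data, data[1, ])
print(attributions)
```

In this sketch the intercept plus the per-variable contributions adds up to the model prediction for the new observation, and the data is scanned p times in each step, which is where a cost proportional to 2*p comes from.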

Usage

local_attributions(x, ...)
# S3 method for explainer
local_attributions(x, new_observation, keep_distributions = FALSE, ...)
# S3 method for default
local_attributions(
  x,
  data,
  predict_function = predict,
  new_observation,
  label = class(x)[1],
  keep_distributions = FALSE,
  order = NULL,
  ...
)

Arguments

x

an explainer created with function explain or a model.

...

other parameters.

new_observation

a new observation with columns that correspond to variables used in the model.

keep_distributions

if TRUE, the distribution of partial predictions is stored and can be plotted with the generic plot().

data

validation dataset; it will be extracted from x if x is an explainer.

predict_function

predict function; it will be extracted from x if x is an explainer.

label

name of the model. By default it's extracted from the 'class' attribute of the model.

order

if not NULL, a fixed order in which the variables are conditioned. It can be a numeric vector or a vector of variable names.
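
To illustrate why a fixed order can matter, here is a small base-R sketch of sequential conditioning with a user-supplied order. It does not call iBreakDown; the helper attributions_in_order and the lm model with an interaction term fitted on mtcars are assumptions chosen for illustration. The helper accepts either positions or names, mirroring the two forms described for this argument.

```r
# Hypothetical helper (not part of iBreakDown): conditioning in a fixed order.
attributions_in_order <- function(model, data, new_observation, order,
                                  predict_function = predict) {
  # Accept column positions or column names.
  if (is.numeric(order)) order <- colnames(data)[order]
  conditioned <- data
  previous <- mean(predict_function(model, conditioned))
  contribution <- setNames(numeric(length(order)), order)
  for (v in order) {
    # Fix variable v at the value taken from the new observation.
    conditioned[[v]] <- new_observation[[v]]
    current <- mean(predict_function(model, conditioned))
    contribution[[v]] <- current - previous
    previous <- current
  }
  contribution
}

# A linear model with an interaction term, so the order can matter.
model <- lm(mpg ~ wt * hp, data = mtcars)
data <- mtcars[, c("wt", "hp")]
new_observation <- data[1, ]

by_name <- attributions_in_order(model, data, new_observation, c("wt", "hp"))
by_position <- attributions_in_order(model, data, new_observation, c(2, 1))

print(by_name)      # wt conditioned first
print(by_position)  # hp conditioned first
```

Both orders distribute the same total (prediction minus mean prediction), but the share assigned to each variable differs because of the interaction between wt and hp.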

Value

an object of the break_down class.

References

Explanatory Model Analysis. Explore, Explain and Examine Predictive Models. https://pbiecek.github.io/ema

Examples

# NOT RUN {
library("DALEX")
library("iBreakDown")
set.seed(1313)
model_titanic_glm <- glm(survived ~ gender + age + fare,
                         data = titanic_imputed, family = "binomial")
explain_titanic_glm <- explain(model_titanic_glm,
                               data = titanic_imputed,
                               y = titanic_imputed$survived,
                               label = "glm")

bd_glm <- local_attributions(explain_titanic_glm, titanic_imputed[1, ])
bd_glm
plot(bd_glm, max_features = 3)

# }
# NOT RUN {
## Not run:
library("randomForest")
set.seed(1313)
# example with interaction
# classification for HR data
model <- randomForest(status ~ . , data = HR)
new_observation <- HR_test[1,]

explainer_rf <- explain(model,
                        data = HR[1:1000,1:5])

bd_rf <- local_attributions(explainer_rf,
                           new_observation)
bd_rf
plot(bd_rf)
plot(bd_rf, baseline = 0)

# example for regression - apartment prices
# here we do not have interactions
model <- randomForest(m2.price ~ . , data = apartments)
explainer_rf <- explain(model,
                        data = apartments_test[1:1000,2:6],
                        y = apartments_test$m2.price[1:1000])

bd_rf <- local_attributions(explainer_rf,
                           apartments_test[1,])
bd_rf
plot(bd_rf, digits = 1)

bd_rf <- local_attributions(explainer_rf,
                           apartments_test[1,],
                           keep_distributions = TRUE)
plot(bd_rf, plot_distributions = TRUE)
# }