SHAPforxgboost (version 0.0.2)

shap.plot.summary.wrap2: A wrapped function to make summary plot from given SHAP values matrix

Description

Sometimes the SHAP matrix is returned from cross-validation rather than computed from a single fitted model. This function wraps shap.prep and shap.plot.summary so that the summary plot can be made directly from that matrix.

Usage

shap.plot.summary.wrap2(shap_score, X, top_n, dilute = FALSE)

Arguments

shap_score

the SHAP values dataset (one column per predictor), for example the shap_score element returned by shap.values; it is passed on to shap.prep.

X

the dataset of predictors used for the xgboost model

top_n

how many predictors you want to show in the plot (ranked)

dilute

a number or logical, defaults to FALSE (see Usage). If a number, nrow(data_long)/dilute rows are plotted; for example, dilute = 5 plots 1/5 of the data. If dilute = TRUE or a number, at most half of the points per feature are plotted, so the plot won't be too slow. If dilute is set too high, at least 10 points per feature are kept. If the dataset is even smaller than that, all the data are plotted.
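The rules for dilute can be sketched in base R as follows. This is only one reading of the description above, not the package's source code: the helper name points_kept and its argument n (points available per feature) are hypothetical, and the real function may round or sample differently.

```r
# Hypothetical helper (not part of SHAPforxgboost): how many points per
# feature would be plotted for a given `dilute`, following the description.
points_kept <- function(n, dilute = FALSE) {
  if (is.null(dilute)) return(n)        # nothing supplied: keep everything
  d <- abs(as.numeric(dilute))          # TRUE -> 1, FALSE -> 0
  if (is.na(d) || d == 0) return(n)     # FALSE or 0: no dilution
  if (n <= 10) return(n)                # dataset too small: plot all of it
  kept <- floor(n / d)                  # plot n/dilute points ...
  kept <- min(kept, floor(n / 2))       # ... but at most half of them
  max(kept, 10)                         # ... and never fewer than 10
}

points_kept(150, 5)      # 30
points_kept(150, TRUE)   # 75
points_kept(150, 1000)   # 10
points_kept(8, 5)        # 8
```

In this sketch the floor of 10 points takes precedence over the "at most half" cap when the two conflict (for example, n = 12).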

Details

If a global list named new_labels is provided (!is.null(new_labels)), the plots will use that list to replace the default labels, labels_within_package.
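As an illustration of the kind of lookup this implies, the sketch below defines a global new_labels list and a small helper that falls back to the variable name when no label is given. It assumes the list is keyed by predictor name with the display label as value; the helper label_for and the label texts are hypothetical and are not the package's own code.

```r
# Assumed shape of the global list: names are predictor (column) names,
# values are the labels to display. The label texts are made up.
new_labels <- list(
  Sepal.Length = "Sepal length (cm)",
  Sepal.Width  = "Sepal width (cm)"
)

# Hypothetical helper: return the custom label when one exists,
# otherwise fall back to the raw variable name.
label_for <- function(variable, labels = new_labels) {
  if (!is.null(labels) && !is.null(labels[[variable]])) {
    labels[[variable]]
  } else {
    variable
  }
}

label_for("Sepal.Length")  # "Sepal length (cm)"
label_for("Petal.Width")   # "Petal.Width" (no custom label supplied)
```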

Examples

# NOT RUN {
library("SHAPforxgboost")  # provides shap.values, shap.prep and the plot functions
data("iris")
X1 = as.matrix(iris[,-5])
mod1 = xgboost::xgboost(
  data = X1, label = iris$Species, gamma = 0, eta = 1,
  lambda = 0, nrounds = 1, verbose = FALSE)

# shap.values(model, X_dataset) returns the SHAP
# data matrix and ranked features by mean|SHAP|
shap_values <- shap.values(xgb_model = mod1, X_train = X1)
shap_values$mean_shap_score
shap_values_iris <- shap_values$shap_score

# shap.prep() returns the long-format SHAP data, either from the model:
shap_long_iris <- shap.prep(xgb_model = mod1, X_train = X1)
# or, equivalently, from a given shap_contrib matrix:
shap_long_iris <- shap.prep(shap_contrib = shap_values_iris, X_train = X1)

# **SHAP summary plot**
shap.plot.summary(shap_long_iris, scientific = TRUE)
shap.plot.summary(shap_long_iris, x_bound  = 1.5, dilute = 10)

# Alternative options to make the same plot:
# option 1: from the xgboost model
shap.plot.summary.wrap1(mod1, X = as.matrix(iris[,-5]), top_n = 3)

# option 2: supply a self-made SHAP values dataset
# (e.g. sometimes as output from cross-validation)
shap.plot.summary.wrap2(shap_values_iris, X1, top_n = 3)
# }
