shapr (version 0.1.3)

shapr: Create an explainer object with Shapley weights for test data.

Description

Create an explainer object with Shapley weights for test data.

Usage

shapr(x, model, n_combinations = NULL, feature_labels = NULL)

Arguments

x

Numeric matrix or data.frame. Contains the data used for training the model.

model

The model whose predictions we want to explain. See predict_model for more information about which models shapr supports natively.

n_combinations

Integer. The number of feature combinations to sample. If NULL, the exact method is used and all combinations are considered. The maximum number of combinations equals 2^ncol(x).

feature_labels

Character vector. The labels/names of the features used for training the model. Only applicable if you are using a custom model. Otherwise the features in use are extracted from model.
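
The bound 2^ncol(x) on n_combinations grows quickly with the number of features, which is why sampling becomes attractive for wider data sets. The following is a minimal base R sketch of that arithmetic; the helper max_combinations is hypothetical and not part of shapr.

```r
# Hypothetical helper (not part of shapr): the upper bound on the number of
# feature combinations for a data set x, i.e. 2^ncol(x).
max_combinations <- function(x) 2^ncol(x)

# Dummy data: only the number of columns matters here.
x_small <- matrix(0, nrow = 5, ncol = 4)   # 4 features
x_large <- matrix(0, nrow = 5, ncol = 13)  # 13 features

max_combinations(x_small)  # 16
max_combinations(x_large)  # 8192
```

With 4 features the exact method only has to consider 16 combinations, whereas with 13 features a value such as n_combinations = 1e3 is well below the bound of 8192.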

Value

Named list that contains the following items:

exact

Boolean. Equals TRUE if n_combinations = NULL or n_combinations > 2^ncol(x), otherwise FALSE.

n_features

Positive integer. The number of columns in x.

model_type

Character. The value returned by model_type(model).

S

Binary matrix. The number of rows equals the number of unique combinations, and the number of columns equals the total number of features. For example, with three features there are 2^3 = 8 unique combinations. The element in row i and column j equals 1 if the j-th feature is present in the i-th combination, and 0 otherwise.

W

Second item

X

data.table. The object returned by feature_combinations.

x_train

data.table. x converted to a data.table.

In addition to the items above, model, feature_labels (updated with the names actually used by the model) and n_combinations are also present in the returned object.
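
To make the layout of the binary matrix S concrete, the sketch below enumerates all combinations for three features using base R only. It is an illustration under the assumption that rows are ordered by the number of features present; the row order shapr actually uses may differ, and the name S_sketch is hypothetical.

```r
# Enumerate all 2^m feature combinations as a binary matrix (base R only).
# This mimics the structure described for S; it does not call shapr.
m <- 3
grid <- expand.grid(rep(list(0:1), m))  # one row per combination
S_sketch <- as.matrix(grid)
dimnames(S_sketch) <- NULL

# Assumed ordering: combinations with fewer features come first.
S_sketch <- S_sketch[order(rowSums(S_sketch)), , drop = FALSE]

dim(S_sketch)  # 8 rows (2^3 combinations), 3 columns (features)
S_sketch[1, ]  # all zeros: the empty combination
S_sketch[8, ]  # all ones: the combination containing every feature
```

Each row is one combination, and a 1 in column j marks that feature j is part of it.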

Examples

# NOT RUN {
# Load example data
data("Boston", package = "MASS")
df <- Boston

# Example using the exact method
x_var <- c("lstat", "rm", "dis", "indus")
y_var <- "medv"
df1 <- df[, x_var]
model <- lm(medv ~ lstat + rm + dis + indus, data = df)
explainer <- shapr(df1, model)

print(nrow(explainer$X))
# 16 (which equals 2^4)

# Example using approximation
y_var <- "medv"
x_var <- setdiff(colnames(df), y_var)
model <- lm(medv ~ ., data = df)
df2 <- df[, x_var]
explainer <- shapr(df2, model, n_combinations = 1e3)

print(nrow(explainer$X))

# Example using approximation where n_combinations > 2^m
x_var <- c("lstat", "rm", "dis", "indus")
y_var <- "medv"
df3 <- df[, x_var]
model <- lm(medv ~ lstat + rm + dis + indus, data = df)
explainer <- shapr(df3, model, n_combinations = 1e3)

print(nrow(explainer$X))
# 16 (which equals 2^4)
# }