bartXViz (version 1.0.8)

Explain.wbart: Approximate Shapley Values Computed from a BART Model Fitted using wbart or gbart

Description

The Explain.wbart function calculates the contribution of each variable in a Bayesian Additive Regression Trees (BART) model using a permutation-based approach. It computes approximate Shapley values for models fitted with the wbart or gbart functions from the BART package.
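To illustrate the general idea of permutation-based Shapley approximation, the following base R sketch estimates the contributions for a single observation. It is a generic illustration, not the implementation used by bartXViz: the helper shapley_mc, the toy prediction function f, and the data are all hypothetical names introduced here. Each repetition draws a random background row and a random feature ordering, then switches features to the values of the observation of interest one at a time and records the change in prediction.

```r
# Generic Monte Carlo (permutation-based) Shapley sketch for one observation.
# Hypothetical helper for illustration only; not part of bartXViz.
# f: function mapping a data frame to numeric predictions
# X: background data frame; x: one-row data frame to explain
shapley_mc <- function(f, X, x, nsim = 100) {
  p <- ncol(X)
  phi <- matrix(0, nrow = nsim, ncol = p, dimnames = list(NULL, names(X)))
  base <- numeric(nsim)
  for (s in seq_len(nsim)) {
    z <- X[sample(nrow(X), 1), , drop = FALSE]  # random background row
    ord <- sample(p)                            # random feature ordering
    current <- z
    prev <- f(current)
    base[s] <- prev
    for (j in ord) {
      current[[j]] <- x[[j]]   # switch feature j to the value being explained
      new <- f(current)
      phi[s, j] <- new - prev  # marginal contribution of feature j
      prev <- new
    }
  }
  # Average contributions and the average background prediction
  list(phi = colMeans(phi), baseline = mean(base))
}

set.seed(1)
X <- data.frame(a = runif(50), b = runif(50), c = runif(50))
f <- function(d) 3 * d$a - 2 * d$b + d$a * d$c  # toy model with an interaction
x <- X[1, , drop = FALSE]

res <- shapley_mc(f, X, x, nsim = 200)
print(res$phi)
# The contributions sum to f(x) minus the average sampled background prediction
print(all.equal(sum(res$phi), f(x) - res$baseline))
```

Larger values of nsim reduce the Monte Carlo noise of each estimate in this sketch, at the cost of more prediction calls.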

Usage

# S3 method for wbart
Explain(
  object,
  feature_names = NULL,
  X = NULL,
  nsim = 1,
  pred_wrapper = NULL,
  newdata = NULL,
  parallel = FALSE,
  ...
)

Value

Returns an object of class ExplainBART, a list with the following components:

phis

A list containing the Shapley values for each variable.

newdata

The data used to check the contribution of variables. Categorical variables, if any, are one-hot encoded.

fnull

The expected value of the model's predictions.

fx

The predicted value for each observation.

factor_names

The name of the categorical variable. If the data contains only continuous or dummy variables, it is set to NULL.

Arguments

object

A BART (Bayesian Additive Regression Trees) model estimated using the wbart or gbart function from the BART package.

feature_names

The name of the variable for which you want to check the contribution. The default value is set to NULL, which means the contribution of all variables in X will be calculated.

X

The dataset containing all independent variables used as input when estimating the BART model.

nsim

The number of Monte Carlo repetitions used to estimate each Shapley value. The default is 1.

pred_wrapper

A function used to estimate the predicted values of the model.

newdata

New data containing the variables included in the model. This is used when checking the contribution of newly input data using the model. The default value is set to NULL, meaning that the input X data, i.e., the data used for model estimation, will be used by default.

parallel

The default value is set to FALSE, but it can be changed to TRUE for parallel computation.

...

Additional arguments to be passed.

Examples

library(bartXViz)

## Friedman data
set.seed(2025)
n <- 200
p <- 5
X <- data.frame(matrix(runif(n * p), ncol = p))
y <- 10 * sin(pi * X[, 1] * X[, 2]) + 20 * (X[, 3] - 0.5)^2 +
  10 * X[, 4] + 5 * X[, 5] + rnorm(n)

## Fit the model with wbart from the BART package
model <- BART::wbart(X, y, ndpost = 200)

## Prediction wrapper function
pfun <- function(object, newdata) {
  predict(object, newdata)
}

## Calculate Shapley values
model_exp <- Explain(model, X = X, pred_wrapper = pfun)
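The returned object can then be inspected through the components listed under Value. The short sketch below assumes that model_exp from the example above exists in the session and does nothing otherwise. The component names come from the Value section, and the internal layout of phis is not assumed, so str() is used to display it.

```r
# Inspect the ExplainBART result from the example above, if it is available
if (exists("model_exp")) {
  print(names(model_exp))               # components of the returned list
  print(model_exp$fnull)                # expected value of the model's predictions
  print(head(model_exp$fx))             # predicted values for the first observations
  str(model_exp$phis, max.level = 1)    # structure of the Shapley value list
  print(is.null(model_exp$factor_names))  # TRUE when there are no categorical variables
}
```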
