cheem

Interactively explore the data- and local explanation-spaces and residuals side-by-side. Further explore the support of a selected observation's local explanation with the radial tour.

Context

Local explanations approximate the linear variable importance of a non-linear model in the vicinity of one instance (observation). That is, they give a point-measure of each variable's importance to the model at that particular location in data-space.

Given a model, cheem extracts the local explanation of every observation in the data set. View the data- and explanation-spaces side-by-side in an interactive shiny application. Then further explore a selected point against a comparison point, using its explanation as a 1D projection basis; a radial tour explores the structure of the explanation projection.

Getting started

## Download the package
install.packages("cheem", dependencies = TRUE)
## May need to restart the R session so RStudio has the correct file structure
rstudioapi::restartSession()
## Load cheem into session
library(cheem)
## Try the app
run_app()

## Processing your data; follow the examples in cheem_ls()
?cheem_ls
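For orientation, here is a hedged sketch of the preprocessing step. The argument names (`x`, `y`, `attr_df`, `label`) and the placeholder objects `X`, `Y`, and `attr_df` are assumptions for illustration; treat the examples in `?cheem_ls` as authoritative.

```r
library(cheem)

## Placeholders: X is the predictor data, Y the response, and attr_df a
## data frame of per-observation local attributions (e.g. a SHAP matrix).
## These names are illustrative, not part of the package.
this_ls <- cheem_ls(x = X, y = Y, attr_df = attr_df, label = "my model")

## Linked plotly display of the data- and attribution-spaces:
global_view(this_ls)
```

The returned cheem list is the same preprocessed object the shiny app (`run_app()`) consumes, so heavy computation happens once, up front.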

Global view

The global view shows the data-space, attribution-space, and residual plot side-by-side, with linked brushing and a hover tooltip.

By exploring the global view, identify a primary and a comparison observation to contrast. For a classification task, a misclassified point is typically selected and compared against a nearby correctly classified point. For regression, compare a point with an extreme residual against a nearby point that is more accurately predicted.

Radial cheem tour

The attribution of the primary observation becomes the 1D basis for the tour. The variable with the largest difference between the primary and comparison points' attributions is selected as the manipulation variable; that is, the variable whose changing contribution drives the change in the projection basis.

By doing this, we test the local explanation. Testing the projection's sensitivity to the variables identified in the local explanation lets us evaluate how good the explanation is; that is, how sensitive the prediction is to a change in the variable contributions.
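A hedged sketch of this step using the package's suggestion helpers. The argument names are assumptions from memory; see `?sug_basis`, `?sug_manip_var`, and `?radial_cheem_tour` for the exact interfaces.

```r
## this_ls: a cheem list from cheem_ls(); attr_df: its attribution matrix.
## prim/comp: row numbers of the primary and comparison observations.
## All four names are illustrative placeholders.
bas <- sug_basis(attr_df, rownum = prim)
mv  <- sug_manip_var(attr_df, primary_obs = prim, comparison_obs = comp)
ggt <- radial_cheem_tour(this_ls, basis = bas, manip_var = mv,
                         primary_obs = prim, comparison_obs = comp)
spinifex::animate_plotly(ggt)  ## cheem builds on spinifex tour animation
```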

Original application

We started by looking at the local explanation tree SHAP applied to random forests. We made this choice out of concern for runtime: treeshap uses an alternative algorithm with reduced computational complexity, and thus extracts the full SHAP matrix much faster during the preprocessing step. The namesake, Cheem, stems from this original application to tree-based models in the DALEX ecosystem; the Cheem are a fictional race of tree-based humanoids, for consistency with the Doctor Who/DrWhy theme.

Sources

Package build workflow

  • devtools::document() ## documentation changes
  • pkgdown::build_site() ## pkgdown site changes (documentation, vignettes, readme)
  • message("Manually do: Build tab > Install and Restart") ## build package
  • rhub::check_for_cran() ## check package
  • devtools::submit_cran() ## Submit to CRAN

Install

install.packages('cheem')

  • Version: 0.4.2
  • Monthly Downloads: 229
  • License: MIT + file LICENSE
  • Maintainer: Nicholas Spyrison
  • Last Published: September 17th, 2025

Functions in cheem (0.4.2)

  • sug_basis: Suggest a 1D basis
  • radial_cheem_tour: Cheem tour; 1D manual tour on the selected attribution
  • sug_manip_var: Suggest a manipulation variable
  • reexports: Objects exported from other packages
  • problem_type: The type of model for a given Y variable
  • proto_basis1d_distribution: Adds the distribution of the row local attributions to a ggtour
  • run_app: Runs a shiny app demonstrating manual tours
  • subset_cheem: Subset a cheem list
  • global_view_df_1layer: Create the plot data.frame for the global linked plotly display
  • devMessage: Development message
  • cheem_ls: Preprocessing for use in shiny app
  • color_scale_of: Suggest a color and fill scale
  • as_logical_index: Assure a full-length logical index
  • amesHousing2018: Ames housing data 2018
  • contains_nonnumeric: Check if a vector contains non-numeric characters
  • ames_rf_pred: Ames random forest model predictions and SHAP values
  • cheem: cheem
  • chocolates_svm_pred: Chocolates SVM model predictions and SHAP values
  • chocolates: Chocolates dataset
  • global_view_legwork: The legwork behind the scenes for the global view
  • ifDev: Evaluate if development
  • is_diverging: Check if a vector diverges about a value
  • is_discrete: Check if a vector is discrete
  • model_performance: Extract higher-level model performance statistics
  • linear_tform: Linear function to help set alpha opacity
  • global_view: Linked plotly display; global view of data and attribution space
  • penguin_xgb_pred: Penguins xgb model predictions and SHAP values
  • logistic_tform: Logistic function to help set alpha opacity