PartialDependence: Partial Dependence Plot

Description

PartialDependence computes and plots partial dependence functions of prediction models.

Format

R6Class object.

Usage

pdp = PartialDependence$new(predictor, feature, grid.size = 20, run = TRUE)

plot(pdp)
pdp$results
print(pdp)
pdp$set.feature(2)

Arguments

For PartialDependence$new():

predictor: (Predictor) The object (created with Predictor$new()) holding the machine learning model and the data.
feature: The feature name or index for which to compute the partial dependencies. Either a single name or number, or a vector of two (as in the examples, e.g. "crim" or c("crim", "lstat")).
grid.size: The size of the grid for evaluating the predictions.
run: logical. Should the Interpretation method be run?

Fields

feature.index: The index of the features for which the partial dependence was computed.
feature.name: The names of the features for which the partial dependence was computed.
feature.type: The detected types of the features, either "categorical" or "numerical".
grid.size: The size of the grid.
n.features: The number of features (either 1 or 2).
predictor: The prediction model that was analysed.
results: data.frame with the grid of the feature(s) of interest and the predicted $\hat{y}$. Can be used for creating custom partial dependence plots.

Methods

set.feature(): method to get/set the feature(s) (by index) for which to compute the partial dependence. See examples for usage.
plot(): method to plot the partial dependence function. See plot.PartialDependence
run(): [internal] method to run the interpretability method. Use obj$run(force = TRUE) to force a rerun.
clone(): [internal] method to clone the R6 object.
initialize(): [internal] method to initialize the R6 object.

Details

A partial dependence plot shows how the prediction f(X) depends on one or two features, averaged over the remaining features. To learn more about partial dependence plots, read the Interpretable Machine Learning book: https://christophm.github.io/interpretable-ml-book/pdp.html
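
The idea can be sketched by hand in base R. This is only an illustration under stated assumptions, not the code PartialDependence runs internally: the helper partial_dependence() is hypothetical, a linear model on mtcars stands in for the prediction model, and the grid is assumed to be equally spaced over the observed range of the feature. For each grid value, the feature is set to that value in every row and the predictions are averaged, which is Friedman's estimate $\hat{f}_S(x_S) = \frac{1}{n}\sum_{i=1}^{n}\hat{f}(x_S, x_C^{(i)})$.

```r
# Manual partial dependence sketch (base R only; NOT the iml implementation).
# Assumptions: lm() on mtcars stands in for the prediction model, and the grid
# is equally spaced over the observed range of the feature.
partial_dependence <- function(model, data, feature, grid.size = 20) {
  values <- seq(min(data[[feature]]), max(data[[feature]]),
                length.out = grid.size)
  y.hat <- vapply(values, function(value) {
    data.mod <- data
    data.mod[[feature]] <- value              # fix the feature in every row
    mean(predict(model, newdata = data.mod))  # average over the other features
  }, numeric(1))
  data.frame(value = values, y.hat = y.hat)
}

fit <- lm(mpg ~ wt + hp, data = mtcars)
pd <- partial_dependence(fit, mtcars, "wt", grid.size = 20)
head(pd)
plot(pd$value, pd$y.hat, type = "l",
     xlab = "wt", ylab = "average predicted mpg")
```

Because the stand-in model is linear without interactions, the resulting curve is a straight line whose slope equals the wt coefficient; with a random forest or another nonlinear model the curve would generally bend.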

References

Friedman, J.H. 2001. "Greedy Function Approximation: A Gradient Boosting Machine." Annals of Statistics 29: 1189-1232.

Examples


# NOT RUN {
# We train a random forest on the Boston dataset:
if (require("randomForest")) {
data("Boston", package  = "MASS")
rf = randomForest(medv ~ ., data = Boston, ntree = 50)
mod = Predictor$new(rf, data = Boston)

# Compute the partial dependence for the first feature
pdp.obj = PartialDependence$new(mod, feature = "crim")

# Plot the results directly
plot(pdp.obj)

# Since the result is a ggplot object, you can extend it: 
if (require("ggplot2")) {
 plot(pdp.obj) + theme_bw()
}

# If you want to do your own thing, just extract the data: 
pdp.dat = pdp.obj$results
head(pdp.dat)

# You can reuse the pdp object for other features: 
pdp.obj$set.feature("lstat")
plot(pdp.obj)

# Partial dependence plots support up to two features: 
pdp.obj = PartialDependence$new(mod, feature = c("crim", "lstat"))  
plot(pdp.obj)

# Partial dependence plots also work with multiclass classification
rf = randomForest(Species ~ ., data = iris, ntree=50)
# For some models we have to specify additional arguments for the predict function
predict.fun = function(object, newdata) predict(object, newdata, type = "prob")
mod = Predictor$new(rf, data = iris, predict.fun = predict.fun)

plot(PartialDependence$new(mod, feature = "Sepal.Length"))

# Partial dependence plots support up to two features: 
pdp.obj = PartialDependence$new(mod, feature = c("Sepal.Length", "Petal.Length"))
pdp.obj$plot()   

# For multiclass classification models, you can choose to only show one class:
mod = Predictor$new(rf, data = iris, predict.fun = predict.fun, class = 1)
plot(PartialDependence$new(mod, feature = "Sepal.Length"))
}
# }
