iml (version 0.1)

pdp: Partial Dependence

Description

pdp() computes partial dependence functions of prediction models.

Usage

pdp(object, X, feature, grid.size = 10, class = NULL, ...)

Arguments

object

The machine learning model. Several types are supported: mlr WrappedModel and caret train objects are recommended. The object can also be a function that predicts the outcome given features, or anything with an S3 predict method, such as an object of class lm.

X

data.frame with the data for the prediction model

feature

The feature index for which to compute the partial dependence. Either a single number or a vector of two numbers.

grid.size

The size of the grid for evaluating the predictions

class

In case of classification, class specifies the class for which to predict the probability. By default, the partial dependence is computed for all classes.

...

Further arguments for the prediction method.
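The two forms of the object argument described above can be sketched as follows, using the documented signature. This is an illustrative sketch (it requires the iml package, version 0.1, and uses an lm fit on the MASS Boston data as a stand-in model):

```r
# Sketch of two supported 'object' forms (assumes iml version 0.1 is installed).
library("iml")

mod <- lm(medv ~ ., data = MASS::Boston)

# 1) Any object with an S3 predict method, such as an lm fit:
pdp(mod, MASS::Boston, feature = 1)

# 2) A plain function that maps a data.frame of features to predictions:
pred.fun <- function(X) predict(mod, newdata = X)
pdp(pred.fun, MASS::Boston, feature = 1)
```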

Value

A PDP object (R6). Its methods and variables can be accessed with the $-operator:

feature

The feature names for which the partial dependence was computed.

feature.type

The detected types of the features, either "categorical" or "numerical".

feature.index

The index of the features for which the partial dependence was computed.

grid.size

The size of the grid.

sample.size

The number of instances sampled from data X.

feature(index)

method to get or set the feature(s) (by index) for which to compute the partial dependence. See examples for usage.

data()

method to extract the results of the partial dependence. Returns a data.frame with the grid of the feature(s) of interest and the predicted \(\hat{y}\). Can be used for creating custom partial dependence plots.

plot()

method to plot the partial dependence function. See plot.PDP

run()

[internal] method to run the interpretability method. Use obj$run(force = TRUE) to force a rerun.

General R6 methods
clone()

[internal] method to clone the R6 object.

initialize()

[internal] method to initialize the R6 object.

Details

Machine learning models try to learn the relationship \(y = f(X)\). We can't visualize the learned \(\hat{f}\) directly for high-dimensional \(X\), but we can split it into parts: $$f(X) = f_1(X_1) + \ldots + f_p(X_p) + f_{1,2}(X_1, X_2) + \ldots + f_{p-1,p}(X_{p-1}, X_p) + \ldots + f_{1 \ldots p}(X_1, \ldots, X_p)$$

We can then isolate the partial dependence of \(y\) on a single feature \(X_j\), \(f_j(X_j)\), and plot it. The same works for higher dimensions, but at most two features make sense for visualization: \(f_j(X_j) + f_k(X_k) + f_{j,k}(X_j, X_k)\)

The partial dependence for a feature \(X_j\) is estimated by spanning a grid over the feature's range. For each grid value, the \(X_j\)-values in the whole dataset are replaced with that value, the outcomes \(\hat{y}\) are predicted with the machine learning model, and the predictions are averaged. This generates one point of the partial dependence curve. After doing this for the whole grid, the result is a curve (or a 2D plane for two features) that can then be plotted.
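The estimation procedure above can be sketched in a few lines of base R, independently of iml. The model and data here (an lm fit on the built-in mtcars data) and the helper function name are illustrative choices, not part of the package:

```r
# Minimal base-R sketch of partial dependence estimation (illustrative only).
mod <- lm(mpg ~ wt + hp, data = mtcars)

partial_dependence <- function(model, X, feature, grid.size = 10) {
  # Span a grid over the range of the feature of interest
  grid <- seq(min(X[[feature]]), max(X[[feature]]), length.out = grid.size)
  # For each grid value: overwrite the feature in the whole dataset,
  # predict, and average the predictions
  yhat <- vapply(grid, function(value) {
    X.mod <- X
    X.mod[[feature]] <- value
    mean(predict(model, newdata = X.mod))
  }, numeric(1))
  data.frame(grid = grid, yhat = yhat)
}

pd <- partial_dependence(mod, mtcars, feature = "wt", grid.size = 10)
plot(pd$grid, pd$yhat, type = "l", xlab = "wt", ylab = "Average prediction")
```

For a linear model the resulting curve is exactly a line with the coefficient of wt as its slope; for more flexible models the curve reveals the learned marginal effect.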

To learn more about partial dependence plots, read the Interpretable Machine Learning book: https://christophm.github.io/interpretable-ml-book/pdp.html

References

Friedman, J.H. 2001. "Greedy Function Approximation: A Gradient Boosting Machine." Annals of Statistics 29: 1189-1232.

See Also

plot.PDP

ice for individual conditional expectation plots.

Examples

# NOT RUN {
# We train a random forest on the Boston dataset:
library("randomForest")
data("Boston", package = "MASS")
mod = randomForest(medv ~ ., data = Boston, ntree = 50)

# Compute the partial dependence for the first feature
pdp.obj = pdp(mod, Boston, feature = 1)

# Plot the results directly
plot(pdp.obj)


# Since the result is a ggplot object, you can extend it: 
library("ggplot2")
plot(pdp.obj) + theme_bw()

# If you want to do your own thing, just extract the data: 
pdp.dat = pdp.obj$data()
head(pdp.dat)

# You can reuse the pdp object for other features: 
pdp.obj$feature = 2
plot(pdp.obj)

# Partial dependence plots support up to two features: 
pdp.obj = pdp(mod, Boston, feature = c(1,2))  

# Partial dependence plots also works with multiclass classification
library("randomForest")
mod = randomForest(Species ~ ., data = iris, ntree = 50)

# For some models we have to specify additional arguments for the predict function
plot(pdp(mod, iris, feature = 1, predict.args = list(type = 'prob')))

# For multiclass classification models, you can choose to only show one class:
plot(pdp(mod, iris, feature = 1, class = 1, predict.args = list(type = 'prob')))

# Partial dependence plots support up to two features: 
pdp.obj = pdp(mod, iris, feature = c(1,3), predict.args = list(type = 'prob'))
pdp.obj$plot()  

# }