xgboost (version 0.81.0.1)

xgb.plot.multi.trees: Project all trees on one tree and plot it

Description

Visualization of the ensemble of trees as a single collective unit.

Usage

xgb.plot.multi.trees(model, feature_names = NULL, features_keep = 5,
  plot_width = NULL, plot_height = NULL, render = TRUE, ...)

Arguments

model

a model object (class xgb.Booster) produced by the xgb.train function.

feature_names

names of each feature as a character vector.

features_keep

number of features to keep at each node position of the combined tree.

plot_width

width in pixels of the graph to produce.

plot_height

height in pixels of the graph to produce.

render

a logical flag for whether the graph should be rendered (see Value).

...

currently not used.

Value

When render = TRUE: returns a rendered graph object, which is an htmlwidget of class grViz. As with ggplot objects, it must be printed explicitly to be displayed when not running from the command line.

When render = FALSE: silently returns a graph object which is of DiagrammeR's class dgr_graph. This could be useful if one wants to modify some of the graph attributes before rendering the graph with render_graph.

Details

This function tries to capture the complexity of a gradient boosted tree model in a cohesive way by compressing an ensemble of trees into a single tree-graph representation. The goal is to improve the interpretability of a model that is generally seen as a black box.

Note: this function is applicable to tree booster-based models only.

It takes advantage of the fact that the shape of a binary tree is defined only by its depth, so all trees in a boosting model have a similar shape.

Moreover, the trees tend to reuse the same features.

The function projects every tree onto a single tree and keeps, for each node position, the top features_keep features, ranked by the Gain per feature measure.
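The projection step can be illustrated with a small base R sketch. This is a simplified illustration under stated assumptions, not the package's implementation: the node table, its made-up values, and the column names (tree, position, feature, gain) are hypothetical, and Gain is assumed to be summed per feature at each position before ranking.

```r
# Toy node table: one row per split node across all trees (values are made up).
# `position` is a hypothetical label for a node's place in the shared tree shape
# ("0" = root, "0_0" = its left child, "0_1" = its right child).
nodes <- data.frame(
  tree     = c(0, 0, 0, 1, 1, 1, 2, 2, 3),
  position = c("0", "0_0", "0_1", "0", "0_0", "0_1", "0", "0_0", "0"),
  feature  = c("odor", "cap", "gill", "odor", "gill", "gill", "spore", "cap", "cap"),
  gain     = c(10, 4, 3, 6, 2, 5, 1, 3, 2),
  stringsAsFactors = FALSE
)

# Same meaning as the features_keep argument: features retained per position.
features_keep <- 2

# Sum Gain for every (position, feature) pair across the whole ensemble.
agg <- aggregate(gain ~ position + feature, data = nodes, FUN = sum)

# Within each position, order by decreasing total Gain and keep the top rows.
top_by_position <- lapply(split(agg, agg$position), function(d) {
  d <- d[order(-d$gain), , drop = FALSE]
  head(d, features_keep)
})

# Stack the per-position results into a single "projected tree" table.
projected <- do.call(rbind, top_by_position)
rownames(projected) <- NULL
print(projected)
```

In this toy data the root position sees three distinct features across the four trees, so with features_keep = 2 the one with the lowest summed Gain is dropped from the projected tree.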

This function is inspired by this blog post: https://wellecks.wordpress.com/2015/02/21/peering-into-the-black-box-visualizing-lambdamart/

Examples

# NOT RUN {
data(agaricus.train, package='xgboost')

bst <- xgboost(data = agaricus.train$data, label = agaricus.train$label, max_depth = 15,
               eta = 1, nthread = 2, nrounds = 30, objective = "binary:logistic",
               min_child_weight = 50, verbose = 0)

p <- xgb.plot.multi.trees(model = bst, features_keep = 3)
print(p)

# }
# NOT RUN {
# Below is an example of how to save this plot to a file.
# Note that for `export_graph` to work, the DiagrammeRsvg and rsvg packages must also be installed.
library(DiagrammeR)
gr <- xgb.plot.multi.trees(model = bst, features_keep = 3, render = FALSE)
export_graph(gr, 'tree.pdf', width = 1500, height = 600)
# }