xgb.model.dt.tree

Convert tree model dump to data.table

Read a tree model text dump and return a data.table.

Usage
xgb.model.dt.tree(feature_names = NULL, filename_dump = NULL, model = NULL, text = NULL, n_first_tree = NULL)
Arguments
feature_names
names of the features, as a character vector. Can be extracted from a sparse matrix (see example). If the model dump already contains feature names, this argument should be NULL.
filename_dump
the path to the text file storing the model. The model dump must include the gain per feature and per tree (parameter with.stats = TRUE in function xgb.dump).
model
model generated by the xgb.train function. Passing the model directly avoids the creation of a dump file.
text
dump generated by the xgb.dump function. Passing the dump directly avoids the creation of a dump file. The model dump must include the gain per feature and per tree (parameter with.stats = TRUE in function xgb.dump).
n_first_tree
limit the conversion to the first n trees. If NULL, all trees of the model are converted. Performance can be low for huge models.
Details

General function to convert a text dump of a tree model to a data.table. The purpose is to help the user explore the model and get a better understanding of it.

The content of the data.table is organised as follows:

  • ID: unique identifier of a node;
  • Feature: feature used in the tree to operate a split. When Leaf is indicated, the node is the end of a branch;
  • Split: value of the chosen feature at which the split is operated;
  • Yes: ID of the next node in the branch when the split condition is met;
  • No: ID of the next node in the branch when the split condition is not met;
  • Missing: ID of the next node in the branch for observations where the feature used for the split is not provided;
  • Quality: the gain related to the split in this specific node;
  • Cover: metric measuring the number of observations affected by the split;
  • Tree: ID of the tree. It is included in the main ID;
  • Yes.X or No.X: data related to the node pointed to in the Yes or No column.
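
To make the column layout above concrete, here is a minimal base-R sketch that turns a few dump lines into a table with the same column names. It is not the package's implementation. The dump lines are made-up values, assumed to follow the layout that xgb.dump writes with with.stats = TRUE; the helper names grab and add_tree are hypothetical; the "tree-node" ID format is an assumption; and Quality is simply left as NA on leaf rows.

```r
# Made-up dump lines (assumed xgb.dump layout with with.stats = TRUE).
dump_lines <- c(
  "booster[0]",
  "0:[f29<-9.53674e-07] yes=1,no=2,missing=1,gain=4000.53,cover=1628.25",
  "\t1:leaf=1.71218,cover=812",
  "\t2:leaf=-1.70044,cover=816.25",
  "booster[1]",
  "0:[f60<-9.53674e-07] yes=1,no=2,missing=1,gain=198.17,cover=788.85",
  "\t1:leaf=-0.98,cover=104.5",
  "\t2:leaf=0.78,cover=684.35"
)

# Each "booster[k]" header starts a new tree; number the trees from 0.
is_header <- grepl("^booster\\[", dump_lines)
tree <- (cumsum(is_header) - 1L)[!is_header]
lines <- sub("^[[:space:]]+", "", dump_lines[!is_header])

# Node number inside its tree (the text before the first colon).
node <- sub(":.*$", "", lines)

# Hypothetical helper: value following "key=", or NA when the key is absent.
grab <- function(key, x) {
  hit <- regmatches(x, regexec(paste0("[, ]", key, "=([^,]+)"), x))
  vapply(hit, function(m) if (length(m) == 2L) m[2L] else NA_character_,
         character(1))
}

# Hypothetical helper: prefix a node number with its tree ID.
add_tree <- function(x) ifelse(is.na(x), NA_character_, paste(tree, x, sep = "-"))

# Split nodes look like "[feature<value]"; lines without it are leaves.
split_info <- regmatches(lines, regexec("\\[(.+)<(.+)\\]", lines))
feature <- vapply(split_info, function(m) if (length(m) == 3L) m[2L] else "Leaf",
                  character(1))
split_value <- vapply(split_info,
                      function(m) if (length(m) == 3L) as.numeric(m[3L]) else NA_real_,
                      numeric(1))

nodes <- data.frame(
  ID = paste(tree, node, sep = "-"),
  Feature = feature,
  Split = split_value,
  Yes = add_tree(grab("yes", lines)),
  No = add_tree(grab("no", lines)),
  Missing = add_tree(grab("missing", lines)),
  Quality = as.numeric(grab("gain", lines)),
  Cover = as.numeric(grab("cover", lines)),
  Tree = tree,
  stringsAsFactors = FALSE
)
print(nodes)

# Total gain per feature; leaf rows (Quality = NA) are dropped by the formula.
gain_by_feature <- aggregate(Quality ~ Feature, data = nodes, FUN = sum)
print(gain_by_feature)
```

The same kind of aggregation can be applied to the data.table returned by xgb.model.dt.tree, since it exposes the Feature and Quality columns described above.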

Value

A data.table describing the nodes of the model's trees: the feature used at each split, with its gain, its cover and a few other details.

Aliases
  • xgb.model.dt.tree
Examples
data(agaricus.train, package='xgboost')

# The dataset is a list with two items: a sparse matrix and labels
# (labels = outcome column which will be learned).
# Each column of the sparse matrix is a feature in one-hot encoding format.
train <- agaricus.train

bst <- xgboost(data = train$data, label = train$label, max.depth = 2,
               eta = 1, nthread = 2, nrounds = 2, objective = "binary:logistic")

# agaricus.train$data@Dimnames[[2]] holds the column names of the sparse matrix.
xgb.model.dt.tree(agaricus.train$data@Dimnames[[2]], model = bst)
Documentation reproduced from package xgboost, version 0.4-4, License: Apache License (== 2.0) | file LICENSE
