EIX (version 1.2.0)

importance: Importance of variables and interactions in the model

Description

This function calculates a table with selected measures of importance for variables and interactions.

Usage

importance(xgb_model, data, option = "both", digits = 4)

Arguments

xgb_model

an xgboost or lightgbm model.

data

a data table with data used to train the model.

option

if "variables", the table includes only single variables; if "interactions", only interactions; if "both", both single variables and interactions. Default "both".

digits

number of significant digits to be returned. Passed to the signif() function.
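
Because digits is passed to signif(), it counts significant digits rather than decimal places. A minimal base R sketch of the difference follows; the values are made up for illustration and are not taken from any model:

```r
# Illustrative values only (not model output).
x <- c(123.456, 0.00123456, 98765.4321)

# signif() keeps 4 significant digits regardless of magnitude.
sig <- signif(x, 4)

# round() keeps 4 decimal places instead, for comparison.
rnd <- round(x, 4)

print(sig)
print(rnd)
```

With digits = 4, large values lose their fractional part while small values keep several decimal places, which is worth remembering when comparing measures on very different scales.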

Value

a data table

Details

Available measures:

  • "sumGain" - sum of Gain values over all nodes in which the given variable occurs,

  • "sumCover" - sum of Cover values over all nodes in which the given variable occurs; for LightGBM models: number of observations that pass through the node,

  • "mean5Gain" - mean gain of the 5 occurrences of the given variable with the highest gain,

  • "meanGain" - mean Gain value over all nodes in which the given variable occurs,

  • "meanCover" - mean Cover value over all nodes in which the given variable occurs; for LightGBM models: mean number of observations that pass through the node,

  • "frequency" - number of occurrences of the given variable in the nodes.

Additionally, for the table with single variables:

  • "meanDepth" - mean depth weighted by gain,

  • "numberOfRoots" - number of occurrences in the root,

  • "weightedRoot" - mean number of occurrences in the root, weighted by gain.
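
The aggregate measures can be illustrated on a toy table of split nodes. This is a minimal base R sketch, not the package's implementation: the nodes data frame, its values, and the helper mean_top5 are hypothetical, and it assumes that "mean5Gain" averages whatever occurrences exist when a variable appears fewer than 5 times.

```r
# Hypothetical node table: one row per split node (not EIX's internal format).
nodes <- data.frame(
  Feature = c("a", "a", "b", "a", "b", "a", "a", "a"),
  Gain    = c(10, 4, 6, 2, 8, 1, 3, 5),
  Cover   = c(100, 40, 60, 20, 100, 10, 30, 50),
  stringsAsFactors = FALSE
)

# Hypothetical helper: mean of the (up to) five largest values in x.
# Assumption: with fewer than 5 occurrences, all of them are averaged.
mean_top5 <- function(x) mean(head(sort(x, decreasing = TRUE), 5))

# Aggregate the per-node values separately for each variable.
by_feature <- split(nodes, nodes$Feature)
measures <- do.call(rbind, lapply(by_feature, function(d) {
  data.frame(
    Feature   = d$Feature[1],
    sumGain   = sum(d$Gain),
    sumCover  = sum(d$Cover),
    mean5Gain = mean_top5(d$Gain),
    meanGain  = mean(d$Gain),
    meanCover = mean(d$Cover),
    frequency = nrow(d)
  )
}))

print(measures)
```

In this toy table, variable "a" occurs in six nodes, so its mean5Gain drops the lowest gain, while "b" occurs in only two nodes and its mean5Gain equals its meanGain.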

Examples

# NOT RUN {
library("EIX")
library("Matrix")
# Build a sparse design matrix from HR_data; `left` is the response
sm <- sparse.model.matrix(left ~ . - 1, data = HR_data)

library("xgboost")
param <- list(objective = "binary:logistic", max_depth = 2)
xgb_model <- xgboost(sm, params = param, label = HR_data[, left] == 1,
                     nrounds = 25, verbose = 0)

# Importance of both single variables and interactions
imp <- importance(xgb_model, sm, option = "both")
imp
plot(imp, top = 10)

# Single variables only
imp <- importance(xgb_model, sm, option = "variables")
imp
plot(imp, top = nrow(imp))

# Interactions only
imp <- importance(xgb_model, sm, option = "interactions")
imp
plot(imp, top = nrow(imp))

# Plot two chosen measures instead of the radar plot
imp <- importance(xgb_model, sm, option = "variables")
imp
plot(imp, top = NULL, radar = FALSE, xmeasure = "sumCover", ymeasure = "sumGain")


# }