EIX (version 1.2.0)

plot.importance: Plot importance measures

Description

This function plots selected measures of importance for variables and interactions. The importance table can be visualised in two ways: as a radar plot with six measures or as a scatter plot with two chosen measures.

Usage

# S3 method for importance
plot(
  x,
  ...,
  top = 10,
  radar = TRUE,
  text_start_point = 0.5,
  text_size = 3.5,
  xmeasure = "sumCover",
  ymeasure = "sumGain"
)

Arguments

x

a result from the importance function.

...

other parameters.

top

number of positions on the plot or NULL for all variables. Default 10.

radar

TRUE/FALSE. If TRUE, the plot shows six measures of variables' or interactions' importance in the model. If FALSE, the plot shows two chosen measures of variables' or interactions' importance in the model.

text_start_point

position at which the feature names start. Available for `radar=TRUE`. Range from 0 to 1. Default 0.5.

text_size

size of the text on the plot. Default 3.5.

xmeasure

measure on the x-axis. Available for `radar=FALSE`. Default "sumCover".

ymeasure

measure on the y-axis. Available for `radar=FALSE`. Default "sumGain".

Value

a ggplot object
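
Because the returned value is a ggplot object, it can be stored and extended with further ggplot2 layers. The sketch below does not call EIX: it draws a stand-in scatter plot with ggplot2 from a hypothetical importance table (the feature names and values are invented) to illustrate the kind of two-measure plot described for `radar = FALSE`, and to show that extra settings can be added with `+`.

```r
library(ggplot2)

# Hypothetical importance table; the values are invented for illustration only.
imp_tbl <- data.frame(
  Feature  = c("feature_a", "feature_b", "feature_c"),
  sumCover = c(5200, 3100, 2400),
  sumGain  = c(4100, 1900, 1500)
)

# Stand-in for a two-measure scatter plot: one point and one label per feature.
p <- ggplot(imp_tbl, aes(x = sumCover, y = sumGain, label = Feature)) +
  geom_point() +
  geom_text(size = 3.5, vjust = -1)

# A ggplot object accepts further layers and settings via `+`.
p2 <- p + labs(title = "Importance (toy data)") + theme_minimal()
```

The same `+` pattern should apply to the object returned by `plot()` for an importance table, assuming it is an ordinary ggplot object as stated above.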

Details

Available measures:

  • "sumGain" - sum of Gain values over all nodes in which the given variable occurs,

  • "sumCover" - sum of Cover values over all nodes in which the given variable occurs; for LightGBM models: number of observations that pass through the node,

  • "mean5Gain" - mean gain of the 5 occurrences of the given variable with the highest gain,

  • "meanGain" - mean Gain value over all nodes in which the given variable occurs,

  • "meanCover" - mean Cover value over all nodes in which the given variable occurs; for LightGBM models: mean number of observations that pass through the node,

  • "frequency" - number of occurrences of the given variable in the nodes.
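
The six measures above can be sketched in base R on a toy table with one row per split node. This is only an illustration of the definitions, not EIX's implementation; the `nodes` table, its values, and the helper `mean_top5` are hypothetical.

```r
# Toy node table (hypothetical values): one row per split node.
nodes <- data.frame(
  Feature = c("a", "a", "b", "a", "b", "a", "a", "a"),
  Gain    = c(10, 4, 6, 2, 8, 1, 3, 5),
  Cover   = c(100, 50, 80, 20, 60, 10, 30, 40)
)

# Mean of the (up to) five largest values of a vector.
mean_top5 <- function(v) mean(head(sort(v, decreasing = TRUE), 5))

# Aggregate the node-level values per feature.
measures <- do.call(rbind, lapply(split(nodes, nodes$Feature), function(d) {
  data.frame(
    Feature   = d$Feature[1],
    sumGain   = sum(d$Gain),
    sumCover  = sum(d$Cover),
    mean5Gain = mean_top5(d$Gain),
    meanGain  = mean(d$Gain),
    meanCover = mean(d$Cover),
    frequency = nrow(d)
  )
}))

measures
```

For feature "a" (six nodes), `sumGain` is 25 and `mean5Gain` is 4.8, because the lowest of its six gains is dropped before averaging; for feature "b" (two nodes), `mean5Gain` equals `meanGain`.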

Additionally for plots with single variables:

  • "meanDepth" - mean depth weighted by gain,

  • "numberOfRoots" - number of occurrences in the root,

  • "weightedRoot" - mean number of occurrences in the root, weighted by gain.
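
Two of the single-variable measures can be sketched the same way in base R. The toy `nodes` table below is hypothetical, with `Depth == 0` marking a root node, and the code is an illustration of the definitions rather than EIX's implementation. `weightedRoot` is left out because the description above does not pin down its exact formula.

```r
# Toy node table (hypothetical values): one row per split node, Depth 0 = root.
nodes <- data.frame(
  Feature = c("a", "a", "a", "b", "b", "b"),
  Depth   = c(0, 1, 0, 1, 1, 0),
  Gain    = c(10, 4, 6, 3, 5, 8)
)

single <- do.call(rbind, lapply(split(nodes, nodes$Feature), function(d) {
  data.frame(
    Feature       = d$Feature[1],
    # Mean depth of the feature's nodes, weighted by each node's gain.
    meanDepth     = weighted.mean(d$Depth, w = d$Gain),
    # Number of times the feature is used in a root node.
    numberOfRoots = sum(d$Depth == 0)
  )
}))

single
```

Under this reading, a feature that collects most of its gain near the root gets a small `meanDepth`: feature "a" has 16 of its 20 units of gain at depth 0, giving 0.2.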

Examples

# NOT RUN {
library("EIX")
library("Matrix")
sm <- sparse.model.matrix(left ~ . - 1,  data = HR_data)

library("xgboost")
param <- list(objective = "binary:logistic", max_depth = 2)
xgb_model <- xgboost(sm, params = param, label = HR_data[, left] == 1, nrounds = 25, verbose=0)

imp <- importance(xgb_model, sm, option = "both")
imp
plot(imp,  top = 10)

imp <- importance(xgb_model, sm, option = "variables")
imp
plot(imp,  top = nrow(imp))

imp <- importance(xgb_model, sm, option = "interactions")
imp
plot(imp, top = nrow(imp))

imp <- importance(xgb_model, sm, option = "variables")
imp
plot(imp, top = NULL, radar = FALSE, xmeasure = "sumCover", ymeasure = "sumGain")

# }
# NOT RUN {
library(lightgbm)
train_data <- lgb.Dataset(sm, label =  HR_data[, left] == 1)
params <- list(objective = "binary", max_depth = 2)
lgb_model <- lgb.train(params, train_data, 25)

imp <- importance(lgb_model, sm, option = "both")
imp
plot(imp,  top = nrow(imp))

imp <- importance(lgb_model, sm, option = "variables")
imp
plot(imp, top = NULL, radar = FALSE, xmeasure = "sumCover", ymeasure = "sumGain")

# }