pre (version 0.3.0)

importance: Calculate importances of baselearners (rules and linear terms) and input variables

Description

importance calculates importances for rules, linear terms and input variables in the ensemble, and provides a bar plot of variable importances.

Usage

importance(object, standardize = FALSE, global = TRUE,
  quantprobs = c(0.75, 1), penalty.par.val = "lambda.1se", round = NA,
  plot = TRUE, ylab = "Importance", main = "Variable importances",
  diag.xlab = TRUE, diag.xlab.hor = 0, diag.xlab.vert = 2, cex.axis = 1,
  ...)

Arguments

object

an object of class pre

standardize

logical. Should baselearner importances be standardized with respect to the outcome variable? If TRUE, baselearner importances have a minimum of 0 and a maximum of 1. Only used for ensembles with numeric (non-count) response variables.

global

logical. Should global importances be calculated? If FALSE, local importances are calculated over the range of predictions F(X) bounded by the quantiles specified in quantprobs.

quantprobs

optional numeric vector of length two. Only used when global = FALSE. Probabilities for calculating sample quantiles of the range of F(X), over which local importances are calculated. The default provides variable importances calculated over the 25% highest values of F(X).
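
As a rough illustration of what a quantile range over F(X) means, the following base R sketch selects the observations whose predicted values fall between the sample quantiles given by quantprobs. The vector fx holds made-up predictions, and the selection logic is an assumption for illustration, not necessarily how the package implements it internally.

```r
# Hypothetical predicted values F(X); in practice these would come
# from the fitted ensemble.
fx <- c(12, 35, 8, 50, 27, 41, 19, 63)

# Default setting: the 25% highest predicted values
quantprobs <- c(0.75, 1)

# Sample quantiles that bound the region of interest
bounds <- quantile(fx, probs = quantprobs)

# Flag observations whose prediction lies within the bounds
in_range <- fx >= bounds[1] & fx <= bounds[2]

# Predictions over which local importances would be computed
fx[in_range]
```

Setting quantprobs <- c(0, 0.25) in the sketch selects the lowest predicted values instead.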

penalty.par.val

character or numeric. Should the model be selected with the lambda yielding minimum cross-validated (cv) error ("lambda.min"), or with the lambda giving a cv error within 1 standard error of the minimum cv error ("lambda.1se")? Alternatively, a numeric value may be specified, corresponding to one of the values of lambda in the sequence used by glmnet.
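
The effect of the penalty parameter can be seen by computing importances under both criteria and comparing the results. The sketch below assumes the pre package is installed and reuses the airquality data from the Examples section; the object names airq, imp_1se and imp_min are chosen here for illustration.

```r
library("pre")

set.seed(42)
airq <- airquality[complete.cases(airquality), ]
airq.ens <- pre(Ozone ~ ., data = airq)

# Importances under the default, more heavily penalized model
imp_1se <- importance(airq.ens, penalty.par.val = "lambda.1se", plot = FALSE)

# Importances under the model with minimum cv error
imp_min <- importance(airq.ens, penalty.par.val = "lambda.min", plot = FALSE)

# Compare how many base learners are listed in each case
nrow(imp_1se$baseimps)
nrow(imp_min$baseimps)
```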

round

integer. Number of decimal places to round numeric results to. If NA (default), no rounding is performed.

plot

logical. Should variable importances be plotted?

ylab

character string. Plotting label for y-axis. Only used when plot = TRUE.

main

character string. Main title of the plot. Only used when plot = TRUE.

diag.xlab

logical. Should variable names be printed diagonally (that is, at a 45-degree angle)? Alternatively, variable names may be printed vertically by specifying diag.xlab = FALSE, las = 2.

diag.xlab.hor

numeric. Horizontal adjustment for lining up variable names with bars in the plot if variable names are printed diagonally.

diag.xlab.vert

positive integer. Vertical adjustment for position of variable names, if printed diagonally. Corresponds to the number of character spaces added after variable names.

cex.axis

numeric. The magnification to be used for axis annotation relative to the current setting of cex.

...

further arguments to be passed to barplot (only used when plot = TRUE).

Value

A list with two data frames: $baseimps, giving the importances for baselearners in the ensemble, and $varimps, giving the importances for all predictor variables.
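
A minimal sketch of working with the returned list, assuming the pre package is installed. It suppresses the plot and inspects both components; the object names airq and imps are chosen here for illustration, and str() is used so that no particular column layout is assumed.

```r
library("pre")

set.seed(42)
airq <- airquality[complete.cases(airquality), ]
airq.ens <- pre(Ozone ~ ., data = airq)

# Store the result without drawing the bar plot
imps <- importance(airq.ens, plot = FALSE)

# Names of the list components
names(imps)

# Structure of the variable importances
str(imps$varimps)

# First few baselearner importances
head(imps$baseimps)
```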

Examples

# NOT RUN {
library("pre")
set.seed(42)
# fit a prediction rule ensemble on the complete cases of airquality:
airq.ens <- pre(Ozone ~ ., data = airquality[complete.cases(airquality), ])
# calculate global importances:
importance(airq.ens)
# calculate local importances (default: over 25% highest predicted values):
importance(airq.ens, global = FALSE)
# calculate local importances (custom: over 25% lowest predicted values):
importance(airq.ens, global = FALSE, quantprobs = c(0, .25))
# }