NeuralNetTools (version 1.0.0)

garson: Variable importance for neural networks

Description

Relative importance of input variables in neural networks using Garson's algorithm

Usage

garson(mod_in, out_var, ...)

## S3 method for class 'numeric':
garson(mod_in, out_var, struct, bar_plot = TRUE,
  x_lab = NULL, y_lab = NULL, wts_only = FALSE, ...)

## S3 method for class 'nnet':
garson(mod_in, out_var, bar_plot = TRUE, x_lab = NULL,
  y_lab = NULL, wts_only = FALSE, ...)

## S3 method for class 'mlp':
garson(mod_in, out_var, bar_plot = TRUE, x_lab = NULL,
  y_lab = NULL, wts_only = FALSE, ...)

## S3 method for class 'nn':
garson(mod_in, out_var, bar_plot = TRUE, x_lab = NULL,
  y_lab = NULL, wts_only = FALSE, ...)

## S3 method for class 'train':
garson(mod_in, out_var, bar_plot = TRUE, x_lab = NULL,
  y_lab = NULL, wts_only = FALSE, ...)

Arguments

mod_in
input model object or numeric vector of weights to be evaluated. The input can be an object of class numeric, nnet, mlp, nn, or train.
out_var
chr string indicating the response variable in the neural network object to be evaluated. Only one input is allowed for models with more than one response. Names must be of the form 'Y1', 'Y2', etc. if using numeric values as weights.
...
arguments passed to other methods
struct
numeric vector equal in length to the number of layers in the network. Each number indicates the number of nodes in each layer starting with the input and ending with the output. An arbitrary number of hidden layers can be included (see the sketch after the argument descriptions for how struct relates to the expected number of weights).
bar_plot
logical indicating if a ggplot object is returned (default TRUE), otherwise numeric values are returned
x_lab
chr string of alternative names to be used for explanatory variables in the figure, default is taken from mod_in
y_lab
chr string of alternative names to be used for response variable in the figure, default is taken from out_var
wts_only
logical passed to neuralweights, default FALSE
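
For numeric input, the length of the weight vector must be consistent with struct. The following quick check is only an illustration (the helper n_wts is not part of the package), assuming one bias weight for each hidden and output node, as in the numeric example below:

n_wts <- function(struct) {
  # one weight per connection into each non-input layer, plus one bias weight per node
  sum((head(struct, -1) + 1) * tail(struct, -1))
}
n_wts(c(2, 2, 1))  # 9, the length of wts_in in the examples below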

Value

  • A ggplot object for plotting if bar_plot = TRUE, otherwise a data.frame of relative importance values for each input variable.

Details

The weights that connect variables in a neural network are partially analogous to parameter coefficients in a standard regression model and can be used to describe relationships between variables. The weights dictate the relative influence of information that is processed in the network such that input variables that are not relevant in their correlation with a response variable are suppressed by the weights. The opposite effect is seen for weights assigned to explanatory variables that have strong, positive associations with a response variable. An obvious difference between a neural network and a regression model is that the number of weights is excessive in the former case. This characteristic is advantageous in that it makes neural networks very flexible for modeling non-linear functions with multiple interactions, although it also makes the effects of specific variables challenging to interpret.

A method described in Garson 1991 (also see Goh 1995) identifies the relative importance of explanatory variables for specific response variables in a supervised neural network by deconstructing the model weights. The basic idea is that the relative importance (or strength of association) of a specific explanatory variable for a specific response variable can be determined by identifying all weighted connections between the nodes of interest. That is, all weights connecting the specific input node that pass through the hidden layer to the specific response variable are identified. This is repeated for all other explanatory variables until the analyst has a list of all weights that are specific to each input variable. The connections are tallied for each input node and scaled relative to all other inputs. A single value is obtained for each explanatory variable that describes the relationship with the response variable in the model (see the appendix in Goh 1995 for a more detailed description). The original algorithm presented in Garson 1991 indicated relative importance as the absolute magnitude from zero to one, such that the direction of the response could not be determined.
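
As a rough illustration of the calculation (not the package's internal code), the sketch below applies Garson's algorithm to a network with a single hidden layer and one output. The names garson_by_hand, W, and v are assumptions for this example: W holds the input-to-hidden weights (inputs in rows, hidden nodes in columns) and v the hidden-to-output weights, with bias weights excluded.

garson_by_hand <- function(W, v) {
  contrib <- sweep(abs(W), 2, abs(v), '*')          # |input-hidden weight| * |hidden-output weight|
  rel <- sweep(contrib, 2, colSums(contrib), '/')   # scale contributions within each hidden node
  imp <- rowSums(rel)                               # tally across hidden nodes for each input
  imp / sum(imp)                                    # rescale so importances sum to one
}

W <- matrix(c(1.49, 0.16, -0.19, -0.16), nrow = 2)  # two inputs (rows), two hidden nodes (columns); illustrative values
v <- c(-0.52, 0.81)                                 # hidden-to-output weights
garson_by_hand(W, v)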

References

Garson, G.D. 1991. Interpreting neural network connection weights. Artificial Intelligence Expert. 6(4):46-51.

Goh, A.T.C. 1995. Back-propagation neural networks for modeling complex systems. Artificial Intelligence in Engineering. 9(3):143-151.

Olden, J.D., Jackson, D.A. 2002. Illuminating the 'black-box': a randomization approach for understanding variable contributions in artificial neural networks. Ecological Modelling. 154:135-150.

Examples

## using numeric input

wts_in <- c(13.12, 1.49, 0.16, -0.11, -0.19, -0.16, 0.56, -0.52, 0.81)
struct <- c(2, 2, 1) #two inputs, two hidden, one output

garson(wts_in, 'Y1', struct)

## using nnet

library(nnet)

data(neuraldat)
set.seed(123)

mod <- nnet(Y1 ~ X1 + X2 + X3, data = neuraldat, size = 5)

garson(mod, 'Y1')

## using RSNNS, no bias layers

library(RSNNS)

x <- neuraldat[, c('X1', 'X2', 'X3')]
y <- neuraldat[, 'Y1']
mod <- mlp(x, y, size = 5)

garson(mod, 'Y1')

## using neuralnet

library(neuralnet)

mod <- neuralnet(Y1 ~ X1 + X2 + X3, data = neuraldat, hidden = 5)

garson(mod, 'Y1')

## using caret

library(caret)

mod <- train(Y1 ~ X1 + X2 + X3, method = 'nnet', data = neuraldat, linout = TRUE)

garson(mod, 'Y1')
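
The relative importance values can also be returned directly rather than plotted; a brief example, reusing the model fit above:

## return relative importance values instead of a plot
garson(mod, 'Y1', bar_plot = FALSE)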
