Pairwise Plots and Variable Importance Plot for Ada
The pairs command produces pairwise plots of the data. The upper panel
of the pairwise plots colors the observations by observed class
membership (if membership is provided). The lower panel of the pairwise plots
colors the observations by predicted class. In addition, the plotting
symbol is scaled by the class probability estimate from adaboost.
The varplot command produces a variable importance plot using the
improvement criterion given in the reference (Hastie et al., 2001, p. 332). This
is a standard measure of variable importance.
"pairs"(x, train.data = NULL, vars = NULL, maxvar = 10, test.x = NULL, test.y = NULL, test.only = FALSE, col = c(2,4), pch = c(1,2), ...)
varplot(x, plot.it = TRUE, type = c("none","scores"), max.var.show = 30, ...)
- x: object generated by ada.
- train.data: the data.frame of the original data used to train the classifier. The names of this data.frame must match the variable names in the object generated by ada. train.data is only used by the pairs command. Default = NULL.
- vars: a vector of variables to include in the plot. Each variable number must correspond to a specific column in train.data. For example, vars = c(1,2) generates a plot for the first two columns of train.data. Note: vars is only used by the pairs command. Default = NULL.
- maxvar: the maximum number of variables for the pairwise plot. If maxvar = 5, then varplot chooses the five most important variables and places them in descending order in the plot. maxvar is only used by the pairs command. Default = 10.
- test.x: an option to plot pairwise descriptors for a test data set. test.x should be of type data.frame. test.x is only used by the pairs command. Default = NULL.
- test.y: the corresponding response for the test data set. If test.y is not specified, the symbols for the test data in the pairwise plots are black; training data are colored by class. test.y is only used by the pairs command. Default = NULL.
- test.only: if TRUE, provides pairwise plots for the test data only. If test.y is not specified, test.only is ignored. test.only is only used by the pairs command. Default = FALSE.
- col: colors for the plot symbols, one for each class. Default col = c(2,4) (i.e. red and blue).
- pch: plotting symbols, one for each class. Default pch = c(1,2) (i.e. circle and triangle).
- ...: arguments to be passed to pairs.default. Do not set the upper and lower panels. This is only used by the pairs command.
- plot.it: if TRUE, provides a plot of frequencies for each variable. plot.it is only used by the varplot command. Default = TRUE.
- type: if type = "none", nothing is returned. If type = "scores", the frequencies are returned. Default = "none".
- max.var.show: if plot.it is TRUE, this controls the number of variables shown in the plot. Default = 30.
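As a rough illustration of how the pairs arguments fit together, the sketch below fits a small model on the two-class subset of iris and overlays a held-out test set. The data set, the seed, the 60/40 split, and the ada settings (iter = 20, nu = 1, type = "discrete") are assumptions chosen for illustration, not recommended values.

```r
library(ada)

# Two-class subset of iris (setosa dropped); an illustrative data set only.
iris2 <- iris[iris$Species != "setosa", ]
iris2$Species <- factor(iris2$Species)  # drop the unused "setosa" level

# Hypothetical 60/40 train/test split.
set.seed(1)
n     <- nrow(iris2)
trind <- sample(seq_len(n), floor(0.6 * n))
teind <- setdiff(seq_len(n), trind)

# Fit the boosted classifier on the training rows.
fit <- ada(Species ~ ., data = iris2[trind, ], iter = 20, nu = 1,
           type = "discrete")

# Pairwise plot of the two most important variables, with test data added.
# train.data and test.x hold predictors only (column 5 is the response).
pairs(fit, train.data = iris2[trind, -5], maxvar = 2,
      test.x = iris2[teind, -5], test.y = iris2[teind, 5])
```

Passing vars = c(1, 2) instead of maxvar would plot the first two columns of train.data regardless of their importance.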
The varplot command provides a sense of variable importance: the more frequently a variable is selected for boosting, the more likely the variable contains useful information for classification. Pairwise interactions of important variables can then be visualized using the pairs command. Note: The pairs command calls the varplot command.
- If type = "scores", the frequencies for each variable are returned by the varplot command.
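A minimal sketch of retrieving the scores without drawing the plot is shown below. The data set and the ada settings are illustrative assumptions, and the exact structure of the returned object (assumed here to be a numeric vector of per-variable scores) should be checked against the installed version of the package.

```r
library(ada)

# Two-class subset of iris (setosa dropped); an illustrative data set only.
iris2 <- iris[iris$Species != "setosa", ]
iris2$Species <- factor(iris2$Species)  # drop the unused "setosa" level

set.seed(1)
fit <- ada(Species ~ ., data = iris2, iter = 20, nu = 1, type = "discrete")

# Return the importance scores and suppress the plot.
scores <- varplot(fit, plot.it = FALSE, type = "scores")
print(scores)
```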
This plot was designed as a tool to use with adaboost. Please send any comments or suggestions for improvement to the authors.
Culp, M., Johnson, K., Michailidis, G. (200X). ada: an R Package for Boosting. Journal of Statistical Software, (XX)XX.