plot: Plot of VSURF results

Description

This function plots 4 graphs illustrating VSURF results.

Usage

## S3 method for class 'VSURF':
plot(x, nvar.imp=NULL, nvar.sd=NULL, var.names=FALSE, ...)
## S3 method for class 'VSURF.thres':
plot(x, nvar.imp=NULL, nvar.sd=NULL, imp=TRUE,
     imp.sd=TRUE, var.names=FALSE, ...)
## S3 method for class 'VSURF.interp':
plot(x, var.names=FALSE, ...)
## S3 method for class 'VSURF.pred':
plot(x, var.names=FALSE, ...)

Arguments

x

An object of class VSURF, VSURF.thres, VSURF.interp or VSURF.pred, which is the result of the VSURF function (or, respectively, of VSURF.thres, VSURF.interp or VSURF.pred).

nvar.imp

The number of variables to be kept for the variable importance (VI) mean plot (top left graph).

nvar.sd

The number of variables to be kept for the VI standard deviation plot (top right graph).

imp

If TRUE (default), the VI mean is plotted; if FALSE, it is not.

imp.sd

If TRUE (default), the VI standard deviation is plotted; if FALSE, it is not.

var.names

If FALSE (default), the x-axis ticks are the ranks given by sorting the VI means; if TRUE, they are the variable names.

...

Arguments to be passed to par (they affect all plots).

Details

The two graphs in the top row correspond to the "thresholding step" (only these two graphs are plotted by the plot.VSURF.thres function). The top left graph plots the mean variable importance in decreasing order (black curve). The red horizontal line represents the value of the threshold.

The top right graph plots the standard deviation of variable importance, with variables ordered by decreasing mean variable importance (black curve). The green line represents the predictions given by a CART tree fitted to the black curve (the standard deviations). Finally, the dotted horizontal red line represents the minimum value of the CART predictions, which is the value of the threshold.

The bottom left graph corresponds to the "interpretation step" (only this graph is plotted by the plot.VSURF.interp function). It plots the mean out-of-bag (OOB) error rate of embedded random forest models (from the one with only one variable as predictor to the one with all variables kept after the "thresholding step"). The vertical red line indicates the retained model.

The bottom right graph corresponds to the "prediction step" (only this graph is plotted by the plot.VSURF.pred function). It plots the mean OOB error rate of embedded random forest models (the difference here being that variables are added to the model in a stepwise manner). The retained model is the final one.
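The construction of the top right graph can be illustrated with a minimal sketch. It is not VSURF's internal code: the importance values are synthetic, the names imp.mean, imp.sd, d, fit, pred and threshold are hypothetical, and the rpart package with its default settings is assumed as the CART implementation. The sketch fits a tree to the standard deviation curve and takes the minimum fitted value as the threshold.

```r
# Synthetic illustration only: none of these numbers come from VSURF.
library(rpart)  # CART implementation (assumed here; a recommended R package)

set.seed(1)
p <- 60

# Hypothetical mean importances, sorted in decreasing order:
# 10 informative variables followed by 50 near-zero ones.
imp.mean <- sort(c(runif(10, 0.02, 0.10), runif(p - 10, 0, 0.002)),
                 decreasing = TRUE)

# Hypothetical standard deviations, kept in the same (mean-importance) order.
imp.sd <- imp.mean * runif(p, 0.1, 0.3) + 1e-4

# Fit a regression tree to the standard deviation curve (sd against rank).
d <- data.frame(rank = seq_len(p), sd = imp.sd)
fit <- rpart(sd ~ rank, data = d)

# Piecewise-constant fitted values: the analogue of the green line.
pred <- predict(fit)

# Minimum fitted value: the analogue of the dotted red line.
threshold <- min(pred)

# Draw the three elements described above.
plot(d$rank, d$sd, type = "l",
     xlab = "variables", ylab = "VI standard deviation")
lines(d$rank, pred, col = "green")
abline(h = threshold, col = "red", lty = "dotted")
```

Because a regression tree predicts a constant on each segment of the rank axis, the minimum prediction is the average standard deviation over the flattest, least important group of variables.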

References

Genuer, R., Poggi, J.M. and Tuleau-Malot, C. (2010), Variable selection using random forests, Pattern Recognition Letters 31(14), 2225-2236.

Examples

library(VSURF)
data(iris)
iris.vsurf <- VSURF(x=iris[,1:4], y=iris[,5])
plot(iris.vsurf)
plot(iris.vsurf, var.names=TRUE)

# A more interesting example with the toys data (see ?toys)
# (takes a few minutes to run), also using the intermediate functions
data(toys)
toys.vsurf <- VSURF(x=toys$x, y=toys$y)
plot(toys.vsurf)
plot(toys.vsurf, nvar.imp=50, nvar.sd=50)
toys.thres <- VSURF.thres(x=toys$x, y=toys$y)
plot(toys.thres)
par(mfrow=c(1,1))
plot(toys.thres, nvar.imp=70, imp.sd=FALSE)
toys.interp <- VSURF.interp(x=toys$x, y=toys$y, vars=toys.thres$varselect.thres)
plot(toys.interp, var.names=TRUE)
toys.pred <- VSURF.pred(x=toys$x, y=toys$y, err.interp=toys.interp$err.interp,
                        varselect.interp=toys.interp$varselect.interp)
plot(toys.pred, var.names=TRUE)