VSURF(x, y, ntree=500,
      mtry=if (!is.factor(y)) max(floor(ncol(x)/3), 1)
           else floor(sqrt(ncol(x))),
      nfor.thres=50, nmin=1, nfor.interp=25, nsd=1, nfor.pred=25, nmj=1)
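ntree, mtry: see randomForest.

nsd: the number of times the standard deviation of the minimum value of
err.interp is multiplied.

VSURF returns an object of class VSURF, which is a list with the following
components:

varselect.thres: the variables selected by the thresholding step.

ord.imp: a list in which $x contains the mean importances sorted in
decreasing order and $ix contains the indexes of the variables.

nfor.thres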
random forests are computed using the function randomForest with the
argument importance=TRUE. The variables are then sorted according to
their mean variable importance (VI), in decreasing order. This order is
kept throughout the procedure. Next, a threshold is computed: min.thres,
the minimum predicted value of a pruned CART tree fitted to the curve of
the standard deviations of VI. Finally, the actual "thresholding step" is
performed: only variables with a mean VI larger than nmin * min.thres
are kept.

nfor.interp
nested random forest models are grown, starting with the random forest
built with only the most important variable and ending with the one
built with all variables selected in the first step. Then err.min, the
minimum mean out-of-bag (OOB) error of these models, and its associated
standard deviation sd.min are computed. Finally, the smallest model (and
hence its corresponding variables) having a mean OOB error less than
err.min + nsd * sd.min is selected.

mean.jump
is the mean jump value. It is calculated using the variables that were
left out by the second step, and is set as the mean absolute difference
between the mean OOB errors of one model and the model that immediately
follows it. A variable is then included in the model if the decrease in
mean OOB error is larger than nmj * mean.jump.

See also: plot.VSURF, summary.VSURF, VSURF.thres, VSURF.interp, VSURF.pred
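To make the thresholding step concrete, here is a minimal sketch in R. It is not the package's implementation: it assumes simulated VI values in place of real randomForest runs, the rpart settings and the pruning level (cp = 0.05) are arbitrary choices made for illustration, and names such as vi, ord and kept are hypothetical.

```r
library(rpart)  # recommended package shipped with R

set.seed(1)
n_var <- 12       # hypothetical number of variables
n_forests <- 20   # plays the role of nfor.thres
nmin <- 1

# Simulated VI values: one row per forest, one column per variable.
# The first 4 variables are informative; the others have VI close to zero.
true_vi <- c(10, 8, 5, 3, rep(0, n_var - 4))
true_sd <- c(1.0, 0.8, 0.6, 0.4, rep(0.05, n_var - 4))
vi <- sapply(seq_len(n_var),
             function(j) rnorm(n_forests, true_vi[j], true_sd[j]))

# Sort the variables by mean VI, in decreasing order.
ord <- order(colMeans(vi), decreasing = TRUE)
vi_mean <- colMeans(vi)[ord]
vi_sd <- apply(vi, 2, sd)[ord]

# Fit a CART tree to the curve of standard deviations, then prune it.
curve_df <- data.frame(pos = seq_len(n_var), vi_sd = vi_sd)
tree <- rpart(vi_sd ~ pos, data = curve_df,
              control = rpart.control(cp = 0, minsplit = 2, xval = 0))
pruned <- prune(tree, cp = 0.05)  # assumed pruning level

# The threshold is the minimum value predicted by the pruned tree.
min_thres <- min(predict(pruned, newdata = curve_df))

# Keep only the variables whose mean VI exceeds nmin * min_thres.
kept <- ord[vi_mean > nmin * min_thres]
kept
```

With these simulated values the four informative variables are retained, while the noise variables are expected to be discarded because their mean VI stays well below min_thres.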
library(VSURF)
data(iris)
iris.vsurf <- VSURF(x=iris[,1:4], y=iris[,5], ntree=100, nfor.thres=20,
                    nfor.interp=10, nfor.pred=10)
iris.vsurf
# A more interesting example with toys data (see ?toys)
# (less than 1 min to execute)
data(toys)
toys.vsurf <- VSURF(x=toys$x, y=toys$y)
toys.vsurf
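The selection rules of the second and third steps can also be illustrated without growing any forest. The sketch below uses made-up OOB errors, so every number in it is hypothetical. It also makes two simplifying assumptions that the text above does not confirm: the mean jump is taken over the models from the one selected in the second step onward, and the errors of the nested models stand in for the errors of the models that the stepwise procedure would actually refit.

```r
# Hypothetical mean OOB errors of the nested models built from the
# 1, 2, ..., 7 most important variables, with their standard deviations.
err_mean <- c(0.30, 0.18, 0.175, 0.115, 0.10, 0.11, 0.105)
err_sd <- rep(0.02, 7)
nsd <- 1
nmj <- 1

# Second step: smallest model with a mean OOB error below
# err.min + nsd * sd.min.
best <- which.min(err_mean)
err_min <- err_mean[best]
sd_min <- err_sd[best]
n_interp <- min(which(err_mean < err_min + nsd * sd_min))

# Third step: mean absolute difference between consecutive errors,
# taken here from the selected model onward (an assumption).
mean_jump <- mean(abs(diff(err_mean[n_interp:length(err_mean)])))

# Stepwise inclusion: a variable enters only if it lowers the error
# by more than nmj * mean_jump.
selected <- 1
err_current <- err_mean[1]
for (i in seq_len(n_interp)[-1]) {
  if (err_current - err_mean[i] > nmj * mean_jump) {
    selected <- c(selected, i)
    err_current <- err_mean[i]
  }
}
selected
```

Here the second step keeps the first four variables, the mean jump is 0.01, and the third step retains variables 1, 2 and 4: the third variable lowers the error by only 0.005, which is not larger than nmj * mean_jump.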