This function shows the cross-validated prediction performance of models with a sequentially reduced number of predictors (ranked by variable importance) via a nested cross-validation procedure.
rfcv(trainx, trainy, cv.fold=5, scale="log", step=0.5,
mtry=function(p) max(1, floor(sqrt(p))), recursive=FALSE, ...)
trainx: matrix or data frame containing columns of predictor variables
trainy: vector of responses; must have length equal to the number of rows in trainx
cv.fold: number of folds in the cross-validation
scale: if "log", reduce a fixed proportion (step) of variables at each step; otherwise reduce step variables at a time
step: if scale="log", the fraction of variables to remove at each step; otherwise, remove this many variables at a time
mtry: a function of the number of remaining predictor variables, used as the mtry parameter in the randomForest call
recursive: whether variable importance is (re-)assessed at each step of variable reduction
...: other arguments passed on to randomForest
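The way scale, step and mtry interact can be sketched in base R. Here reduction_schedule is a hypothetical helper, not a function exported by randomForest, and its rounding and termination rules are assumptions about how such a schedule might be built rather than a description of rfcv's internals. The mtry_default function is the default from the usage line; mtry_third is an illustrative alternative only.

```r
# Hypothetical helper (not part of randomForest): list the numbers of
# predictors that would be tried, starting from p and ending at 1.
reduction_schedule <- function(p, scale = "log", step = 0.5) {
  if (scale == "log") {
    # Shrink p geometrically by the factor `step` (assumed rule).
    k <- floor(log(p, base = 1 / step))
    n.var <- round(p * step^(seq_len(k) - 1))
    n.var <- unique(n.var)            # drop repeats created by rounding
  } else {
    # Linear scale: remove `step` variables at a time.
    n.var <- seq(from = p, to = 1, by = -abs(step))
  }
  if (!1 %in% n.var) n.var <- c(n.var, 1)  # finish with a single predictor
  n.var
}

# mtry is a function of the number of predictors still in the model.
mtry_default <- function(p) max(1, floor(sqrt(p)))  # default from the usage
mtry_third   <- function(p) max(1, floor(p / 3))    # illustrative alternative

sizes <- reduction_schedule(100, scale = "log", step = 0.5)
data.frame(n.var = sizes,
           mtry_default = sapply(sizes, mtry_default),
           mtry_third = sapply(sizes, mtry_third))
```

A custom function such as mtry_third would be supplied as rfcv(trainx, trainy, mtry=mtry_third).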
A list with the following components:
list(n.var=n.var, error.cv=error.cv, predicted=cv.pred)
n.var: vector of the number of variables used at each step
error.cv: corresponding vector of error rates (classification) or MSEs (regression) at each step
predicted: list with one component per entry of n.var, each containing the predicted values from the cross-validation
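One possible way to use the n.var and error.cv components is to pick the smallest model whose cross-validated error is at, or close to, the minimum. The sketch below is only an illustration: smallest_good_n_var is a hypothetical helper, and the toy list holds invented placeholder values with the same layout as the returned list, not actual rfcv output.

```r
# Toy stand-in with the same n.var / error.cv layout as the returned list.
# The numbers are invented for illustration; they are not rfcv output.
toy <- list(n.var    = c(100, 50, 25, 12, 6, 3, 1),
            error.cv = c(0.08, 0.07, 0.05, 0.05, 0.04, 0.04, 0.30))

# Hypothetical helper: the smallest number of predictors whose CV error
# is within `tol` of the minimum CV error.
smallest_good_n_var <- function(res, tol = 0) {
  ok <- res$error.cv <= min(res$error.cv) + tol
  min(res$n.var[ok])
}

smallest_good_n_var(toy)
```

With a real result, the same call would be smallest_good_n_var(result). Because the fold assignment is random, averaging error.cv over several replicate runs before choosing is likely to give a more stable answer.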
Svetnik, V., Liaw, A., Tong, C. and Wang, T., "Application of Breiman's Random Forest to Modeling Structure-Activity Relationships of Pharmaceutical Molecules", MCS 2004, Roli, F. and Windeatt, T. (Eds.), pp. 334-343.
library(randomForest)

## Add 96 columns of uniform noise to the four iris predictors.
set.seed(647)
myiris <- cbind(iris[1:4], matrix(runif(96 * nrow(iris)), nrow(iris), 96))
result <- rfcv(myiris, iris$Species, cv.fold=3)
with(result, plot(n.var, error.cv, log="x", type="o", lwd=2))

## The following can take a while to run, so if you really want to try
## it, copy and paste the code into R.
result <- replicate(5, rfcv(myiris, iris$Species), simplify=FALSE)
error.cv <- sapply(result, "[[", "error.cv")
matplot(result[[1]]$n.var, cbind(rowMeans(error.cv), error.cv), type="l",
        lwd=c(2, rep(1, ncol(error.cv))), col=1, lty=1, log="x",
        xlab="Number of variables", ylab="CV Error")