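rfPredVar(random.forest, rf.data, pred.data = rf.data, CI = FALSE, tree.type = "rf", prog.bar = FALSE)
random.forest: a random forest trained with keep.inbag=TRUE. See details for more information.
rf.data: the data used to train the random forest
pred.data: the data to predict on; defaults to rf.data if not given
prog.bar: whether to show a progress bar (only applies when tree.type='ci')
The random forest trained with keep.inbag=TRUE is supplied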
only for the purpose of defining the resampling scheme. The function builds
a new random forest based on the tree.type setting. However, the
resamples are maintained identically to the supplied random forest. This
allows for direct comparison of the tree methods without having to account
for variation in resampling.
Currently, the conditional inference (CI) methods are much more computationally
intensive because there is no C implementation of CI random forests that reports
the number of times each sample is included in each resample. To carry out
our simulations using $V_{IJ}^B$, we had to use a pure R implementation of
CI random forests. This is not the case for CART random forests, where a C
implementation already exists in the randomForest package. Note, however,
that the difference in computation time comes from the random forest creation
step, not from the computation of $V_{IJ}^B$. This should no longer be an
issue once a C implementation of CI random forests is available.
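To make the role of the inclusion counts concrete, here is a minimal base R sketch of the infinitesimal jackknife variance as it is usually written, $V_{IJ}^B = \sum_{i=1}^{n} \mathrm{Cov}_b(N_{bi}, t_b(x))^2$, where $N_{bi}$ is the number of times observation $i$ appears in resample $b$ and $t_b(x)$ is the prediction of tree $b$ at the point $x$. It assumes this is what $V_{IJ}^B$ denotes here. The function name `vij_sketch` and the toy "trees" (resample means) are hypothetical, no bias correction is applied, and this is not necessarily how rfPredVar computes the quantity internally.

```r
# Infinitesimal jackknife variance for a single prediction point (sketch).
# inbag:      n x B matrix, inbag[i, b] = times observation i is in resample b
# tree_preds: length-B vector of per-tree predictions at the point
vij_sketch <- function(inbag, tree_preds) {
  stopifnot(ncol(inbag) == length(tree_preds))
  B <- ncol(inbag)
  # Center the predictions and the inclusion counts across resamples
  pred_centered <- tree_preds - mean(tree_preds)
  inbag_centered <- inbag - rowMeans(inbag)
  # Covariance over resamples between each observation's count and the predictions
  cov_i <- as.vector(inbag_centered %*% pred_centered) / B
  sum(cov_i^2)
}

# Toy illustration: each "tree" is just the mean of its bootstrap resample
set.seed(1)
n <- 20
B <- 200
y <- rnorm(n)
inbag <- replicate(B, tabulate(sample.int(n, n, replace = TRUE), nbins = n))
tree_preds <- as.vector(crossprod(inbag, y)) / n
v <- vij_sketch(inbag, tree_preds)
```

The only inputs are the count matrix and the per-tree predictions, which is why a forest implementation has to expose how often each sample enters each resample.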
Note: This function does not use the default predict method for forests
produced by cforest. The predictions here are the direct averages of
all tree predictions, instead of using the observation weights. Therefore,
predictions from this function will likely differ from
predict.cforest when using subsampling.
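As an illustration of what "direct averages of all tree predictions" means, the sketch below uses a CART forest from randomForest, whose predict method can return per-tree predictions via predict.all = TRUE. It is only meant to show the averaging step. It does not reproduce the cforest case, where the default predict method uses observation weights and can therefore give different values.

```r
library(randomForest)

d <- na.omit(airquality)
set.seed(42)
rf <- randomForest(Ozone ~ ., data = d, ntree = 200)

# predict.all = TRUE returns the forest prediction ($aggregate)
# and a matrix with one column of predictions per tree ($individual)
p <- predict(rf, newdata = d, predict.all = TRUE)

# Direct average of the per-tree predictions for each observation
direct_avg <- rowMeans(p$individual)

# For a CART regression forest the two should agree up to floating point error
all.equal(unname(direct_avg), unname(p$aggregate))
```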
This function currently only works with regression forests -- not classification forests.
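library(randomForest)
library(RFinfer)  # package assumed to provide rfPredVar
data(airquality)
d <- na.omit(airquality)  # drop rows with missing values
# keep.inbag = TRUE records how often each observation enters each resample
rf <- randomForest(Ozone ~ ., data = d, keep.inbag = TRUE, sampsize = 30, replace = FALSE, ntree = 500)
# prediction variances from a CART forest, with confidence intervals requested
rfPredVar(rf, rf.data = d, CI = TRUE, tree.type = 'rf')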
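The resampling scheme that rfPredVar reuses is stored in the forest itself when keep.inbag = TRUE. As a rough check, and assuming a recent randomForest version in which the inbag component holds inclusion counts, the following sketch inspects that matrix for the same forest settings as above.

```r
library(randomForest)

d <- na.omit(airquality)
set.seed(1)
rf <- randomForest(Ozone ~ ., data = d, keep.inbag = TRUE,
                   sampsize = 30, replace = FALSE, ntree = 500)

# One row per observation, one column per tree
dim(rf$inbag)

# Number of in-bag observations per tree; with replace = FALSE and
# sampsize = 30, each tree is expected to use 30 distinct observations
table(colSums(rf$inbag))
```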