DAAG (version 0.93)

compareTreecalcs: Error rate comparisons for tree-based classification

Description

Compare error rates across different functions and different selection rules, using a random division of the data into training and test sets of approximately equal size.

Usage

compareTreecalcs(x = yesno ~ ., data = spam7, cp = 0.00025,
                 fun = c("rpart", "randomForest"))

Arguments

x
model formula
data
a data frame in which to interpret the variables named in the formula
cp
setting for the cost complexity parameter cp, used by rpart()
fun
one or both of "rpart" and "randomForest"

Value

  • If rpart is specified in fun, the following:
  • rpSEcvI: the estimated cross-validation error rate when rpart() is run on subset I (the training data), and the one-standard-error rule is used
  • rpcvI: the estimated cross-validation error rate when rpart() is run on subset I, and the model that gives the minimum cross-validated error rate is used
  • rpSEtest: the error rate when the model that leads to rpSEcvI is used to make predictions for subset II
  • rptest: the error rate when the model that leads to rpcvI is used to make predictions for subset II
  • nSErule: number of splits required by the one-standard-error rule
  • nREmin: number of splits that gives the minimum error
  • If randomForest is specified in fun, the following:
  • rfcvI: the out-of-bag (OOB) error rate when randomForest() is run on subset I
  • rftest: the error rate when the model that leads to rfcvI is used to make predictions for subset II
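
A call with the defaults shown under Usage might look like the following. This is a minimal sketch that assumes the DAAG, rpart and randomForest packages are installed; it also assumes the result prints as a set of named values matching the list above, which this page does not state explicitly. The seed value is arbitrary.

```r
# Sketch only: assumes DAAG, rpart and randomForest are installed.
library(DAAG)          # provides compareTreecalcs() and the spam7 data
library(rpart)
library(randomForest)

set.seed(29)           # arbitrary seed, so the random split is reproducible

# Run both methods on one random split of spam7
res <- compareTreecalcs(x = yesno ~ ., data = spam7, cp = 0.00025,
                        fun = c("rpart", "randomForest"))

# Assumption: the result prints as named values (rpSEcvI, rpcvI, ...)
print(res)
```

Because the split is random, each call gives somewhat different values, so repeating the call and summarising the results gives a better picture than a single run.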

Details

Data are randomly divided into two subsets, I and II, of approximately equal size. The function(s) are applied in the standard way to subset I, and the error rates that they compute internally (cross-validation for rpart(), out-of-bag for randomForest()) are returned. Predictions are then made for subset II, which gives a completely independent set of error rates.
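
The split-and-score procedure for rpart() can be sketched as below. This is an illustration under stated assumptions, not the DAAG source: `split_and_score` and the names in its result are hypothetical, and the built-in `iris` data stands in for `spam7` so that only the rpart package is needed.

```r
library(rpart)

# Hypothetical helper, NOT the DAAG implementation: split `data` roughly in
# half, fit rpart() on subset I, then score two pruned trees on subset II.
split_and_score <- function(formula, data, cp = 0.00025) {
  n <- nrow(data)
  in_I  <- sample(n, n %/% 2)             # row indices for subset I
  train <- data[in_I, , drop = FALSE]     # subset I (training)
  test  <- data[-in_I, , drop = FALSE]    # subset II (test)

  fit   <- rpart(formula, data = train, method = "class", cp = cp)
  cptab <- fit$cptable

  # Row of the cp table with the minimum cross-validated error
  i_min <- which.min(cptab[, "xerror"])
  # One-standard-error rule: the smallest tree whose cross-validated error
  # is within one standard error of the minimum
  thresh <- cptab[i_min, "xerror"] + cptab[i_min, "xstd"]
  i_se   <- min(which(cptab[, "xerror"] <= thresh))

  yname <- all.vars(formula)[1]           # name of the response variable
  test_error <- function(i) {
    pruned <- prune(fit, cp = cptab[i, "CP"])
    pred   <- predict(pruned, newdata = test, type = "class")
    mean(pred != test[[yname]])           # misclassification rate on II
  }

  c(test_min   = test_error(i_min),
    test_se    = test_error(i_se),
    nsplit_min = cptab[i_min, "nsplit"],
    nsplit_se  = cptab[i_se, "nsplit"])
}

set.seed(1)                               # arbitrary seed
res <- split_and_score(Species ~ ., data = iris)
print(res)
```

Note that the `xerror` column of the rpart cp table is expressed relative to the root-node error. The sketch uses it only to choose which tree to prune to, whereas the test-set figures are plain misclassification rates.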