
The Cross-Validation of Classification and Regression models using Random Forest
rf.cv(xtr, ytr, cv.fold = 5, type = "regression", trees = 500,
mtrysize = 10)
A data frame or a matrix of predictors.
A response vector. If a factor, classification is assumed, otherwise regression is assumed.
The number of cross-validation folds; the default is 5.
The modeling type, either "regression" or "classification".
Number of trees to grow. This should not be set to too small a number, to ensure that every input row gets predicted at least a few times.
Number of variables randomly sampled as candidates at each split. Note that the default values are different for classification (sqrt(p), where p is the number of variables in xtr) and regression (p/3).
If type is "regression", the function returns a list containing four components:
RFpred
- the predicted values of the input data based on cross-validation
Error
- the prediction error for each sample
RMSECV
- Root Mean Square Error for cross-validation
Q2
- R2 for cross-validation
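For reference, the two regression metrics above can be computed from the cross-validated predictions as follows. This is an illustrative sketch in base R, not code from BioMedR; y and pred are made-up toy values.

```r
# Toy data (assumed values, for illustration only)
y    <- c(1.2, 3.4, 2.1, 4.8, 3.3)   # observed responses
pred <- c(1.0, 3.6, 2.5, 4.5, 3.0)   # cross-validated predictions

error  <- y - pred                    # per-sample error
rmsecv <- sqrt(mean(error^2))         # Root Mean Square Error of CV
q2     <- 1 - sum(error^2) / sum((y - mean(y))^2)  # R2 of CV
```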
If type is "classification", the function returns a list containing eight components:
table
- confusion matrix
ACC
- accuracy
SE
- sensitivity
SP
- specificity
F1
- F1 score, the harmonic mean of precision and sensitivity
MCC
- Matthews correlation coefficient
RFPred
- the predicted values
prob
- the predicted probability values
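The classification metrics above are standard functions of the confusion matrix. The sketch below (base R, not code from BioMedR) computes them from assumed toy counts of true positives, false negatives, false positives, and true negatives.

```r
# Toy 2 x 2 confusion-matrix counts (assumed values, for illustration)
tp <- 40; fn <- 10; fp <- 5; tn <- 45

acc <- (tp + tn) / (tp + fn + fp + tn)   # accuracy
se  <- tp / (tp + fn)                    # sensitivity (recall)
sp  <- tn / (tn + fp)                    # specificity
f1  <- 2 * tp / (2 * tp + fp + fn)       # F1 score
mcc <- (tp * tn - fp * fn) /
  sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))  # Matthews CC
```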
rf.cv
implements Breiman's random forest algorithm for classification and regression; here it is used to perform k-fold cross-validation.
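As an illustrative sketch of the k-fold scheme (base R only, not code from BioMedR), each sample is assigned to one of cv.fold folds, and each fold is held out in turn while a forest is fitted on the rest:

```r
# Assumed toy sizes: n samples, k folds
n <- 20; k <- 5
set.seed(1)
fold <- sample(rep(1:k, length.out = n))  # random fold label per sample
for (i in 1:k) {
  test_idx  <- which(fold == i)   # held-out samples for this fold
  train_idx <- which(fold != i)   # samples used to fit the forest
  # fit the model on train_idx, predict test_idx, collect predictions
}
```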
Breiman, L. (2001), Random Forests, Machine Learning 45(1), 5-32.
See pls.cv
for the Cross-Validation of Classification and
Regression models using PLS
# NOT RUN {
training = read.csv(system.file('sysdata/training2.csv', package = 'BioMedR'), header = TRUE)
y = training[, 1]   # response vector (first column)
x = training[, -1]  # predictor matrix (remaining columns)
rf.tr <- rf.cv(x, y)
# }