
The Cross-Validation of Classification and Regression models using Random Forest
rf.cv(xtr, ytr, cv.fold = 5, type = "regression", trees = 500,
mtrysize = 10)
A data frame or a matrix of predictors.
A response vector. If a factor, classification is assumed, otherwise regression is assumed.
The number of cross-validation folds; the default is 5.
The modeling type, either "regression" or "classification".
Number of trees to grow. This should not be set to too small a number, to ensure that every input row gets predicted at least a few times.
Number of variables randomly sampled as candidates at each split. Note that the default values are different for classification (sqrt(p), where p is the number of variables in xtr) and regression (p/3).
If type is "regression", the function returns a list containing four components:
RFpred
- the predicted values of the input data based on cross-validation
Error
- the prediction error for each sample
RMSECV
- Root Mean Square Error for cross-validation
Q2
- R2 for cross-validation
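For reference, the two regression metrics above can be computed from the cross-validated predictions as follows. This is an illustrative sketch in base R, not code from BioMedR; y and pred are made-up toy values.

```r
# Toy data (assumed values, for illustration only)
y    <- c(1.2, 3.4, 2.1, 4.8, 3.3)   # observed responses
pred <- c(1.0, 3.6, 2.5, 4.5, 3.0)   # cross-validated predictions

error  <- y - pred                    # per-sample error
rmsecv <- sqrt(mean(error^2))         # Root Mean Square Error of CV
q2     <- 1 - sum(error^2) / sum((y - mean(y))^2)  # R2 of CV
```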
If type is "classification", the function returns a list containing eight components:
table
- confusion matrix
ACC
- accuracy
SE
- sensitivity
SP
- specificity
F1
- F1 score, the harmonic mean of precision and sensitivity
MCC
- Matthews correlation coefficient
RFPred
- the predicted values
prob
- the predicted probability values
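The classification metrics above are standard functions of the confusion matrix. The sketch below (base R, not code from BioMedR) computes them from assumed toy counts of true positives, false negatives, false positives, and true negatives.

```r
# Toy 2 x 2 confusion-matrix counts (assumed values, for illustration)
tp <- 40; fn <- 10; fp <- 5; tn <- 45

acc <- (tp + tn) / (tp + fn + fp + tn)   # accuracy
se  <- tp / (tp + fn)                    # sensitivity (recall)
sp  <- tn / (tn + fp)                    # specificity
f1  <- 2 * tp / (2 * tp + fp + fn)       # F1 score
mcc <- (tp * tn - fp * fn) /
  sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))  # Matthews CC
```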
rf.cv
implements Breiman's random forest algorithm for classification and regression; here it is used to perform k-fold cross-validation.
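As an illustrative sketch of the k-fold scheme (base R only, not code from BioMedR), each sample is assigned to one of cv.fold folds, and each fold is held out in turn while a forest is fitted on the rest:

```r
# Assumed toy sizes: n samples, k folds
n <- 20; k <- 5
set.seed(1)
fold <- sample(rep(1:k, length.out = n))  # random fold label per sample
for (i in 1:k) {
  test_idx  <- which(fold == i)   # held-out samples for this fold
  train_idx <- which(fold != i)   # samples used to fit the forest
  # fit the model on train_idx, predict test_idx, collect predictions
}
```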
Breiman, L. (2001), Random Forests, Machine Learning 45(1), 5-32.
See pls.cv
for the Cross-Validation of Classification and
Regression models using PLS
# NOT RUN {
training = read.csv(system.file('sysdata/training2.csv', package = 'BioMedR'), header = TRUE)
y = training[, 1]   # response vector (first column)
x = training[, -1]  # predictor matrix (remaining columns)
rf.tr <- rf.cv(x, y)
# }