OOBCurve (version 0.3)

OOBCurve: Out-of-Bag Learning Curve

Description

Computes the out-of-bag (OOB) learning curve of a random forest, that is, the OOB performance as a function of the number of trees, for any measure available in the mlr package.
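To illustrate what an out-of-bag learning curve is, here is a minimal sketch that computes one by hand for a regression forest. It uses ranger directly on the built-in mtcars data and the MSE as the measure. This is only an illustration of the idea under those assumptions, not necessarily how OOBCurve is implemented internally; the variable names (preds, oob, oob_mse) are made up for this sketch.

```r
library(ranger)

set.seed(1)
n_trees <- 50
# keep.inbag = TRUE stores, for every tree, how often each observation was sampled
mod <- ranger(mpg ~ ., data = mtcars, num.trees = n_trees, keep.inbag = TRUE)

# n x n_trees matrix with the prediction of every single tree
preds <- predict(mod, data = mtcars, predict.all = TRUE)$predictions

# n x n_trees logical matrix: TRUE where the observation was out of bag for that tree
oob <- sapply(mod$inbag.counts, function(cnt) cnt == 0)

# Only out-of-bag predictions may contribute to the curve
preds[!oob] <- NA

# Running sum and running count of OOB predictions over the first t trees
cum_sum <- t(apply(preds, 1, function(p) cumsum(ifelse(is.na(p), 0, p))))
cum_n <- t(apply(oob, 1, cumsum))

# OOB prediction after t trees (NaN while an observation has no OOB tree yet)
oob_pred <- cum_sum / cum_n

# OOB mean squared error for forests of size 1, ..., n_trees
oob_mse <- colMeans((oob_pred - mtcars$mpg)^2, na.rm = TRUE)

plot(oob_mse, type = "l", ylab = "oob-mse", xlab = "ntrees")
```

Observations that have not yet been out of bag for any of the first t trees are skipped at step t, which is why the first few points of such a curve are based on fewer observations.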

Usage

OOBCurve(mod, measures = list(auc), task, data)

Arguments

mod

An object of class randomForest or ranger, such as one created by the function randomForest or ranger with the option keep.inbag = TRUE. Alternatively, a randomForest or ranger model trained with train from mlr can be used.

measures

List of mlr performance measure(s) to evaluate. The default is auc only. See the mlr tutorial for a list of the measures available for the corresponding task type.

task

Learning task created by the function makeClassifTask or makeRegrTask of mlr.

data

Original data that was used for training the random forest.

Value

A data frame with one column for each requested measure.

See Also

OOBCurvePars for out-of-bag curves of other parameters.

Examples

# NOT RUN {
library(mlr)
library(ranger)

# Classification
data = getTaskData(sonar.task)
sonar.task = makeClassifTask(data = data, target = "Class")
lrn = makeLearner("classif.ranger", keep.inbag = TRUE, par.vals = list(num.trees = 100))
mod = train(lrn, sonar.task)

# Alternatively use ranger directly
# mod = ranger(Class ~., data = data, num.trees = 100, keep.inbag = TRUE)
# Alternatively use randomForest (requires library(randomForest))
# mod = randomForest(Class ~., data = data, ntree = 100, keep.inbag = TRUE)

# Application of the main function
results = OOBCurve(mod, measures = list(mmce, auc, brier), task = sonar.task, data = data)
# Plot the generated results
plot(results$mmce, type = "l", ylab = "oob-mmce", xlab = "ntrees")
plot(results$auc, type = "l", ylab = "oob-auc", xlab = "ntrees")
plot(results$brier, type = "l", ylab = "oob-brier-score", xlab = "ntrees")

# Regression
data = getTaskData(bh.task)
bh.task = makeRegrTask(data = data, target = "medv")
lrn = makeLearner("regr.ranger", keep.inbag = TRUE, par.vals = list(num.trees = 100))
mod = train(lrn, bh.task)

# Application of the main function
results = OOBCurve(mod, measures = list(mse, mae, rsq), task = bh.task, data = data)
# Plot the generated results
plot(results$mse, type = "l", ylab = "oob-mse", xlab = "ntrees")
plot(results$mae, type = "l", ylab = "oob-mae", xlab = "ntrees")
plot(results$rsq, type = "l", ylab = "oob-rsq", xlab = "ntrees")

# }
