tuneRanger (version 0.5)

tuneRanger: tuneRanger

Description

Automatic tuning of random forests of the ranger package with one line of code.

Usage

tuneRanger(task, measure = NULL, iters = 70, iters.warmup = 30,
  time.budget = NULL, num.threads = NULL, num.trees = 1000,
  parameters = list(replace = FALSE, respect.unordered.factors =
  "order"), tune.parameters = c("mtry", "min.node.size",
  "sample.fraction"), save.file.path = NULL, build.final.model = TRUE,
  show.info = getOption("mlrMBO.show.info", TRUE))

Value

A list with elements

recommended.pars

Recommended hyperparameters.

results

A data.frame with all evaluated hyperparameters and performance and time results for each run.

model

The final model, if build.final.model is set to TRUE.

Arguments

task

The mlr task created by makeClassifTask, makeRegrTask or makeSurvTask.

measure

Performance measure to evaluate/optimize. Default is the Brier score for classification and mean squared error (MSE) for regression. It can be changed to accuracy, AUC or logarithmic loss by setting it to list(acc), list(auc) or list(logloss); see the sketch after this argument list. Other performance measures from mlr can be looked up in the mlr tutorial.

iters

Number of iterations. Default is 70.

iters.warmup

Number of iterations for the warmup. Default is 30.

time.budget

Running time budget in seconds. Note that the actual MBO run can take longer, since the budget is only checked after each iteration. The default NULL means there is no time budget.

num.threads

Number of threads. Default is the number of available CPUs.

num.trees

Number of trees. Default is 1000.

parameters

Optional list of fixed named parameters that should be passed to ranger.

tune.parameters

Optional character vector of parameters that should be tuned. Default is mtry, min.node.size and sample.fraction. Additionally, replace and respect.unordered.factors can be included in the tuning process.

save.file.path

File in the current working directory to which interim results are saved (e.g. "optpath.RData"). Default is NULL, which means results are not saved. If a file is specified and an iteration fails, the algorithm can be restarted with restartTuneRanger.

build.final.model

[logical(1)]
Should the best found model be fitted on the complete dataset? Default is TRUE.

show.info

Verbose mlrMBO output on console? Default is TRUE.
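
For illustration, a minimal sketch of a call that overrides measure, parameters, tune.parameters and save.file.path; it assumes a binary classification task, here mlr's built-in example task sonar.task:

# Sketch only: sonar.task is the binary classification example task shipped with mlr
library(tuneRanger)
library(mlr)
res = tuneRanger(sonar.task, measure = list(auc), num.trees = 500,
  parameters = list(replace = FALSE, respect.unordered.factors = "order"),
  tune.parameters = c("mtry", "min.node.size", "sample.fraction", "replace"),
  save.file.path = "optpath.RData")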

Details

Model-based optimization is used as the tuning strategy and the three parameters min.node.size, sample.fraction and mtry are tuned simultaneously. Out-of-bag predictions are used for evaluation, which makes it much faster than other packages and tuning strategies that use, for example, 5-fold cross-validation. Classification as well as regression is supported. The measure that should be optimized can be chosen from the list of measures in mlr; see the mlr tutorial.
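
The measures that mlr considers applicable to a given task can be listed with mlr's listMeasures(); the corresponding measure objects (e.g. acc, logloss) can then be passed via the measure argument. A small sketch using the built-in iris.task:

library(mlr)
# List the IDs of all measures that are applicable to the built-in iris.task
listMeasures(iris.task)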

See Also

estimateTimeTuneRanger for time estimation and restartTuneRanger for continuing the algorithm if there was an error.
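
A hedged sketch of resuming a run that was saved to a file and then failed (the task and measure are assumed to be the same ones used in the original call):

# Sketch only: resumes the run stored in "optpath.RData"
res = restartTuneRanger("optpath.RData", iris.task, measure = list(multiclass.brier))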

Examples

if (FALSE) {
library(tuneRanger)
library(mlr)

# An mlr task has to be created in order to use the package
data(iris)
iris.task = makeClassifTask(data = iris, target = "Species")
 
# Estimate runtime
estimateTimeTuneRanger(iris.task)
# Tuning
res = tuneRanger(iris.task, measure = list(multiclass.brier), num.trees = 1000, 
  num.threads = 2, iters = 70, save.file.path = NULL)
  
# Mean of best 5 % of the results
res
# Model with the new tuned hyperparameters
res$model
# Prediction
predict(res$model, newdata = iris[1:10, ])
}
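
Regression is supported as well; a minimal sketch using mlr's built-in Boston housing task bh.task, for which the default measure is mse:

if (FALSE) {
# Regression sketch: the default measure for regression tasks is mse
res.regr = tuneRanger(bh.task, num.trees = 500, num.threads = 2, iters = 70)
res.regr
}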
