tuneRanger v0.5

Tune Random Forest of the 'ranger' Package

Tuning random forest with one line. The package is mainly based on the packages 'ranger' and 'mlrMBO'.

tuneRanger: A package for tuning random forests

Philipp Probst

Installation

The development version

devtools::install_github("PhilippPro/tuneRanger")

CRAN

install.packages("tuneRanger")

Description

tuneRanger is a package for automatic tuning of random forests with one line of code. It is intended for users who want to get the best out of their random forest model.

Model-based optimization is used as the tuning strategy, and the three parameters min.node.size, sample.fraction and mtry are tuned simultaneously. Out-of-bag predictions are used for evaluation, which makes the package much faster than other packages and tuning strategies that use, for example, 5-fold cross-validation. Both classification and regression are supported.
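
The out-of-bag idea can be illustrated with ranger directly. The following is a minimal sketch, not tuneRanger's internal code: it fits a single forest with hand-picked (purely illustrative) values for the three tuned parameters and reads the out-of-bag error that ranger reports, so no separate cross-validation loop is needed to score one configuration.

```r
library(ranger)

set.seed(1)

# One forest with a fixed configuration of the three tuned hyperparameters.
# The concrete values are assumptions chosen only for illustration.
fit <- ranger(
  Species ~ ., data = iris,
  num.trees = 500,
  mtry = 2,
  min.node.size = 5,
  sample.fraction = 0.7,
  replace = FALSE,
  probability = TRUE  # probability forest: the OOB error is a Brier score
)

# Out-of-bag error: every observation is scored only by the trees
# that did not use it for training
fit$prediction.error

# Out-of-bag class probabilities, one row per observation
head(fit$predictions)
```

A tuner then only has to repeat this fit for different parameter values and compare the out-of-bag errors, which is where the speed advantage over repeated k-fold refitting comes from.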

The measure that should be optimized can be chosen from the list of measures in mlr: https://mlr-org.github.io/mlr/articles/measures.html
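
To make concrete what such a measure computes, here is a base R sketch of a multiclass Brier score, the measure used in the quickstart below. It assumes the common definition (mean over observations of the squared difference between predicted class probabilities and the one-hot encoded truth); multiclass_brier is a hypothetical helper written for this illustration, and in practice the mlr measure should be passed to tuneRanger instead.

```r
# Multiclass Brier score: mean over observations of the squared distance
# between predicted class probabilities and the one-hot encoded true class.
# `prob` is a numeric matrix with one column per class (named by class label).
multiclass_brier <- function(prob, truth) {
  stopifnot(is.factor(truth), nrow(prob) == length(truth))
  # One-hot encode the truth, with columns in the same order as `prob`
  onehot <- outer(as.character(truth), colnames(prob), "==") * 1
  mean(rowSums((prob - onehot)^2))
}

truth <- factor(c("a", "b", "c"), levels = c("a", "b", "c"))

# Perfect predictions: probability 1 on the true class
perfect <- diag(3)
colnames(perfect) <- levels(truth)

# Uninformative predictions: probability 1/3 on every class
uniform <- matrix(1 / 3, nrow = 3, ncol = 3,
                  dimnames = list(NULL, levels(truth)))

multiclass_brier(perfect, truth)  # 0
multiclass_brier(uniform, truth)  # 2/3
```

Lower values are better, so a tuner minimizes this score.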

The package is mainly based on ranger, mlrMBO and mlr.

The package is also described in an arXiv paper: https://arxiv.org/abs/1804.03515

Benchmark

You can see a benchmark for classification in the paper.

Moreover, for regression I compared three tuning implementations (tuneRanger, autoxgboost and liquidSVM), each in its default mode, and the ranger defaults on 29 regression tasks. The results of the 5-fold cross-validation show the competitiveness of tuneRanger and can be seen in the following graphs:

[Figures: R-squared, Spearman's rho, and training time]
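
For readers unfamiliar with the two accuracy measures in the graphs, the following base R sketch shows how they can be computed from true and predicted values. It assumes the standard definitions (R-squared as one minus the ratio of residual to total sum of squares, Spearman's rho as the rank correlation); the exact implementation used in the benchmark is not shown here, and the helper names and toy data are hypothetical.

```r
# R-squared: 1 minus residual sum of squares over total sum of squares
r_squared <- function(truth, pred) {
  1 - sum((truth - pred)^2) / sum((truth - mean(truth))^2)
}

# Spearman's rho: correlation between the ranks of truth and prediction
spearman_rho <- function(truth, pred) {
  cor(truth, pred, method = "spearman")
}

# Toy values for illustration only
truth <- c(1, 2, 3, 4, 5)
pred  <- c(1.1, 1.9, 3.2, 3.9, 5.3)

r_squared(truth, pred)     # 0.984
spearman_rho(truth, pred)  # 1, because the ordering is preserved exactly
```

Spearman's rho only rewards getting the ordering right, while R-squared also penalizes errors in the predicted magnitudes, so the two graphs can rank methods differently.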

A disadvantage is the long runtime (e.g. compared to liquidSVM); improvements could be made on this issue.

Code for the two benchmarks is available here and here.

Usage

Quickstart:

library(tuneRanger)
library(mlr)

# An mlr task has to be created in order to use the package
# We make an mlr task with the iris dataset here
# (classification task with makeClassifTask, regression task with makeRegrTask)
iris.task = makeClassifTask(data = iris, target = "Species")

# Rough Estimation of the Tuning time
estimateTimeTuneRanger(iris.task)

# Tuning process (takes around 1 minute); Tuning measure is the multiclass brier score
res = tuneRanger(iris.task, measure = list(multiclass.brier), num.trees = 1000, 
             num.threads = 2, iters = 70)

# Mean of best 5 % of the results
res
# Model with the new tuned hyperparameters
res$model

# Restart after failing in one of the iterations:
res = restartTuneRanger("./optpath.RData", iris.task, measure = list(multiclass.brier))
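
The quickstart covers classification. A regression run might look like the following sketch, which assumes the same tuneRanger arguments as above together with mlr's rmse measure, and assumes that the result list exposes recommended.pars alongside model. The mtcars data, the object names and the reduced number of iterations are choices made for this illustration only.

```r
library(tuneRanger)
library(mlr)

# Regression task on the built-in mtcars data; "mpg" is the numeric target
mtcars.task = makeRegrTask(data = mtcars, target = "mpg")

# Tune mtry, min.node.size and sample.fraction with RMSE as the measure;
# fewer iterations than in the quickstart, only to keep the example short
res.regr = tuneRanger(mtcars.task, measure = list(rmse), num.trees = 500,
                      num.threads = 2, iters = 20)

# Recommended hyperparameter values
res.regr$recommended.pars

# Predict with the final model trained on the tuned hyperparameters
pred = predict(res.regr$model, newdata = mtcars[1:5, ])
pred$data
```

Since the returned model is an mlr model, prediction goes through mlr's predict method with newdata rather than through ranger's own predict interface.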

How to cite

Please cite the paper if you use the package:

@ARTICLE{tuneRanger,
  author = {Probst, Philipp and Wright, Marvin and Boulesteix, Anne-Laure}, 
  title = {Hyperparameters and Tuning Strategies for Random Forest},
  journal = {ArXiv preprint arXiv:1804.03515},
  archivePrefix = "arXiv",
  eprint = {1804.03515},
  primaryClass = "stat.ML",
  keywords = {Statistics - Machine Learning, Computer Science - Learning},
  year = 2018,
  url = {https://arxiv.org/abs/1804.03515}
}

Functions in tuneRanger

Name Description
estimateTimeTuneRanger estimateTimeTuneRanger
restartTuneRanger restartTuneRanger
tuneRanger tuneRanger
tuneMtryFast tuneMtryFast

Details

Type Package
License GPL-3
Encoding UTF-8
LazyData yes
ByteCompile yes
Date 2019-04-16
RoxygenNote 6.1.1
NeedsCompilation no
Packaged 2019-04-16 14:52:16 UTC; probst
Repository CRAN
Date/Publication 2019-04-16 16:02:58 UTC
