Runs a benchmark on arbitrary combinations of tasks (Task), learners (Learner), and resampling strategies (Resampling), possibly in parallel.
benchmark(design, store_models = FALSE)
design
(data.frame())
Data frame (or data.table::data.table()) with three columns: "task", "learner", and "resampling".
Each row defines a resampling by providing a Task, a Learner, and an instantiated Resampling strategy.
The helper function benchmark_grid() can generate an exhaustive design (see the examples)
and instantiate the Resamplings per Task.
store_models
(logical(1))
Keep the fitted model after the test set has been predicted?
Set to TRUE if you want to further analyse the models or to extract information
such as variable importance.
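For example, a minimal sketch of inspecting a stored model afterwards, assuming a design as constructed in the examples below:
bmr = benchmark(design, store_models = TRUE)
rr = bmr$aggregate()$resample_result[[1]]
# fitted model of the first resampling iteration
rr$learners[[1]]$model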
This function can be parallelized with the future package.
One job is one resampling iteration, and all jobs are sent to an apply function
from future.apply in a single batch.
To select a parallel backend, use future::plan().
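For example, a sketch of running the resampling iterations in separate local R sessions via the multisession backend:
future::plan("multisession")
bmr = benchmark(design)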
This function supports progress bars via the package progressr.
Simply wrap the function in progressr::with_progress() to enable them.
We recommend using the package progress as the backend; enable it with
progressr::handlers("progress").
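Putting both steps together, a minimal sketch:
progressr::handlers("progress")
progressr::with_progress(bmr <- benchmark(design))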
mlr3 uses the lgr package for logging.
lgr supports multiple log levels, which can be queried with getOption("lgr.log_levels").
To suppress output and reduce verbosity, you can lower the log level from the
default "info" to "warn":
lgr::get_logger("mlr3")$set_threshold("warn")
To get additional log output for debugging, increase the log level to "debug" or "trace":
lgr::get_logger("mlr3")$set_threshold("debug")
To log to a file or a database, see the documentation of lgr::lgr-package.
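For example, a sketch of additionally writing mlr3 log messages to a file with lgr's AppenderFile (the path "mlr3.log" is a placeholder):
logger = lgr::get_logger("mlr3")
logger$add_appender(lgr::AppenderFile$new("mlr3.log"))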
# benchmarking with benchmark_grid()
tasks = lapply(c("iris", "sonar"), tsk)
learners = lapply(c("classif.featureless", "classif.rpart"), lrn)
resamplings = rsmp("cv", folds = 3)
design = benchmark_grid(tasks, learners, resamplings)
print(design)
set.seed(123)
bmr = benchmark(design)
## Data of all resamplings
head(as.data.table(bmr))
## Aggregated performance values
aggr = bmr$aggregate()
print(aggr)
## Extract predictions of first resampling result
rr = aggr$resample_result[[1]]
as.data.table(rr$prediction())
# Benchmarking with a custom design:
# - fit classif.featureless on iris with a 3-fold CV
# - fit classif.rpart on sonar using a holdout
tasks = list(tsk("iris"), tsk("sonar"))
learners = list(lrn("classif.featureless"), lrn("classif.rpart"))
resamplings = list(rsmp("cv", folds = 3), rsmp("holdout"))
design = data.table::data.table(
  task = tasks,
  learner = learners,
  resampling = resamplings
)
## Instantiate resamplings
design$resampling = Map(
  function(task, resampling) resampling$clone()$instantiate(task),
  task = design$task, resampling = design$resampling
)
## Run benchmark
bmr = benchmark(design)
print(bmr)
## Get the training set of the 2nd iteration of the featureless learner on iris
rr = bmr$aggregate()[learner_id == "classif.featureless"]$resample_result[[1]]
rr$resampling$train_set(2)