benchmark_grid: Generate a Benchmark Grid Design

Description

Takes a list of Task, a list of Learner, and a list of Resampling to generate a design in an expand.grid() fashion (a.k.a. cross join or Cartesian product).

There are two modes of operation, depending on the flag paired.

With paired set to FALSE (default), the resamplings must not be instantiated; they are instantiated per task internally instead. The only exception applies if all tasks have exactly the same number of rows and all resamplings are instantiated for tasks of that size. The grid is generated from the Cartesian product of tasks, learners, and resamplings.
With paired set to TRUE, tasks and resamplings are treated as pairs, i.e., you must provide exactly as many tasks as instantiated resamplings, with each resampling instantiated on its corresponding task. The grid is generated from the Cartesian product of learners and pairs.
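
As a rough illustration of how the two modes differ in the size of the resulting design, the following sketch builds both grids and counts their rows. It assumes the mlr3 package is attached; the variable names pairs and paired_grid are only illustrative, and the choice of tasks and learners is arbitrary.

```r
library(mlr3)

tasks = list(tsk("penguins"), tsk("sonar"))
learners = list(lrn("classif.featureless"), lrn("classif.rpart"))

# paired = FALSE: uninstantiated resamplings, full Cartesian product
# of 2 tasks x 2 learners x 2 resamplings
resamplings = list(rsmp("cv"), rsmp("holdout"))
grid = benchmark_grid(tasks, learners, resamplings)
nrow(grid)

# paired = TRUE: the i-th resampling is instantiated on the i-th task,
# giving 2 (task, resampling) pairs x 2 learners
pairs = list(rsmp("holdout"), rsmp("holdout"))
pairs[[1]]$instantiate(tasks[[1]])
pairs[[2]]$instantiate(tasks[[2]])
paired_grid = benchmark_grid(tasks, learners, pairs, paired = TRUE)
nrow(paired_grid)
```

In the paired case the number of rows grows only with the number of learners, since each task is tied to exactly one resampling.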

Usage

benchmark_grid(tasks, learners, resamplings, paired = FALSE)

Value

(data.table::data.table()) with the cross product of the input vectors.

Arguments

tasks: (list of Task).
learners: (list of Learner).
resamplings: (list of Resampling).
paired: (logical(1))
Set this to TRUE if the resamplings are instantiated on the tasks, i.e., the tasks and resamplings are paired. You need to provide the same number of tasks and instantiated resamplings.

Examples

library(mlr3)

tasks = list(tsk("penguins"), tsk("sonar"))
learners = list(lrn("classif.featureless"), lrn("classif.rpart"))
resamplings = list(rsmp("cv"), rsmp("subsampling"))

grid = benchmark_grid(tasks, learners, resamplings)
print(grid)
if (FALSE) {
  # not run: executes the full benchmark
  benchmark(grid)
}

# paired
learner = lrn("classif.rpart")
task1 = tsk("penguins")
task2 = tsk("german_credit")
res1 = rsmp("holdout")
res2 = rsmp("holdout")
res1$instantiate(task1)
res2$instantiate(task2)
design = benchmark_grid(list(task1, task2), learner, list(res1, res2), paired = TRUE)
print(design)

# manual construction of the grid with data.table::CJ()
grid = data.table::CJ(task = tasks, learner = learners,
  resampling = resamplings, sorted = FALSE)

# manual instantiation (not suited for a fair comparison of learners!)
Map(function(task, resampling) {
  resampling$instantiate(task)
}, task = grid$task, resampling = grid$resampling)
if (FALSE) {
  # not run: executes the full benchmark
  benchmark(grid)
}
