mlexperiments (version 0.0.5)

MLTuneParameters: R6 Class to perform hyperparameter tuning experiments

Description

The MLTuneParameters class is used to construct a parameter tuner object and to perform the tuning of a set of hyperparameters for a specified machine learning algorithm using either grid search or Bayesian optimization.

Super classes

mlexperiments::MLBase -> mlexperiments::MLExperimentsBase -> MLTuneParameters

Public fields

parameter_bounds

A named list of tuples to define the parameter bounds of the Bayesian hyperparameter optimization. For further details please see the documentation of the ParBayesianOptimization package.
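
For example, for a k-nearest-neighbors learner, a lower and an upper bound on the number of neighbors k can be defined as follows (a minimal sketch, mirroring the examples below; the field is only evaluated for strategy = "bayesian"):

# each list element is a length-2 vector: c(lower, upper)
tuner$parameter_bounds <- list(k = c(2L, 80L))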

parameter_grid

A matrix with named columns in which each column represents a parameter that should be optimized and each row represents a specific hyperparameter setting that should be tested throughout the procedure. For strategy = "grid", each row of the parameter_grid is considered as a setting that is evaluated. For strategy = "bayesian", the parameter_grid is passed further on to the initGrid argument of the function ParBayesianOptimization::bayesOpt() in order to initialize the Bayesian process. The maximum number of rows considered for initializing the Bayesian process can be specified with the R option "mlexperiments.bayesian.max_init" (see options()), which is set to 50L by default.
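
A grid is typically constructed with expand.grid(), where the column names must match the tunable parameters of the learner (a minimal sketch, mirroring the k-nearest-neighbors example below):

# one column per tuned parameter, one row per candidate setting
tuner$parameter_grid <- expand.grid(
  k = seq(4, 68, 8),
  l = 0
)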

optim_args

A named list of arguments that is passed further on to ParBayesianOptimization::bayesOpt() to control the Bayesian hyperparameter optimization (e.g., iters.n, kappa, acq). For further details please see the documentation of the ParBayesianOptimization package.
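
For instance, the number of optimization iterations and the acquisition function of ParBayesianOptimization::bayesOpt() can be configured like this (a minimal sketch, mirroring the example below):

tuner$optim_args <- list(
  iters.n = 4,   # optimization iterations after the initialization
  kappa = 3.5,   # exploration/exploitation trade-off for "ucb"
  acq = "ucb"    # acquisition function
)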

split_type

A character. The splitting strategy to construct the k cross-validation folds. This parameter is passed further on to the function splitTools::create_folds() and defaults to "stratified".

split_vector

A vector. If a criterion other than the provided y should be used to generate the cross-validation folds, it can be defined here. Importantly, the vector provided here must be of the same length as x.
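
For example, to stratify the folds by a grouping variable other than y (a minimal sketch; strata is a hypothetical vector that must have the same length as x):

tuner$split_type <- "stratified"
# strata: hypothetical vector, e.g. a factor encoding subgroups
tuner$split_vector <- strata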

Methods

Inherited methods

mlexperiments::MLExperimentsBase$set_data()

Method new()

Create a new MLTuneParameters object.

Usage

MLTuneParameters$new(
  learner,
  seed,
  strategy = c("grid", "bayesian"),
  ncores = -1L
)

Arguments

learner

An initialized learner object that inherits from class "MLLearnerBase".

seed

An integer. Needs to be set for reproducibility purposes.

strategy

A character. The strategy to optimize the hyperparameters (either "grid" or "bayesian").

ncores

An integer to specify the number of cores used for parallelization (default: -1L).

Details

For strategy = "bayesian", the number of starting iterations can be set using the R option "mlexperiments.bayesian.max_init", which defaults to 50L. This option reduces the provided initialization grid to contain at most the specified number of rows. This initialization grid is then further passed on to the initGrid argument of ParBayesianOptimization::bayesOpt.
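
For example, to restrict the initialization of the Bayesian process to at most 10 rows of the provided parameter_grid (a minimal sketch; the option must be set before calling $execute()):

options("mlexperiments.bayesian.max_init" = 10L)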

Returns

A new MLTuneParameters R6 object.

Examples

MLTuneParameters$new(
  learner = LearnerKnn$new(),
  seed = 123,
  strategy = "grid",
  ncores = 2
)


Method execute()

Execute the hyperparameter tuning.

Usage

MLTuneParameters$execute(k)

Arguments

k

An integer to define the number of cross-validation folds used to tune the hyperparameters.

Details

All results of the hyperparameter tuning are saved in the field $results of the MLTuneParameters class. After successful execution of the parameter tuning, $results contains a list with the items

"summary"

A data.table with the summarized results (same as the returned value of the execute method).

"best.setting"

The best setting (according to the learner's parameter metric_optimization_higher_better) identified during the hyperparameter tuning.

"bayesOpt"

The returned value of ParBayesianOptimization::bayesOpt() (only for strategy = "bayesian").
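
After a successful run, these items can be accessed via the $results field, for example (a minimal sketch; assumes tuner is an MLTuneParameters object on which $execute() has been called):

tuner$results$summary       # summarized results as a data.table
tuner$results$best.setting  # best hyperparameter setting identified
tuner$results$bayesOpt      # only present for strategy = "bayesian"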

Returns

A data.table with the results of the hyperparameter optimization. The optimized metric, i.e. the cross-validated evaluation metric, is given in the column metric_optim_mean. Further results are accessible from the field $results of the MLTuneParameters class.

Examples

dataset <- do.call(
  cbind,
  c(
    sapply(
      paste0("col", 1:6),
      function(x) rnorm(n = 500),
      USE.NAMES = TRUE,
      simplify = FALSE
    ),
    list(target = sample(0:1, 500, TRUE))
  )
)
tuner <- MLTuneParameters$new(
  learner = LearnerKnn$new(),
  seed = 123,
  strategy = "grid",
  ncores = 2
)
tuner$parameter_bounds <- list(k = c(2L, 80L))
tuner$parameter_grid <- expand.grid(
  k = seq(4, 68, 8),
  l = 0,
  test = parse(text = "fold_test$x")
)
tuner$split_type <- "stratified"
tuner$optim_args <- list(
  iters.n = 4,
  kappa = 3.5,
  acq = "ucb"
)

# set data
tuner$set_data(
  x = data.matrix(dataset[, -7]),
  y = dataset[, 7]
)

tuner$execute(k = 3)


Method clone()

The objects of this class are cloneable with this method.

Usage

MLTuneParameters$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.

Details

The hyperparameter tuning can be performed with a grid search or a Bayesian optimization. In both cases, each hyperparameter setting is evaluated in a k-fold cross-validation on the dataset specified.
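
As a sketch of the Bayesian variant (assuming the same LearnerKnn setup and dataset as in the examples below; only the strategy and the Bayesian-specific fields change, and, depending on the learner, further grid columns such as the test expression in the grid-search example may be required):

tuner_bayesian <- MLTuneParameters$new(
  learner = LearnerKnn$new(),
  seed = 123,
  strategy = "bayesian",
  ncores = 2
)
tuner_bayesian$parameter_grid <- expand.grid(k = seq(4, 68, 8), l = 0)  # initGrid
tuner_bayesian$parameter_bounds <- list(k = c(2L, 80L))
tuner_bayesian$optim_args <- list(iters.n = 4, kappa = 3.5, acq = "ucb")
tuner_bayesian$split_type <- "stratified"
tuner_bayesian$set_data(
  x = data.matrix(dataset[, -7]),
  y = dataset[, 7]
)
tuner_bayesian$execute(k = 3)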

See Also

ParBayesianOptimization::bayesOpt(), splitTools::create_folds()

Examples

knn_tuner <- MLTuneParameters$new(
  learner = LearnerKnn$new(),
  seed = 123,
  strategy = "grid",
  ncores = 2
)


## ------------------------------------------------
## Method `MLTuneParameters$new`
## ------------------------------------------------

MLTuneParameters$new(
  learner = LearnerKnn$new(),
  seed = 123,
  strategy = "grid",
  ncores = 2
)


## ------------------------------------------------
## Method `MLTuneParameters$execute`
## ------------------------------------------------

dataset <- do.call(
  cbind,
  c(
    sapply(
      paste0("col", 1:6),
      function(x) rnorm(n = 500),
      USE.NAMES = TRUE,
      simplify = FALSE
    ),
    list(target = sample(0:1, 500, TRUE))
  )
)
tuner <- MLTuneParameters$new(
  learner = LearnerKnn$new(),
  seed = 123,
  strategy = "grid",
  ncores = 2
)
tuner$parameter_bounds <- list(k = c(2L, 80L))
tuner$parameter_grid <- expand.grid(
  k = seq(4, 68, 8),
  l = 0,
  test = parse(text = "fold_test$x")
)
tuner$split_type <- "stratified"
tuner$optim_args <- list(
  iters.n = 4,
  kappa = 3.5,
  acq = "ucb"
)

# set data
tuner$set_data(
  x = data.matrix(dataset[, -7]),
  y = dataset[, 7]
)

tuner$execute(k = 3)
