mlr (version 2.10)

TuneControl: Create control structures for tuning.

Description

The following tuners are available:
makeTuneControlGrid
Grid search. All kinds of parameter types can be handled. You can either use their correct param type and resolution, or discretize them yourself by always using makeDiscreteParam in the par.set passed to tuneParams.
makeTuneControlRandom
Random search. All kinds of parameter types can be handled.
makeTuneControlDesign
Completely pre-specify a data.frame of design points to be evaluated during tuning. All kinds of parameter types can be handled.
makeTuneControlCMAES
CMA Evolution Strategy with method cma_es. Can handle numeric(vector) and integer(vector) hyperparameters, but no dependencies. For integers the internally proposed numeric values are automatically rounded. The sigma variance parameter is initialized to 1/4 of the span of box-constraints per parameter dimension.
makeTuneControlGenSA
Generalized simulated annealing with method GenSA. Can handle numeric(vector) and integer(vector) hyperparameters, but no dependencies. For integers the internally proposed numeric values are automatically rounded.
makeTuneControlIrace
Iterated F-racing with method irace. All kinds of parameter types can be handled. The result is the best of the final elite candidates found by irace in the last race. Its estimated performance is the mean of all evaluations ever done for that candidate. More information on irace can be found in the technical report at http://iridia.ulb.ac.be/IridiaTrSeries/link/IridiaTr2011-004.pdf.
Some notes on irace: For resampling you have to pass a ResampleDesc, not a ResampleInstance. The resampling strategy is randomly instantiated n.instances times and these are the instances in the sense of irace (instances element of tunerConfig in irace). Also note that irace will always store its tuning results in a file on disk, see the package documentation for details on this and how to change the file path.
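
As an illustration of how a control object is handed to tuneParams, the following minimal sketch compares grid search and random search. The learner (classif.rpart), the built-in iris.task, the parameter ranges and the 3-fold cross-validation are assumptions chosen only to keep the example small; they are not prescribed by this page.

```r
library(mlr)

set.seed(1)

# Search space for rpart; the ranges are illustrative assumptions.
ps <- makeParamSet(
  makeNumericParam("cp", lower = 0.001, upper = 0.1),
  makeIntegerParam("minsplit", lower = 5L, upper = 30L)
)

lrn <- makeLearner("classif.rpart")
rdesc <- makeResampleDesc("CV", iters = 3L)

# Grid search: 3 values per parameter, so 3 * 3 = 9 configurations.
ctrl_grid <- makeTuneControlGrid(resolution = 3L)
res_grid <- tuneParams(lrn, task = iris.task, resampling = rdesc,
  par.set = ps, control = ctrl_grid, show.info = FALSE)

# Random search: 10 randomly sampled configurations.
ctrl_random <- makeTuneControlRandom(maxit = 10L)
res_random <- tuneParams(lrn, task = iris.task, resampling = rdesc,
  par.set = ps, control = ctrl_random, show.info = FALSE)

# Best setting found and its aggregated performance.
res_grid$x
res_grid$y
```

Both results carry the full optimization path in their opt.path element, which can be converted with as.data.frame to inspect every evaluated configuration.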

Usage

makeTuneControlCMAES(same.resampling.instance = TRUE, impute.val = NULL,
  start = NULL, tune.threshold = FALSE, tune.threshold.args = list(),
  log.fun = NULL, final.dw.perc = NULL, budget = NULL, ...)

makeTuneControlDesign(same.resampling.instance = TRUE, impute.val = NULL,
  design = NULL, tune.threshold = FALSE, tune.threshold.args = list(),
  log.fun = NULL, budget = NULL)

makeTuneControlGenSA(same.resampling.instance = TRUE, impute.val = NULL,
  start = NULL, tune.threshold = FALSE, tune.threshold.args = list(),
  log.fun = NULL, final.dw.perc = NULL, budget = NULL, ...)

makeTuneControlGrid(same.resampling.instance = TRUE, impute.val = NULL,
  resolution = 10L, tune.threshold = FALSE, tune.threshold.args = list(),
  log.fun = NULL, final.dw.perc = NULL, budget = NULL)

makeTuneControlIrace(impute.val = NULL, n.instances = 100L,
  show.irace.output = FALSE, tune.threshold = FALSE,
  tune.threshold.args = list(), log.fun = NULL, final.dw.perc = NULL,
  budget = NULL, ...)

makeTuneControlRandom(same.resampling.instance = TRUE, maxit = NULL,
  tune.threshold = FALSE, tune.threshold.args = list(), log.fun = NULL,
  final.dw.perc = NULL, budget = NULL)

Arguments

same.resampling.instance
[logical(1)] Should the same resampling instance be used for all evaluations to reduce variance? Default is TRUE.
impute.val
[numeric] If something goes wrong during optimization (e.g. the learner crashes), this value is fed back to the tuner, so the tuning algorithm does not abort. It is not stored in the optimization path; an NA and a corresponding error message are logged instead. Note that this value is later multiplied by -1 for maximization measures internally, so you need to enter a larger positive value for maximization here as well. Default is the worst obtainable value of the performance measure you optimize for when you aggregate by mean value, or Inf otherwise. For multi-criteria optimization pass a vector of imputation values, one for each of your measures, in the same order as your measures.
start
[list] Named list of initial parameter values.
tune.threshold
[logical(1)] Should the threshold be tuned for the measure at hand, after each hyperparameter evaluation, via tuneThreshold? Only works for classification if the predict type is “prob”. Default is FALSE.
tune.threshold.args
[list] Further arguments for threshold tuning that are passed down to tuneThreshold. Default is none.
log.fun
[function | NULL] Function used for logging. If set to NULL, the internal default will be used. Otherwise a function with arguments learner, resampling, measures, par.set, control, opt.path, dob, x, y, remove.nas, and stage is expected. The default displays the performance measures, the time needed for evaluating, the currently used memory and the max memory ever used before (the latter two both taken from gc). See the implementation for details.
final.dw.perc
[numeric(1) | NULL] If a Learner wrapped by a makeDownsampleWrapper is used, you can define the value of dw.perc which is used to train the Learner with the final parameter setting found by the tuning. Default is NULL, which leaves dw.perc unchanged.
budget
[integer(1)] Maximum budget for tuning. This value restricts the number of function evaluations. For makeTuneControlGrid this number must be identical to the size of the grid. For makeTuneControlRandom the budget equals the number of iterations (maxit) performed by the random search algorithm. For cma_es the budget corresponds to the product of the number of generations (maxit) and the number of offspring per generation (lambda). GenSA defines the budget via the argument max.call. Note, however, that GenSA does not interrupt a running local search, so it may exceed the defined budget; in that case a warning is issued. In irace, budget is passed to maxExperiments.
...
[any] Further control parameters passed to the control arguments of cma_es or GenSA, or to the tunerConfig argument of irace.
design
[data.frame] data.frame containing the different parameter settings to be evaluated. The columns have to be named according to the ParamSet which will be used in tuneParams. Proper designs can be created with generateDesign, for instance.
resolution
[integer] Resolution of the grid for each numeric/integer parameter in par.set. For vector parameters, it is the resolution per dimension. Either pass one resolution for all parameters, or a named vector. See generateGridDesign. Default is 10.
n.instances
[integer(1)] Number of random resampling instances for irace, see details. Default is 100.
show.irace.output
[logical(1)] Show console output of irace while tuning? Default is FALSE.
maxit
[integer(1) | NULL] Number of iterations for random search. Default is NULL, in which case the value of budget is used, or 100 if budget is also NULL.
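
The design and budget arguments described above can be sketched as follows. This is a minimal example under assumptions: the learner (classif.rpart), the built-in iris.task, the hold-out resampling and the hand-written design values are illustrative choices. The design is built by hand here to avoid extra dependencies; its columns are named after the parameters and kept in the same order as in the ParamSet.

```r
library(mlr)

set.seed(1)

ps <- makeParamSet(
  makeNumericParam("cp", lower = 0.001, upper = 0.1),
  makeIntegerParam("minsplit", lower = 5L, upper = 30L)
)

lrn <- makeLearner("classif.rpart")
rdesc <- makeResampleDesc("Holdout")

# Hand-written design: one row per configuration to evaluate
# (values are illustrative assumptions).
des <- data.frame(
  cp = c(0.001, 0.01, 0.05, 0.1),
  minsplit = c(5L, 10L, 20L, 30L)
)

# Every row of the design is evaluated exactly once.
ctrl_design <- makeTuneControlDesign(design = des)
res_design <- tuneParams(lrn, task = iris.task, resampling = rdesc,
  par.set = ps, control = ctrl_design, show.info = FALSE)

# For random search, budget plays the role of maxit:
# 5 evaluations are performed here.
ctrl_budget <- makeTuneControlRandom(budget = 5L)
res_budget <- tuneParams(lrn, task = iris.task, resampling = rdesc,
  par.set = ps, control = ctrl_budget, show.info = FALSE)

# One row per evaluated configuration.
as.data.frame(res_design$opt.path)
```

For larger search spaces, a space-filling design from generateDesign is usually preferable to a hand-written data.frame.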

Value

[TuneControl]. The specific subclass is one of TuneControlGrid, TuneControlRandom, TuneControlDesign, TuneControlCMAES, TuneControlGenSA, TuneControlIrace.
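
A short sketch of how the returned object can be inspected: each constructor yields an object that inherits from TuneControl and carries its own subclass. The resolution value below is an arbitrary assumption.

```r
library(mlr)

ctrl_grid <- makeTuneControlGrid(resolution = 5L)
ctrl_random <- makeTuneControlRandom(maxit = 20L)

# Full class vector of the control object.
class(ctrl_grid)

# All control objects share the TuneControl parent class ...
is_tune_control <- inherits(ctrl_grid, "TuneControl") &&
  inherits(ctrl_random, "TuneControl")

# ... and differ in their specific subclass.
is_grid <- inherits(ctrl_grid, "TuneControlGrid")
is_random <- inherits(ctrl_random, "TuneControlRandom")
```

Checking the subclass this way can be useful in helper functions that accept any TuneControl but need tuner-specific handling.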

See Also

Other tune: getNestedTuneResultsOptPathDf, getNestedTuneResultsX, getTuneResult, makeModelMultiplexerParamSet, makeModelMultiplexer, makeTuneWrapper, tuneParams, tuneThreshold