This function is deprecated. Please use bayesOpt instead.
BayesianOptimization(
FUN,
bounds,
saveIntermediate = NULL,
leftOff = NULL,
parallel = FALSE,
packages = NULL,
export = NULL,
initialize = TRUE,
initGrid = NULL,
initPoints = 0,
bulkNew = 1,
nIters = 0,
kern = "Matern52",
beta,
acq = "ucb",
stopImpatient = list(newAcq = "ucb", rounds = Inf),
kappa = 2.576,
eps = 0,
gsPoints = 100,
convThresh = 1e+07,
minClusterUtility = NULL,
noiseAdd = 0.25,
plotProgress = TRUE,
verbose = 1
)
FUN: The function to be maximized. This function should return a named list with at least one component. The first component must be named Score and should contain the metric to be maximized. You may return other named scalar elements that you wish to include in the final summary table.
bounds: A named list of lower and upper bounds for each hyperparameter. The names of the list should be arguments passed to FUN. Use the "L" suffix to indicate integer hyperparameters.
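As a minimal sketch of the contract described for FUN and bounds, the following defines a toy scoring function and a matching bounds list. The names scoreFun, x, depth and absError are hypothetical and chosen only for illustration.

```r
# Hypothetical scoring function: the first element of the returned list
# must be named Score; further named scalars are optional extras.
scoreFun <- function(x, depth) {
  score <- -(x - 3)^2 - abs(depth - 4)   # toy objective, maximal at x = 3, depth = 4
  list(Score = score, absError = abs(x - 3))
}

# The names of bounds match the arguments of scoreFun.
# The "L" suffix makes depth an integer hyperparameter.
bounds <- list(
  x     = c(0, 10),
  depth = c(1L, 8L)
)

res <- scoreFun(x = 3, depth = 4L)
```

Here res is a named list whose first element is Score, which is the shape the optimizer expects back from every evaluation.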
saveIntermediate: Character file path (including file name) that specifies where to save intermediate results. The ScoreDT data.table is saved as an RDS file after every iteration, and its contents can be supplied as the leftOff parameter so that you can continue a process where you left off.
leftOff: A data.table containing parameter-Score pairs. If supplied, the process will rbind this table to the parameter-Score pairs obtained through initialization. This table should be obtained from either the file saved by saveIntermediate or the ScoreDT data.table returned by this function. WARNING: any parameter sets not within bounds will be removed before optimization takes place.
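The save-and-resume workflow can be sketched in base R as follows. This is an illustration only, not the package's own code: a plain data.frame stands in for the ScoreDT data.table, the column names are made up, and the filtering step merely mimics the behaviour the warning describes.

```r
# Stand-in for a previously saved ScoreDT (column names are illustrative).
scoreDT <- data.frame(
  x     = c(2.5, 7.0, 12.0),   # 12.0 lies outside the bounds below
  Score = c(-0.25, -16, -81)
)
bounds <- list(x = c(0, 10))

# Round trip through an RDS file, as saveIntermediate would produce.
path <- tempfile(fileext = ".rds")
saveRDS(scoreDT, path)
leftOff <- readRDS(path)

# Keep only rows whose parameters fall within every bound.
inBounds <- rep(TRUE, nrow(leftOff))
for (p in names(bounds)) {
  inBounds <- inBounds &
    leftOff[[p]] >= bounds[[p]][1] &
    leftOff[[p]] <= bounds[[p]][2]
}
leftOff <- leftOff[inBounds, , drop = FALSE]
```

After filtering, only the two rows with x inside [0, 10] remain, so narrowing the bounds between runs silently discards earlier results.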
parallel: Should the process run in parallel? If TRUE, several criteria must be met:
  A parallel backend must be registered.
  FUN must be executable using only the packages specified in packages (and base packages).
  FUN must be executable using only the objects specified in export.
  The function must be thread safe.
packages: Character vector of the packages needed to run FUN.
export: Character vector of object names needed to evaluate FUN.
initialize: Should the process initialize a parameter-Score pair set? If FALSE, leftOff must be provided.
initGrid: User-specified points at which to sample the scoring function. Should be a data.frame or data.table with the same column names as bounds.
initPoints: Number of points to initialize the process with. Points are chosen with Latin hypercube sampling within the bounds supplied.
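To illustrate what Latin hypercube sampling within bounds means, here is a small base R sketch. The helper lhsSample is hypothetical and is not the sampler the package uses; it also ignores integer hyperparameters.

```r
# Latin hypercube sample: each dimension is cut into n equal strata and
# every stratum is used exactly once, in an independently shuffled order.
lhsSample <- function(n, bounds) {
  cols <- lapply(bounds, function(b) {
    u <- (sample(n) - runif(n)) / n   # one uniform draw per stratum, in (0, 1)
    b[1] + u * (b[2] - b[1])          # rescale to the interval (lower, upper)
  })
  as.data.frame(cols)
}

set.seed(1)
bounds <- list(x = c(0, 1), y = c(-5, 5))
pts <- lhsSample(4, bounds)
```

With n = 4, each quarter of every parameter's range receives exactly one point, which spreads the initial evaluations more evenly than independent uniform draws would.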
bulkNew: Integer that specifies the number of parameter combinations to sample at each optimization step. If minClusterUtility is NULL, noise is added to the acquisition optimum to obtain the other sampling points. If running in parallel, it is good practice to set bulkNew to some multiple of the number of cores you have designated for this process.
nIters: Total number of parameter sets to be sampled, including the initial set.
kern: A character string that gets mapped to one of GauPro's GauPro_kernel_beta R6 classes. Determines the covariance function used in the Gaussian process. Can be one of:
  "Gaussian"
  "Exponential"
  "Matern52"
  "Matern32"
beta: Deprecated. The kernel lengthscale parameter log10(theta). Passed to the GauPro_kernel_beta class specified in kern.
acq: Acquisition function type to be used. Can be "ucb", "ei", "eips" or "poi".
  ucb: Upper Confidence Bound
  ei: Expected Improvement
  eips: Expected Improvement Per Second
  poi: Probability of Improvement
stopImpatient: A list containing rounds and newAcq. If acq = "eips", you can switch the acquisition function to newAcq after rounds parameter-score pairs are found.
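The switching rule can be pictured with the following sketch. The helper currentAcq is hypothetical, and whether the package switches exactly at rounds pairs or only after exceeding it is an assumption here.

```r
# Hypothetical helper: which acquisition function applies once nPairs
# parameter-score pairs have been found.
currentAcq <- function(acq, nPairs,
                       stopImpatient = list(newAcq = "ucb", rounds = Inf)) {
  if (acq == "eips" && nPairs >= stopImpatient$rounds) {
    stopImpatient$newAcq   # enough pairs found: stop being impatient
  } else {
    acq                    # keep the original acquisition function
  }
}

before <- currentAcq("eips", nPairs = 5,  stopImpatient = list(newAcq = "ei", rounds = 10))
after  <- currentAcq("eips", nPairs = 10, stopImpatient = list(newAcq = "ei", rounds = 10))
```

With the default rounds = Inf the switch never happens, so eips stays in effect for the whole run.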
kappa: Tunable parameter kappa of the upper confidence bound. Adjusts exploitation/exploration. Increasing kappa increases the importance of uncertainty (unexplored space), thereby incentivizing exploration. This number represents how many standard deviations the upper confidence bound lies above the predicted mean. Default is 2.576, which corresponds to the 99.5th percentile of a standard normal distribution (the bound of a two-sided 99 percent interval).
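The relationship between the default kappa = 2.576 shown in the usage and a normal percentile can be checked directly in base R. The predicted mean and standard deviation below are made-up numbers for illustration.

```r
kappa <- 2.576

coverage <- pnorm(kappa)    # probability mass below mean + kappa * sd
quantile995 <- qnorm(0.995) # quantile that recovers the default

# Upper confidence bound for a point with predicted mean 0.4 and sd 0.1
ucb <- 0.4 + kappa * 0.1
```

A larger kappa widens the gap between the bound and the predicted mean, so points with high uncertainty look more attractive.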
eps: Tunable parameter epsilon of ei, eips and poi. Adjusts exploitation/exploration. This value is added to y_max after the scaling, so it should be between -0.1 and 0.1. Increasing eps raises the "improvement" threshold for new points, thereby reducing exploitation.
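For orientation, the four acquisition types and the roles of kappa and eps can be written in their textbook forms as below. This is a sketch under the usual Gaussian posterior assumptions, and the package's internal scaling and implementation may differ. The argument names mu, sigma, yMax and predSeconds are illustrative.

```r
# mu, sigma: Gaussian process posterior mean and standard deviation at a point
# yMax: best (scaled) score so far; kappa, eps: tuning parameters
ucb <- function(mu, sigma, kappa = 2.576) mu + kappa * sigma

poi <- function(mu, sigma, yMax, eps = 0) {
  pnorm((mu - yMax - eps) / sigma)
}

ei <- function(mu, sigma, yMax, eps = 0) {
  z <- (mu - yMax - eps) / sigma
  (mu - yMax - eps) * pnorm(z) + sigma * dnorm(z)
}

# eips divides expected improvement by the predicted evaluation time
eips <- function(mu, sigma, yMax, predSeconds, eps = 0) {
  ei(mu, sigma, yMax, eps) / predSeconds
}

eiBase   <- ei(mu = 0.5, sigma = 0.2, yMax = 0.5)             # eps = 0
eiRaised <- ei(mu = 0.5, sigma = 0.2, yMax = 0.5, eps = 0.1)  # higher threshold
```

Raising eps lowers the expected improvement at a point whose mean merely matches the current best, so such points become less attractive relative to uncertain ones.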
gsPoints: Integer that specifies how many initial points to try when searching for the optimum of the acquisition function. Increase this for a higher chance of finding the global optimum, at the expense of more time.
convThresh: Convergence threshold passed to factr when the optim function (L-BFGS-B) is called. Lower values will take longer to converge, but may be more accurate.
minClusterUtility: Number between 0 and 1. Represents the minimum fraction of the optimal utility required for a less optimal local maximum to be included as a candidate parameter set in the next call to the scoring function. If NULL, only the global optimum will be used as a candidate parameter set. If 0.5, only local optima with at least 50 percent of the utility of the global optimum will be used.
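The thresholding rule can be sketched as follows. The helper keepCandidates and the utility values are hypothetical, and the package's own clustering of local optima is not reproduced here.

```r
# Hypothetical local optima of the acquisition function.
localOptima <- data.frame(
  x         = c(1.2, 4.7, 8.1),
  gpUtility = c(0.90, 0.50, 0.30)
)

keepCandidates <- function(optima, minClusterUtility = NULL) {
  best <- max(optima$gpUtility)
  if (is.null(minClusterUtility)) {
    # Only the global optimum is kept.
    return(optima[which.max(optima$gpUtility), , drop = FALSE])
  }
  # Keep optima whose utility is at least the given fraction of the best.
  optima[optima$gpUtility >= minClusterUtility * best, , drop = FALSE]
}

onlyBest <- keepCandidates(localOptima)
halfBest <- keepCandidates(localOptima, minClusterUtility = 0.5)
```

With a threshold of 0.5 the cutoff is 0.45, so the optima with utility 0.90 and 0.50 are kept and the one with 0.30 is dropped.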
noiseAdd: Deprecated. Noise is added in increasing amounts until unique parameter sets are found.
plotProgress: Should the progress of the Bayesian optimization be plotted? The top graph shows the score(s) obtained at each iteration. The bottom graph shows the optimal value of the acquisition function at each iteration. This is useful to display how much utility the Gaussian process is actually assuming still exists. If your utility is approaching 0, then you can be confident you are close to an optimal parameter set.
verbose: Whether or not to print progress to the console. If 0, nothing will be printed. If 1, progress will be printed. If 2, progress and information about new parameter-score pairs will be printed.
A list containing details about the process:
The last Gaussian process run on the parameter-score pairs
If acq = "eips", this contains the last Gaussian process run on the parameter-elapsed time pairs
a Plotly chart showing the evolution of the scores and utility discovered during the Bayesian optimization
A list of all parameter-score pairs, as well as extra columns from FUN. gpUtility is the acquisition function value at the time that parameter set was tested. acqOptimum is a boolean column that specifies whether the parameter set was an acquisition function optimum, or if it was obtained by applying noise to another optimum. Elapsed is the amount of time in seconds it took FUN to evaluate that parameter set.
The best parameter set at each iteration
Jasper Snoek, Hugo Larochelle, Ryan P. Adams (2012) Practical Bayesian Optimization of Machine Learning Algorithms