laGP (version 1.2-1)

optim.auglag: Optimize an objective function under multiple blackbox constraints

Description

Uses a surrogate modeled augmented Lagrangian (AL) system to optimize an objective function (blackbox or known and linear) under unknown multiple (blackbox) constraints via expected improvement (EI) and variations; a comparator based on EI with constraints is also provided

Usage

optim.auglag(fn, B, fhat = FALSE, cknown = NULL, start = 10, end = 100,
  Xstart = NULL, sep = FALSE, ab = c(3/2, 4), lambda = 1, rho = 1/2,
  urate = 10, ncandf = function(t) { t }, dg.start = c(0.1, 1e-06),
  dlim = sqrt(ncol(B)) * c(1/100, 10), Bscale = 1, ey.tol = 0.05,
  nomax = FALSE, N = 1000, plotprog = FALSE, verb = 2, ...)

optim.eic(fn, B, fhat = FALSE, cknown = NULL, start = 10, end = 100,
  Xstart = NULL, sep = FALSE, ab = c(3/2, 4), urate = 10,
  ncandf = function(t) { t }, dg.start = c(0.1, 1e-6),
  dlim = sqrt(ncol(B)) * c(1/100, 10), Bscale = 1, plotprog = FALSE,
  verb = 2, ...)

Arguments

fn
function of an input (x), facilitating vectorization on a matrix X thereof, returning a list with elements "obj" containing the (scalar) objective value and "c" containing a vector of evaluations of the (multiple) constraint function at x. The fn function must take a known.only argument which is explained in the note below; it need not act on that argument
B
2-column matrix describing the bounding box. The number of rows of the matrix determines the input dimension (length(x) in fn(x)); the first column gives lower bounds and the second gives upper bounds
fhat
a scalar logical indicating if the objective function should be modeled with a GP surrogate. The default of FALSE assumes a known linear objective scaled by Bscale. Using TRUE is an “alpha” feature at this time
cknown
An optional positive integer vector specifying which of the constraint values returned by fn should be treated as “known”, i.e., not modeled with Gaussian processes
start
positive integer giving the number of random starting locations before sequential design (for optimization) is performed; start >= 6 is recommended unless Xstart is non-NULL; in the current version the starting locations come from a space-filling design via dopt.gp
end
positive integer giving the total number of evaluations/trials in the optimization; must have end > start
Xstart
optional matrix of starting design locations in lieu of, or in addition to, start random ones; we recommend nrow(Xstart) + start >= 6; also must have ncol(Xstart) = nrow(B)
sep
if sep = TRUE then separable GPs (i.e., via newGPsep, etc.) are used to model the constraints. Otherwise the default is to use isotropic ones
ab
prior parameters; see darg describing the prior used on the lengthscale parameter during emulation(s) for the constraints
lambda
m-dimensional initial Lagrange multiplier parameter for m-constraints
rho
positive scalar initial quadratic penalty parameter in the augmented Lagrangian
urate
positive integer indicating how many optimization trials should pass before each MLE/MAP update is performed for GP correlation lengthscale parameter(s)
ncandf
function taking a single integer indicating the optimization trial number t, where start < t <= end, and returning the number of search candidates (e.g., for expected improvement calculations) at round t; the default setting allows the number of candidates to grow linearly with t
dg.start
2-vector giving starting values for the lengthscale and nugget parameters of the GP surrogate model(s) for constraints
dlim
2-vector giving bounds for the lengthscale parameter(s) under MLE/MAP inference, thereby augmenting the prior specification in ab
Bscale
scalar indicating the relationship between the sum of the inputs, sum(x), to fn and the output fn(x)$obj; note that at this time only linear objectives are fully supported by the code; more details below
ey.tol
a scalar proportion indicating how many of the EIs at ncandf(t) candidate locations must be non-zero to “trust” that metric to guide search, reducing to an EY-based search instead [choosing that proportion to be zero forces EY-based search]
nomax
one of c(-1, 0, 1) indicating if the max should be removed from the augmented Lagrangian (AL): not at all (0) or in the evaluation of EI or EY (1). Specifying a negative number (e.g., -1) invokes the slack variable implementation of the AL, which is “alpha” functionality at this time
N
positive scalar integer indicating the number of Monte Carlo samples to be used for calculating EI and EY
plotprog
logical indicating if progress plots should be made after each inner iteration; the plots show three panels tracking the best valid objective, the EI or EY surface over the first two input variables (requires interp), and the lengthscale parameter(s) of the GP(s), respectively. When plotprog = TRUE the interp.loess function is used to aid in creating surface plots; however, this does not work well with fewer than fifteen points. You may also provide a function as an argument, having similar arguments/formals as interp.loess. For example, we use interp below, which would have been the default if not for licensing incompatibilities
verb
positive scalar integer indicating the verbosity level; the larger the value the more that is printed to the screen
...
additional arguments passed to fn

Value

The output is a list summarizing the progress of the evaluations of the blackbox under optimization
prog
vector giving the best valid (c(x) < 0) value of the objective over the trials
obj
vector giving the value of the objective for the input under consideration at each trial
X
matrix giving the input values at which the blackbox function was evaluated
C
matrix giving the value of the constraint function for the input under consideration at each trial (corresponding to X above)
d
matrix of lengthscale values obtained at the final update of the GP emulator for each constraint
df
if fhat = TRUE then this is a matrix of lengthscale values for the objective obtained at the final update of the GP emulator
lambda
a matrix containing lambda vectors used in each outer loop iteration
rho
a vector of rho values used in each outer loop iteration

Details

In its current form, this is a “beta” code illustrating the suite of methods used to optimize two challenging constrained optimization problems from Gramacy, et al. (2015); see references below.

That scheme hybridizes Gaussian process based surrogate modeling and expected improvement (EI; Jones, et al., 1998) with the additive penalty method (APM) implemented by the augmented Lagrangian (AL, e.g., Nocedal & Wright, 2006). The goal is to minimize a known linear objective function f(x) under multiple, unknown (blackbox) constraints encoded by a vector-valued constraint function required to satisfy c(x) <= 0. The solution here emulates the components of c with Gaussian process surrogates, and guides optimization by searching the posterior mean surface, or the EI, of the following composite objective $$ Y(x) = f(x) + \lambda^\top Y_c(x) + \frac{1}{2\rho} \sum_{i=1}^m \max(0, Y_{c_i}(x))^2, $$ where \lambda and \rho follow updating equations that guarantee convergence to a valid solution minimizing the objective. For more details, see Gramacy, et al. (2015).
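For intuition, here is a minimal R sketch (not part of the laGP API; the helper name al.comp and its inputs are purely illustrative) that evaluates this composite for a single candidate, given an objective value fx, a vector of constraint values Cx, and the current lambda and rho:

al.comp <- function(fx, Cx, lambda, rho)
{
  ## f(x) + lambda' Yc(x) + (1/(2*rho)) * sum_i max(0, Yc_i(x))^2
  fx + sum(lambda * Cx) + (1/(2*rho)) * sum(pmax(0, Cx)^2)
}

## e.g., with two constraints, one of which is violated
al.comp(fx=1.2, Cx=c(-0.1, 0.3), lambda=c(1, 1), rho=1/2)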

The nomax argument indicates whether or not the max is present in the AL formula above. Setting nomax > 0 can lead to a more aggressive search near the boundary between feasible and infeasible regions. See Gramacy, et al. (2015) for more details.

The example below illustrates a variation on the toy problem considered in that paper, which bases sampling on EI.

The latest version of this function allows an unknown objective function to be modeled (fhat = TRUE), rather than assuming a known linear one. This is “alpha” functionality at this time. For an example, see demo("ALfhat").
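For instance, a call along the lines of the sketch below (re-using the aimprob test function and bounding box B from the Examples section) would emulate the objective with a GP rather than assuming a known linear one:

## sketch only; see demo("ALfhat") for a complete worked example
out <- optim.auglag(aimprob, B, fhat=TRUE, end=25)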

The optim.eic function is provided as a comparator. This method uses the same underlying GP models with the hybrid EI and probability-of-satisfying-the-constraints heuristic from Schonlau, et al. (1998).
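It takes the same arguments as optim.auglag, minus the AL-specific ones (lambda, rho, ey.tol, nomax and N); e.g., a sketch run on the toy problem from the Examples section would look like:

## sketch only; assumes aimprob and B as defined in the Examples below
out.eic <- optim.eic(aimprob, B, end=25)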

References

Gramacy, R.B., Gray, G.A., Lee, H.K.H., Le Digabel, S., Ranjan, P., Wells, G., and Wild, S.M. (2015). “Modeling an Augmented Lagrangian for Improved Blackbox Constrained Optimization.” Preprint available on arXiv:1403.4890; http://arxiv.org/abs/1403.4890

Jones, D., Schonlau, M., and Welch, W. J. (1998). “Efficient Global Optimization of Expensive Black Box Functions.” Journal of Global Optimization, 13, 455-492.

Schonlau, M., Jones, D.R., and Welch, W. J. (1998). “Global Versus Local Search in Constrained Optimization of Computer Models.” In New Developments and Applications in Experimental Design, vol. 34, 11-25. Institute of Mathematical Statistics.

Nocedal, J. and Wright, S.J. (2006). Numerical Optimization. 2nd ed. Springer.

See Also

vignette("laGP"), demo("ALfhat") for blackbox objective, optim.step.tgp for unconstrained optimization; optim with method="L-BFGS-B" for box constraints, or optim with method="SANN" for simulated annealing

Examples

## this example assumes a known linear objective; further examples
## are in the optim.auglag demo

## a test function returning linear objective evaluations and 
## non-linear constraints
aimprob <- function(X, known.only = FALSE)
{
  if(is.null(nrow(X))) X <- matrix(X, nrow=1)
  f <- rowSums(X)
  if(known.only) return(list(obj=f))
  c1 <- 1.5-X[,1]-2*X[,2]-0.5*sin(2*pi*(X[,1]^2-2*X[,2]))
  c2 <- rowSums(X^2)-1.5
  return(list(obj=f, c=cbind(c1,c2)))
}

## set bounding rectangle for adaptive sampling
B <- matrix(c(rep(0,2),rep(1,2)),ncol=2)

## optimization (primarily) by EI, change 25 to 100 for
## 99% chance of finding the global optimum with value 0.6
library(akima) ## for plotprog=interp
out <- optim.auglag(aimprob, B, ab=c(3/2,8), end=25, plotprog=interp)
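## the returned list tracks progress (see Value above); for example,
## the best valid objective found by each trial can be plotted via
plot(out$prog, type="l", xlab="trial", ylab="best valid objective")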
  

## for comparison, here is a version that uses simulated annealing
## where the Additive Penalty Method (APM) is used to handle
## the constraints
## Not run: 
# aimprob.apm <- function(x, B=matrix(c(rep(0,2),rep(1,2)),ncol=2))
# { 
#   ## check bounding box
#   for(i in 1:length(x)) {
#     if(x[i] < B[i,1] || x[i] > B[i,2]) return(Inf)
#   }
# 
#   ## evaluate objective and constraints
#   f <- sum(x)
#   c1 <- 1.5-x[1]-2*x[2]-0.5*sin(2*pi*(x[1]^2-2*x[2]))
#   c2 <- x[1]^2+x[2]^2-1.5
# 
#   ## return APM composite
#   return(f + abs(c1) + abs(c2))
# }
# 
# ## use SA; specify control=list(maxit=100), say, to control max 
# ## number of iterations; does not easily facilitate plotting progress
# out <- optim(runif(2), aimprob.apm, method="SANN") 
# ## check the final value, which typically does not satisfy both
# ## constraints
# aimprob(out$par)
# ## End(Not run)

## for a version with a modeled objective see demo("ALfhat")
