alcGP: Improvement statistics for sequential or local design

Description

Calculate the active learning Cohn (ALC) statistic, mean-squared predictive error (MSPE) or expected Fisher information (fish) for a Gaussian process (GP) predictor relative to a set of reference locations, towards sequential design or local search for Gaussian process regression

Usage

alcGP(gpi, Xcand, Xref = Xcand, parallel = c("none", "omp", "gpu"), 
      verb = 0)
alcGPsep(gpsepi, Xcand, Xref = Xcand, parallel = c("none", "omp", "gpu"), 
      verb = 0)
alcrayGP(gpi, Xref, Xstart, Xend, verb = 0)
alcrayGPsep(gpsepi, Xref, Xstart, Xend, verb = 0)
ieciGP(gpi, Xcand, fmin, Xref = Xcand, w = NULL, nonug = FALSE, verb = 0)
ieciGPsep(gpsepi, Xcand, fmin, Xref = Xcand, w = NULL, nonug = FALSE, verb = 0)
mspeGP(gpi, Xcand, Xref = Xcand, fi = TRUE, verb = 0)
fishGP(gpi, Xcand)
alcoptGP(gpi, Xref, start, lower, upper, maxit = 100, verb = 0)
alcoptGPsep(gpsepi, Xref, start, lower, upper, maxit = 100, verb = 0)
dalcGP(gpi, Xcand, Xref = Xcand, verb = 0)
dalcGPsep(gpsepi, Xcand, Xref = Xcand, verb = 0)

Arguments

gpi

a C-side GP object identifier (positive integer); e.g., as returned by newGP

gpsepi

a C-side separable GP object identifier (positive integer); e.g., as returned by newGPsep

Xcand

a matrix or data.frame containing a design of candidate predictive locations at which the ALC (or other) criteria is (are) evaluated. In the context of laGP, these are the possible locations for adding into the current local design

fmin

for ieci* only: a scalar value indicating the value of the best minimum found so far. This is usually set to the minimum of the Z-values stored in the gpi or gpsepi reference (for deterministic/low nugget settings), or otherwise the predicted mean value at the X locations

Xref

a matrix or data.frame containing a design of reference locations for ALC or MSPE. I.e., these are the locations at which the reduction in variance, or mean squared predictive error, are calculated. In the context of laGP, this is the single location, or set of reference locations, around which a local design (for accurate prediction) is sought. For alcrayGP and alcrayGPsep the matrix may only have one row, i.e., one reference location

parallel

a switch indicating if any parallel calculation of the criteria (method) is desired. For parallel = "omp", the package must be compiled with OpenMP flags; for parallel = "gpu", the package must be compiled with CUDA flags (only the ALC criteria is supported on the GPU); see README/INSTALL in the package source for more details

Xstart

a 1-by-ncol(Xref) starting location for a search along a ray between Xstart and Xend

Xend

a 1-by-ncol(Xref) ending location for a search along a ray between Xstart and Xend

a scalar logical indicating if the expected Fisher information portion of the expression (MSPE is essentially ALC + c(x)*EFI) should be calculated (TRUE) or set to zero (FALSE). This flag is mostly for error checking against the other functions, alcGP and fishGP, since the constituent parts are separately available via those functions

weights on the reference locations Xref for IECI calculations; IECI, which stands for Integrated Expected Conditional Improvement, is not fully documented at this time. See Gramacy & Lee (2010) for more details.

nonug

a scalar logical indicating if a (nonzero) nugget should be used in the predictive equations behind IECI calculations; this allows the user to toggle improvement via predictive mean uncertainty versus full predictive uncertainty. The latter (default nonug = FALSE) is the standard approach, but the former may work better (citation forthcoming)

verb

a non-negative integer specifying the verbosity level; verb = 0 is quiet, and larger values cause more progress information to be printed to the screen

start

initial values to the derivative-based search via "L-BFGS-B" within alcoptGP and alcoptGPsep; a nearest neighbor often represents a sensible initialization

lower, upper

bounds on the derivative-based search via "L-BFGS-B" within alcoptGP and alcoptGPsep

maxit

the maximum number of iterations (default maxit=100) in "L-BFGS-B" search within alcoptGP and alcoptGPsep

Value

Except for alcoptGP, alcoptGPsep, dalcGP, and dalcGPsep, a vector of length nrow(Xcand) is returned filled with values corresponding to the desired statistic

par

the best set of parameters/input configuration found on optimization

value

the optimized objective value corresponding to output par

its

a two-element integer vector giving the number of calls to the object function and the gradient respectively.

msg

a character string giving any additional information returned by the optimizer, or NULL

convergence

An integer code. 0 indicates successful completion. For the other error codes, see the documentation for optim

alcs

reduced predictive variance averaged over the reference locations

dalcs

the derivative of alcs with respect to the new location

Details

The best way to see how these functions are used in the context of local approximation is to inspect the code in the laGP.R function.

Otherwise they are pretty self-explanatory. They evaluate the ALC, MSPE, and EFI quantities outlined in Gramacy & Apley (2015). ALC is originally due to Seo, et al. (2000). The ray-based search is described by Gramacy & Haaland (2015).

MSPE and EFI calculations are not supported for separable GP models, i.e., there are no mspeGPsep or fishGPsep functions.

alcrayGP and alcrayGPsep allow only one reference location (nrow(Xref) = 1). alcoptGP and alcoptGPsep allow multiple reference locations. These optimize a continuous ALC analog in its natural logarithm using the starting locations, bounding boxes and (stored) GP provided by gpi or gpisep, and finally snaps the solution back to the candidate grid. For details, see Sun, et al. (2017).

Note that ieciGP and ieciGPsep, which are for optimization via integrated expected conditional improvement (Gramacy & Lee, 2011) are “alpha” functionality and are not fully documented at this time.

References

F. Sun, R.B. Gramacy, B. Haaland, E. Lawrence, and A. Walker (2019). Emulating satellite drag from large simulation experiments, SIAM/ASA Journal on Uncertainty Quantification, 7(2), pp. 720-759; preprint on arXiv:1712.00182; http://arxiv.org/abs/1712.00182

R.B. Gramacy (2016). laGP: Large-Scale Spatial Modeling via Local Approximate Gaussian Processes in R, Journal of Statistical Software, 72(1), 1-46; or see vignette("laGP")

R.B. Gramacy and B. Haaland (2016). Speeding up neighborhood search in local Gaussian process prediction, Technometrics, 58(3), pp. 294-303; preprint on arXiv:1409.0074; http://arxiv.org/abs/1409.0074

R.B. Gramacy and D.W. Apley (2015). Local Gaussian process approximation for large computer experiments, Journal of Computational and Graphical Statistics, 24(2), pp. 561-678; preprint on arXiv:1303.0383; http://arxiv.org/abs/1303.0383

R.B. Gramacy, J. Niemi, R.M. Weiss (2014). Massively parallel approximate Gaussian process regression, SIAM/ASA Journal on Uncertainty Quantification, 2(1), pp. 568-584; preprint on arXiv:1310.5182; http://arxiv.org/abs/1310.5182

R.B. Gramacy, H.K.H. Lee (2011). Optimization under unknown constraints, Valencia discussion paper, in Bayesian Statistics 9. Oxford University Press; preprint on arXiv:1004.4027; http://arxiv.org/abs/1004.4027

S. Seo, M., Wallat, T. Graepel, K. Obermayer (2000). Gaussian Process Regression: Active Data Selection and Test Point Rejection, In Proceedings of the International Joint Conference on Neural Networks, vol. III, 241-246. IEEE

Examples

Run this code

# NOT RUN {
## this follows the example in predGP, but only evaluates 
## information statistics documented here

## Simple 2-d test function used in Gramacy & Apley (2015);
## thanks to Lee, Gramacy, Taddy, and others who have used it before
f2d <- function(x, y=NULL)
  {
    if(is.null(y)) {
      if(!is.matrix(x) && !is.data.frame(x)) x <- matrix(x, ncol=2)
      y <- x[,2]; x <- x[,1]
    }
    g <- function(z)
      return(exp(-(z-1)^2) + exp(-0.8*(z+1)^2) - 0.05*sin(8*(z+0.1)))
    z <- -g(x)*g(y)
  }

## design with N=441
x <- seq(-2, 2, length=11)
X <- expand.grid(x, x)
Z <- f2d(X)

## fit a GP
gpi <- newGP(X, Z, d=0.35, g=1/1000, dK=TRUE)

## predictive grid with NN=400
xx <- seq(-1.9, 1.9, length=20)
XX <- expand.grid(xx, xx)

## predict
alc <- alcGP(gpi, XX)
mspe <- mspeGP(gpi, XX)
fish <- fishGP(gpi, XX)

## visualize the result
par(mfrow=c(1,3))
image(xx, xx, matrix(sqrt(alc), nrow=length(xx)), col=heat.colors(128),
      xlab="x1", ylab="x2", main="sqrt ALC")
image(xx, xx, matrix(sqrt(mspe), nrow=length(xx)), col=heat.colors(128),
      xlab="x1", ylab="x2", main="sqrt MSPE")
image(xx, xx, matrix(log(fish), nrow=length(xx)), col=heat.colors(128),
      xlab="x1", ylab="x2", main="log fish")

## clean up
deleteGP(gpi)


## 
## Illustrating some of the other functions in a sequential design context, 
## using X and XX above
## 

## new, much bigger design
x <- seq(-2, 2, by=0.02)
X <- expand.grid(x, x)
Z <- f2d(X)

## first build a local design of size 25, see laGP documentation
out <- laGP.R(XX, start=6, end=25, X, Z, method="alc", close=10000)

## extract that design and fit GP
XC <- X[out$Xi,] ## inputs
ZC <- Z[out$Xi]  ## outputs
gpi <- newGP(XC, ZC, d=out$mle$d, g=out$g$start)

## calculate the ideal "next" location via continuous ALC optimization
alco <- alcoptGP(gpi=gpi, Xref=XX, start=c(0,0), lower=range(x)[1], upper=range(x)[2])

## alco$par is the "new" location; calculate distances between candidates (remaining
## unchosen X locations) and this solution
Xcan <- X[-out$Xi,]
D <- distance(Xcan, matrix(alco$par, ncol=ncol(Xcan))) 

## snap the new location back to the candidate set
lab <- which.min(D) 
xnew <- Xcan[lab,] 
## add xnew to the local design, remove it from Xcan, and repeat

## evaluate the derivative at this new location
dalc <- dalcGP(gpi=gpi, Xcand=matrix(xnew, nrow=1), Xref=XX)

## clean up
deleteGP(gpi)
# }

Run the code above in your browser using DataLab