laGP (version 1.5-5)

# alcGP: Improvement statistics for sequential or local design

## Description

Calculate the active learning Cohn (ALC) statistic, mean-squared predictive error (MSPE) or expected Fisher information (fish) for a Gaussian process (GP) predictor relative to a set of reference locations, towards sequential design or local search for Gaussian process regression

## Usage

```alcGP(gpi, Xcand, Xref = Xcand, parallel = c("none", "omp", "gpu"),
verb = 0)
alcGPsep(gpsepi, Xcand, Xref = Xcand, parallel = c("none", "omp", "gpu"),
verb = 0)
alcrayGP(gpi, Xref, Xstart, Xend, verb = 0)
alcrayGPsep(gpsepi, Xref, Xstart, Xend, verb = 0)
ieciGP(gpi, Xcand, fmin, Xref = Xcand, w = NULL, nonug = FALSE, verb = 0)
ieciGPsep(gpsepi, Xcand, fmin, Xref = Xcand, w = NULL, nonug = FALSE, verb = 0)
mspeGP(gpi, Xcand, Xref = Xcand, fi = TRUE, verb = 0)
fishGP(gpi, Xcand)
alcoptGP(gpi, Xref, start, lower, upper, maxit = 100, verb = 0)
alcoptGPsep(gpsepi, Xref, start, lower, upper, maxit = 100, verb = 0)
dalcGP(gpi, Xcand, Xref = Xcand, verb = 0)
dalcGPsep(gpsepi, Xcand, Xref = Xcand, verb = 0)```

## Arguments

gpi

a C-side GP object identifier (positive integer); e.g., as returned by `newGP`

gpsepi

a C-side separable GP object identifier (positive integer); e.g., as returned by `newGPsep`

Xcand

a `matrix` or `data.frame` containing a design of candidate predictive locations at which the ALC (or other) criteria is (are) evaluated. In the context of `laGP`, these are the possible locations for adding into the current local design

fmin

for `ieci*` only: a scalar value indicating the value of the best minimum found so far. This is usually set to the minimum of the `Z`-values stored in the `gpi` or `gpsepi` reference (for deterministic/low nugget settings), or otherwise the predicted mean value at the `X` locations

Xref

a `matrix` or `data.frame` containing a design of reference locations for ALC or MSPE. I.e., these are the locations at which the reduction in variance, or mean squared predictive error, are calculated. In the context of `laGP`, this is the single location, or set of reference locations, around which a local design (for accurate prediction) is sought. For `alcrayGP` and `alcrayGPsep` the `matrix` may only have one row, i.e., one reference location

parallel

a switch indicating if any parallel calculation of the criteria (`method`) is desired. For `parallel = "omp"`, the package must be compiled with OpenMP flags; for `parallel = "gpu"`, the package must be compiled with CUDA flags (only the ALC criteria is supported on the GPU); see README/INSTALL in the package source for more details

Xstart

a `1`-by-`ncol(Xref)` starting location for a search along a ray between `Xstart` and `Xend`

Xend

a `1`-by-`ncol(Xref)` ending location for a search along a ray between `Xstart` and `Xend`

fi

a scalar logical indicating if the expected Fisher information portion of the expression (MSPE is essentially `ALC + c(x)*EFI`) should be calculated (`TRUE`) or set to zero (`FALSE`). This flag is mostly for error checking against the other functions, `alcGP` and `fishGP`, since the constituent parts are separately available via those functions

w

weights on the reference locations `Xref` for IECI calculations; IECI, which stands for Integrated Expected Conditional Improvement, is not fully documented at this time. See Gramacy & Lee (2010) for more details.

nonug

a scalar logical indicating if a (nonzero) nugget should be used in the predictive equations behind IECI calculations; this allows the user to toggle improvement via predictive mean uncertainty versus full predictive uncertainty. The latter (default `nonug = FALSE`) is the standard approach, but the former may work better (citation forthcoming)

verb

a non-negative integer specifying the verbosity level; `verb = 0` is quiet, and larger values cause more progress information to be printed to the screen

start

initial values to the derivative-based search via `"L-BFGS-B"` within `alcoptGP` and `alcoptGPsep`; a nearest neighbor often represents a sensible initialization

lower, upper

bounds on the derivative-based search via `"L-BFGS-B"` within `alcoptGP` and `alcoptGPsep`

maxit

the maximum number of iterations (default `maxit=100`) in `"L-BFGS-B"` search within `alcoptGP` and `alcoptGPsep`

## Value

Except for `alcoptGP`, `alcoptGPsep`, `dalcGP`, and `dalcGPsep`, a vector of length `nrow(Xcand)` is returned filled with values corresponding to the desired statistic

par

the best set of parameters/input configuration found on optimization

value

the optimized objective value corresponding to output `par`

its

a two-element integer vector giving the number of calls to the object function and the gradient respectively.

msg

a character string giving any additional information returned by the optimizer, or `NULL`

convergence

An integer code. `0` indicates successful completion. For the other error codes, see the documentation for `optim`

alcs

reduced predictive variance averaged over the reference locations

dalcs

the derivative of `alcs` with respect to the new location

## Details

The best way to see how these functions are used in the context of local approximation is to inspect the code in the `laGP.R` function.

Otherwise they are pretty self-explanatory. They evaluate the ALC, MSPE, and EFI quantities outlined in Gramacy & Apley (2015). ALC is originally due to Seo, et al. (2000). The ray-based search is described by Gramacy & Haaland (2015).

MSPE and EFI calculations are not supported for separable GP models, i.e., there are no `mspeGPsep` or `fishGPsep` functions.

`alcrayGP` and `alcrayGPsep` allow only one reference location (`nrow(Xref) = 1`). `alcoptGP` and `alcoptGPsep` allow multiple reference locations. These optimize a continuous ALC analog in its natural logarithm using the starting locations, bounding boxes and (stored) GP provided by `gpi` or `gpisep`, and finally snaps the solution back to the candidate grid. For details, see Sun, et al. (2017).

Note that `ieciGP` and `ieciGPsep`, which are for optimization via integrated expected conditional improvement (Gramacy & Lee, 2011) are “alpha” functionality and are not fully documented at this time.

## References

F. Sun, R.B. Gramacy, B. Haaland, E. Lawrence, and A. Walker (2019). Emulating satellite drag from large simulation experiments, SIAM/ASA Journal on Uncertainty Quantification, 7(2), pp. 720-759; preprint on arXiv:1712.00182; http://arxiv.org/abs/1712.00182

R.B. Gramacy (2016). laGP: Large-Scale Spatial Modeling via Local Approximate Gaussian Processes in R, Journal of Statistical Software, 72(1), 1-46; or see `vignette("laGP")`

R.B. Gramacy and B. Haaland (2016). Speeding up neighborhood search in local Gaussian process prediction, Technometrics, 58(3), pp. 294-303; preprint on arXiv:1409.0074; http://arxiv.org/abs/1409.0074

R.B. Gramacy and D.W. Apley (2015). Local Gaussian process approximation for large computer experiments, Journal of Computational and Graphical Statistics, 24(2), pp. 561-678; preprint on arXiv:1303.0383; http://arxiv.org/abs/1303.0383

R.B. Gramacy, J. Niemi, R.M. Weiss (2014). Massively parallel approximate Gaussian process regression, SIAM/ASA Journal on Uncertainty Quantification, 2(1), pp. 568-584; preprint on arXiv:1310.5182; http://arxiv.org/abs/1310.5182

R.B. Gramacy, H.K.H. Lee (2011). Optimization under unknown constraints, Valencia discussion paper, in Bayesian Statistics 9. Oxford University Press; preprint on arXiv:1004.4027; http://arxiv.org/abs/1004.4027

S. Seo, M., Wallat, T. Graepel, K. Obermayer (2000). Gaussian Process Regression: Active Data Selection and Test Point Rejection, In Proceedings of the International Joint Conference on Neural Networks, vol. III, 241-246. IEEE

## See Also

`laGP`, `aGP`, `predGP`

## Examples

```# NOT RUN {
## this follows the example in predGP, but only evaluates
## information statistics documented here

## Simple 2-d test function used in Gramacy & Apley (2015);
## thanks to Lee, Gramacy, Taddy, and others who have used it before
f2d <- function(x, y=NULL)
{
if(is.null(y)) {
if(!is.matrix(x) && !is.data.frame(x)) x <- matrix(x, ncol=2)
y <- x[,2]; x <- x[,1]
}
g <- function(z)
return(exp(-(z-1)^2) + exp(-0.8*(z+1)^2) - 0.05*sin(8*(z+0.1)))
z <- -g(x)*g(y)
}

## design with N=441
x <- seq(-2, 2, length=11)
X <- expand.grid(x, x)
Z <- f2d(X)

## fit a GP
gpi <- newGP(X, Z, d=0.35, g=1/1000, dK=TRUE)

## predictive grid with NN=400
xx <- seq(-1.9, 1.9, length=20)
XX <- expand.grid(xx, xx)

## predict
alc <- alcGP(gpi, XX)
mspe <- mspeGP(gpi, XX)
fish <- fishGP(gpi, XX)

## visualize the result
par(mfrow=c(1,3))
image(xx, xx, matrix(sqrt(alc), nrow=length(xx)), col=heat.colors(128),
xlab="x1", ylab="x2", main="sqrt ALC")
image(xx, xx, matrix(sqrt(mspe), nrow=length(xx)), col=heat.colors(128),
xlab="x1", ylab="x2", main="sqrt MSPE")
image(xx, xx, matrix(log(fish), nrow=length(xx)), col=heat.colors(128),
xlab="x1", ylab="x2", main="log fish")

## clean up
deleteGP(gpi)

##
## Illustrating some of the other functions in a sequential design context,
## using X and XX above
##

## new, much bigger design
x <- seq(-2, 2, by=0.02)
X <- expand.grid(x, x)
Z <- f2d(X)

## first build a local design of size 25, see laGP documentation
out <- laGP.R(XX, start=6, end=25, X, Z, method="alc", close=10000)

## extract that design and fit GP
XC <- X[out\$Xi,] ## inputs
ZC <- Z[out\$Xi]  ## outputs
gpi <- newGP(XC, ZC, d=out\$mle\$d, g=out\$g\$start)

## calculate the ideal "next" location via continuous ALC optimization
alco <- alcoptGP(gpi=gpi, Xref=XX, start=c(0,0), lower=range(x), upper=range(x))

## alco\$par is the "new" location; calculate distances between candidates (remaining
## unchosen X locations) and this solution
Xcan <- X[-out\$Xi,]
D <- distance(Xcan, matrix(alco\$par, ncol=ncol(Xcan)))

## snap the new location back to the candidate set
lab <- which.min(D)
xnew <- Xcan[lab,]
## add xnew to the local design, remove it from Xcan, and repeat

## evaluate the derivative at this new location
dalc <- dalcGP(gpi=gpi, Xcand=matrix(xnew, nrow=1), Xref=XX)

## clean up
deleteGP(gpi)
# }
```