alcGP: Improvement statistics for sequential or local design

Description

Calculate the active learning Cohn (ALC) statistic, mean-squared predictive error (MSPE), or expected Fisher information (fish) for a Gaussian process (GP) predictor relative to a set of reference locations, towards sequential design or local search for Gaussian process regression.

Usage

alcGP(gpi, Xcand, Xref = Xcand, parallel = c("none", "omp", "gpu"), 
      verb = 0)
alcGPsep(gpsepi, Xcand, Xref = Xcand, parallel = c("none", "omp", "gpu"), 
      verb = 0)
alcrayGP(gpi, Xref, Xstart, Xend, verb = 0)
alcrayGPsep(gpsepi, Xref, Xstart, Xend, verb = 0)
ieciGP(gpi, Xcand, fmin, Xref = Xcand, w = NULL, verb = 0)
ieciGPsep(gpsepi, Xcand, fmin, Xref = Xcand, w = NULL, verb = 0)
mspeGP(gpi, Xcand, Xref = Xcand, fi = TRUE, verb = 0)
fishGP(gpi, Xcand)

Arguments

gpi

a C-side GP object identifier (positive integer); e.g., as returned by newGP

gpsepi

a C-side separable GP object identifier (positive integer); e.g., as returned by newGPsep

Xcand

a matrix or data.frame containing a design of candidate predictive locations at which the ALC (or other) criteria is (are) evaluated. In the context of laGP, these are the possible locations for adding into the current local design

fmin

for ieci* only: a scalar value indicating the value of the best minimum found so far. This is usually set to the minimum of the Z-values stored in the gpi or gpsepi reference (for deterministic/low nugget settings), or otherwise the predicted mean value at the X locations

Xref

a matrix or data.frame containing a design of reference locations for ALC or MSPE. I.e., these are the locations at which the reduction in variance, or mean squared predictive error, are calculated. In the context of laGP, this is the single location Xref = x around which a local design is sought. For alcrayGP and alcrayGPsep the matrix may only have one row, i.e., one reference location

parallel

a switch indicating if any parallel calculation of the criteria (method) is desired. For parallel = "omp", the package must be compiled with OpenMP flags; for parallel = "gpu", the package must be compiled with CUDA flags (only the ALC criterion is supported on the GPU); see README/INSTALL in the package source for more details

Xstart

a 1-by-ncol(Xref) starting location for a search along a ray between Xstart and Xend

Xend

a 1-by-ncol(Xref) ending location for a search along a ray between Xstart and Xend

fi

a scalar logical indicating if the expected Fisher information portion of the expression (MSPE is essentially ALC + c(x)*EFI) should be calculated (TRUE) or set to zero (FALSE). This flag is mostly for error checking against the other functions, alcGP and fishGP, since the constituent parts are separately available via those functions

w

weights on the reference locations Xref for IECI calculations; IECI is not fully documented at this time

verb

a non-negative integer specifying the verbosity level; verb = 0 is quiet, and larger values cause more progress information to be printed to the screen

Value

A vector of length nrow(Xcand) is returned filled with values corresponding to the desired statistic
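
As an illustration of how the returned vector might be used, the sketch below evaluates ALC over a candidate grid and picks the row of Xcand with the largest value. It assumes the usual larger-is-better reading of ALC as a variance-reduction criterion; the toy response and the grid sizes are placeholders chosen only for this example and do not come from the package.

```r
library(laGP)

## small illustrative design; the response is a hypothetical stand-in
X <- as.matrix(expand.grid(seq(-2, 2, length=6), seq(-2, 2, length=6)))
Z <- sin(X[,1]) * cos(X[,2])
gpi <- newGP(X, Z, d=0.35, g=1/1000)

## candidate locations; Xref defaults to Xcand
Xcand <- as.matrix(expand.grid(seq(-1.9, 1.9, length=10),
                               seq(-1.9, 1.9, length=10)))
alc <- alcGP(gpi, Xcand)

## one value per candidate row; take the candidate with the largest ALC
best <- Xcand[which.max(alc), , drop=FALSE]

## clean up
deleteGP(gpi)
```

For mspeGP the selection would presumably use which.min instead, since MSPE is an error measure; treat that as an assumption to check against laGP.R.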

Details

The best way to see how these functions are used in the context of local approximation is to inspect the code in the laGP.R function.

Otherwise, the functions are largely self-explanatory: they evaluate the ALC, MSPE, and EFI quantities outlined in Gramacy & Apley (2015). ALC is originally due to Seo et al. (2000). The ray-based search is described by Gramacy & Haaland (2015).
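
A greedy sequential design built on alcGP can be sketched as follows. This is a minimal illustration rather than the procedure used inside laGP: the test function f, the random starting design, the fixed reference set, and the number of iterations are all assumptions made for the example. It relies on updateGP to add each selected point to the C-side GP.

```r
library(laGP)

## hypothetical test function, for illustration only
f <- function(X) sin(X[,1]) * cos(X[,2])

## small random starting design
set.seed(1)
X <- matrix(runif(20, -2, 2), ncol=2)
Z <- f(X)
gpi <- newGP(X, Z, d=0.35, g=1/1000)

## candidate grid; the reference set is held fixed at the full grid
Xcand <- as.matrix(expand.grid(seq(-2, 2, length=15), seq(-2, 2, length=15)))
Xref <- Xcand

for(i in 1:5) {
  ## score every remaining candidate against the reference set
  alc <- alcGP(gpi, Xcand, Xref=Xref)
  w <- which.max(alc)
  xnew <- Xcand[w, , drop=FALSE]

  ## add the chosen point to the GP and remove it from the candidates
  updateGP(gpi, xnew, f(xnew))
  X <- rbind(X, xnew)
  Xcand <- Xcand[-w, , drop=FALSE]
}

## clean up
deleteGP(gpi)
```

Each pass re-evaluates ALC because the statistic depends on the current design; removing the selected row keeps the same location from being chosen twice.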

MSPE and EFI calculations are not supported for separable GP models. I.e., there are no mspeGPsep or fishGPsep functions.
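
For separable models, the ALC call mirrors the isotropic one. The sketch below assumes a GP created with newGPsep using one lengthscale per input column; the design, response, and lengthscale values are illustrative placeholders.

```r
library(laGP)

## small illustrative design; the response is a hypothetical stand-in
X <- as.matrix(expand.grid(seq(-2, 2, length=6), seq(-2, 2, length=6)))
Z <- sin(X[,1]) * cos(X[,2])

## separable GP: one lengthscale per column of X
gpsepi <- newGPsep(X, Z, d=c(0.35, 0.35), g=1/1000)

## ALC over a candidate grid, with Xref defaulting to Xcand
Xcand <- as.matrix(expand.grid(seq(-1.9, 1.9, length=10),
                               seq(-1.9, 1.9, length=10)))
alcsep <- alcGPsep(gpsepi, Xcand)

## clean up with the separable destructor
deleteGPsep(gpsepi)
```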

alcrayGP and alcrayGPsep allow only one reference location (nrow(Xref) = 1).

Note that ieciGP and ieciGPsep are alpha functionality and are not fully documented at this time.

References

R.B. Gramacy (2016). laGP: Large-Scale Spatial Modeling via Local Approximate Gaussian Processes in R. Journal of Statistical Software, 72(1), 1-46; or see vignette("laGP")

R.B. Gramacy and D.W. Apley (2015). Local Gaussian process approximation for large computer experiments. Journal of Computational and Graphical Statistics, 24(2), pp. 561-578; preprint on arXiv:1303.0383; http://arxiv.org/abs/1303.0383

R.B. Gramacy, J. Niemi, R.M. Weiss (2014). Massively parallel approximate Gaussian process regression. SIAM/ASA Journal on Uncertainty Quantification, 2(1), pp. 568-584; preprint on arXiv:1310.5182; http://arxiv.org/abs/1310.5182

R.B. Gramacy and B. Haaland (2015). Speeding up neighborhood search in local Gaussian process prediction. Technometrics, to appear; preprint on arXiv:1409.0074 http://arxiv.org/abs/1409.0074

Seo, S., Wallat, M., Graepel, T., Obermayer, K. (2000). Gaussian Process Regression: Active Data Selection and Test Point Rejection. In Proceedings of the International Joint Conference on Neural Networks, vol. III, 241-246. IEEE.

Examples

## this follows the example in predGP, but only evaluates 
## information statistics documented here

## Simple 2-d test function used in Gramacy & Apley (2015);
## thanks to Lee, Gramacy, Taddy, and others who have used it before
f2d <- function(x, y=NULL)
  {
    if(is.null(y)) {
      if(!is.matrix(x)) x <- matrix(x, ncol=2)
      y <- x[,2]; x <- x[,1]
    }
    g <- function(z)
      return(exp(-(z-1)^2) + exp(-0.8*(z+1)^2) - 0.05*sin(8*(z+0.1)))
    z <- -g(x)*g(y)
  }

## design with N=121 (an 11 x 11 grid)
x <- seq(-2, 2, length=11)
X <- as.matrix(expand.grid(x, x))
Z <- f2d(X)

## fit a GP
gpi <- newGP(X, Z, d=0.35, g=1/1000, dK=TRUE)

## predictive grid with NN=400
xx <- seq(-1.9, 1.9, length=20)
XX <- as.matrix(expand.grid(xx, xx))

## evaluate the ALC, MSPE, and EFI statistics on the grid
alc <- alcGP(gpi, XX)
mspe <- mspeGP(gpi, XX)
fish <- fishGP(gpi, XX)

## visualize the result
par(mfrow=c(1,3))
image(xx, xx, matrix(sqrt(alc), nrow=length(xx)), col=heat.colors(128),
      xlab="x1", ylab="x2", main="sqrt ALC")
image(xx, xx, matrix(sqrt(mspe), nrow=length(xx)), col=heat.colors(128),
      xlab="x1", ylab="x2", main="sqrt MSPE")
image(xx, xx, matrix(log(fish), nrow=length(xx)), col=heat.colors(128),
      xlab="x1", ylab="x2", main="log fish")

## clean up
deleteGP(gpi)
