Facilitates localized Gaussian process inference and prediction at a large
set of predictive locations, by (essentially) calling laGP
at each location, and returning the moments of the predictive
equations, and indices into the design, thus obtained
aGP(X, Z, XX, start = 6, end = 50, d = NULL, g = 1/10000,
method = c("alc", "alcray", "mspe", "nn", "fish"), Xi.ret = TRUE,
close = min((1000+end)*if(method[1] == "alcray") 10 else 1, nrow(X)),
numrays = ncol(X), num.gpus = 0, gpu.threads = num.gpus,
omp.threads = if (num.gpus > 0) 0 else 1,
nn.gpu = if (num.gpus > 0) nrow(XX) else 0, verb = 1)
aGP.parallel(cls, XX, chunks = length(cls), X, Z, start = 6, end = 50,
d = NULL, g = 1/10000, method = c("alc", "alcray", "mspe", "nn", "fish"),
Xi.ret = TRUE,
close = min((1000+end)*if(method[1] == "alcray") 10 else 1, nrow(X)),
numrays = ncol(X), num.gpus = 0, gpu.threads = num.gpus,
omp.threads = if (num.gpus > 0) 0 else 1,
nn.gpu = if (num.gpus > 0) nrow(XX) else 0, verb = 1)
aGP.R(X, Z, XX, start = 6, end = 50, d = NULL, g = 1/10000,
method = c("alc", "alcray", "mspe", "nn", "fish"), Xi.ret = TRUE,
close = min((1000+end) *if(method[1] == "alcray") 10 else 1, nrow(X)),
numrays = ncol(X), laGP=laGP.R, verb = 1)
aGPsep(X, Z, XX, start = 6, end = 50, d = NULL, g = 1/10000,
method = c("alc", "alcray", "nn"), Xi.ret = TRUE,
close = min((1000+end)*if(method[1] == "alcray") 10 else 1, nrow(X)),
numrays = ncol(X), omp.threads = 1, verb = 1)
aGPsep.R(X, Z, XX, start = 6, end = 50, d = NULL, g = 1/10000,
method = c("alc", "alcray", "nn"), Xi.ret = TRUE,
close = min((1000+end)*if(method[1] == "alcray") 10 else 1, nrow(X)),
numrays = ncol(X), laGPsep=laGPsep.R, verb = 1)
aGP.seq(X, Z, XX, d, methods=rep("alc", 2), M=NULL, ncalib=0, ...)
a matrix
or data.frame
containing
the full (large) design matrix of input locations
a vector of responses/dependent values with length(Z) = nrow(X)
a matrix
or data.frame
of out-of-sample
predictive locations with ncol(XX) = ncol(X)
; aGP
calls laGP
for
each row of XX
as a value of Xref
, independently
the number of Nearest Neighbor (NN) locations to start each
independent call to laGP
with; must have start >= 6
the total size of the local designs; start < end
a prior or initial setting for the lengthscale
parameter in a Gaussian correlation function; a (default)
NULL
value triggers a sensible regularization (prior) and
initial setting to be generated via darg
;
a scalar specifies an initial value, causing darg
to only generate the prior; otherwise,
a list or partial list matching the output
of darg
can be used to specify a custom prior. In the
case of a partial list, only the missing entries will be
generated. Note that a default/generated list specifies MLE/MAP
inference for this parameter. When specifying initial values, a
vector of length nrow(XX)
can be provided, giving a
different initial value for each predictive location. With
aGPsep
, the starting values can be an
ncol(X)
-by-nrow(XX)
matrix
or an ncol(X)
vector
a prior or initial setting for the nugget parameter; a
NULL
value causes a sensible regularization (prior) and
initial setting to be generated via garg
; a scalar
(default g = 1/10000
) specifies an initial value, causing garg
to only generate the prior; otherwise, a
list or partial list matching the output of garg
can be used to
specify a custom prior. In the case of a partial list, only the
missing entries will be generated. Note that a default/generated list
specifies no inference for this parameter; i.e., it is fixed
at its starting or default value, which may be appropriate for
emulating deterministic computer code output. In such situations a
value much smaller than the default may work even better (i.e.,
yield better out-of-sample predictive performance).
The default was chosen conservatively. When specifying non-default initial
values, a vector of length nrow(XX)
can be provided, giving a different
initial value for each predictive location
specifies the method by which end-start
candidates from
X
are chosen in order to predict at each row XX
independently.
In brief, ALC ("alc"
, default) minimizes predictive variance;
ALCRAY ("alcray")
executes a thrifty search focused on rays emanating
from the reference location(s); MSPE
("mspe"
) augments ALC with extra derivative information to
minimize mean-squared prediction error (requires extra computation);
NN ("nn"
) uses nearest neighbor; and ("fish"
) uses
the expected Fisher information - essentially 1/G
from
Gramacy & Apley (2015) - which is global heuristic, i.e., not
localized to each row of XX
for aGP.seq
this is a vectorized method
argument,
containing a list of valid methods to perform in sequence. When methods = FALSE
a call to M
is invoked instead; see below for
more details
a scalar logical indicating whether or not a matrix
of indices
into X
, describing the chosen sub-design for each of the
predictive locations in XX
, should be returned on
output
a non-negative integer end < close <= nrow(X)
specifying the number of NNs
(to each row XX
) in X
to consider when
searching for the sub-design; close = 0
specifies all.
For method="alcray"
this
specifies the scope used to snap ray-based solutions back to
elements of X
, otherwise there are no restrictions on that
search
a scalar integer indicating the number of rays for each
greedy search; only relevant when method="alcray"
. More rays
leads to a more thorough, but more computationally intensive search
applicable only to the C-version aGP
, this is a scalar
positive integer indicating the number of GPUs available for calculating
ALC (see alcGP
); the package must be compiled for CUDA support;
see README/INSTALL in the package source for more details
applicable only to the C-version aGP
; this
is a scalar positive integer indicating the number of SMP (i.e., CPU)
threads queuing ALC jobs on a GPU; the package must be compiled for CUDA support.
If gpu.threads >= 2
then the package must also
be compiled for OpenMP support; see README/INSTALL in the package source for
more details. We recommend setting gpu.threads
to up to two-times
the sum of the number of GPU devices and CPU cores.
Only method = "alc"
is supported when using CUDA. If the
sum of omp.threads
and gpu.threads
is bigger than the
max allowed by your system, then that max is used instead (giving
gpu.threads
preference)
applicable only to the C-version aGP
;
this is a scalar positive integer indicating the number
of threads to use for SMP parallel processing; the package must be
compiled for OpenMP support; see README/INSTALL in the package source for
more details. For most Intel-based machines, we recommend setting
omp.threads
to up to two-times the number of hyperthreaded cores. When
using GPUs (num.gpu > 0
), a good default is omp.threads=0
,
otherwise load balancing could be required; see nn.gpu
below. If the
sum of omp.threads
and gpu.threads
is bigger than the
max allowed by your system, then that max is used instead (giving
gpu.threads
preference)
a scalar non-negative integer between 0
and
nrow(XX)
indicating the number of predictive locations
utilizing GPU ALC calculations. Note this argument is only useful when
both gpu.threads
and omp.threads
are non-zero, whereby
it acts as a load balancing mechanism
a non-negative integer specifying the verbosity level; verb = 0
is quiet, and larger values cause more progress information to be
printed to the screen. The value min(0,verb-1)
is provided
to each laGP
call
a cluster object created by makeCluster
from the parallel or snow packages
a scalar integer indicating the number of chunks to break XX
into
for parallel evaluation on cluster cls
.
Usually chunks = length(cl)
is appropriate.
However specifying more chunks can be useful when the nodes of
the cluster are not homogeneous
an optional function taking two matrix inputs, of ncol(X)-ncalib
and ncalib
columns respectively, which is applied in lieu of
aGP
. This can be useful for calibration where the computer model
is cheap, i.e., does not require emulation; more details below
an integer between 1 and ncol(X)
indicating how to
partition X
and XX
inputs into the two matrices required for M
other arguments passed from aGP.sep
to aGP
The output is a list
with the following components.
a vector of predictive means of length nrow(XX)
a vector of predictive variances of length
nrow(Xref)
a vector indicating the log likelihood/posterior
probability of the data/parameter(s) under the chosen sub-design for
each predictive location in XX
; provided up to an additive constant
a scalar giving the passage of wall-clock time elapsed for (substantive parts of) the calculation
a copy of the method
argument
a full-list version of the d
argument, possibly completed by darg
a full-list version of the g
argument, possibly
completed by garg
if d$mle
and/or g$mle
are TRUE
, then
mle
is a data.frame
containing the values found for
these parameters, and the number of required iterations, for each
predictive location in XX
when Xi.ret = TRUE
, this field contains a matrix
of
indices of length end
into X
indicating the sub-design
chosen for each predictive location in XX
a copy of the input argument
The aGP.seq function only returns the output from the final aGP call. When methods = FALSE and M is supplied, the returned object is a data frame with a mean column indicating the output of the computer model run, and a var column, which at this time is zero
This function invokes laGP
with argument Xref
= XX[i,]
for each i=1:nrow(XX)
, building up local designs,
inferring local correlation parameters, and
obtaining predictive locations independently for each location. For
more details see laGP
.
The function aGP.R
is a prototype R-only version for
debugging and transparency purposes. It is slower than
aGP
, which is primarily in C. However it may be
useful for developing new programs that involve similar subroutines.
Note that aGP.R
may provide different output than aGP
due to differing library subroutines deployed in R and C.
The function aGP.parallel
allows aGP
to be called on segments
of the XX
matrix distributed to a cluster created by parallel.
It breaks XX
into chunks
which are sent to aGP
workers pointed to by the entries of cls
. The aGP.parallel
function
collects the outputs from each chunk before returning an object
almost identical to what would have been returned from a single aGP
call. On a single (SMP) node, this represents is a poor-man's version of
the OpenMP version described below. On multiple nodes both can be used.
If compiled with OpenMP flags, the independent calls to
laGP
will be
farmed out to threads allowing them to proceed in parallel - obtaining
nearly linear speed-ups. At this time aGP.R
does not
facilitate parallel computation, although a future version may exploit
the parallel functionality for clustered parallel execution.
If num.gpus > 0
then the ALC part of the independent
calculations performed by each thread will be offloaded to a GPU.
If both gpu.threads >= 1
and omp.threads >= 1
,
some of the ALC calculations will be done on the GPUs, and some
on the CPUs. In our own experimentation we have not found this
to lead to large speedups relative to omp.threads = 0
when
using GPUs. For more details, see Gramacy, Niemi, & Weiss (2014).
The aGP.sep
function is provided primarily for use in calibration
exercises, see Gramacy, et al. (2015). It automates a sequence of
aGP
calls, each with a potentially different method,
successively feeding the previous estimate of local lengthscale (d
)
in as an initial set of values for the next call. It also allows the
use of aGP
to be bypassed, feeding the inputs into a user-supplied
M
function instead. This feature is enabled when
methods = FALSE
. The M
function takes two matrices
(same number of rows) as inputs, where the first ncol(X) - ncalib
columns represent
“field data” inputs shared by the physical and computer model
(in the calibration context), and the remaining ncalib
are
the extra tuning or calibration parameters required to evalue the computer
model. For examples illustrating aGP.seq
please see the
documentation file for discrep.est
and demo("calib")
R.B. Gramacy (2016). laGP: Large-Scale Spatial Modeling via
Local Approximate Gaussian Processes in R., Journal of Statistical
Software, 72(1), 1-46; or see vignette("laGP")
R.B. Gramacy and D.W. Apley (2015). Local Gaussian process approximation for large computer experiments. Journal of Computational and Graphical Statistics, 24(2), pp. 561-678; preprint on arXiv:1303.0383; http://arxiv.org/abs/1303.0383
R.B. Gramacy, J. Niemi, R.M. Weiss (2014). Massively parallel approximate Gaussian process regression. SIAM/ASA Journal on Uncertainty Quantification, 2(1), pp. 568-584; preprint on arXiv:1310.5182; http://arxiv.org/abs/1310.5182
R.B. Gramacy and B. Haaland (2016). Speeding up neighborhood search in local Gaussian process prediction. Technometrics, 58(3), pp. 294-303; preprint on arXiv:1409.0074 http://arxiv.org/abs/1409.0074
vignette("laGP")
,
laGP
, alcGP
, mspeGP
, alcrayGP
,
makeCluster
, clusterApply
# NOT RUN { ## first, a "computer experiment" ## Simple 2-d test function used in Gramacy & Apley (2014); ## thanks to Lee, Gramacy, Taddy, and others who have used it before f2d <- function(x, y=NULL) { if(is.null(y)){ if(!is.matrix(x) && !is.data.frame(x)) x <- matrix(x, ncol=2) y <- x[,2]; x <- x[,1] } g <- function(z) return(exp(-(z-1)^2) + exp(-0.8*(z+1)^2) - 0.05*sin(8*(z+0.1))) z <- -g(x)*g(y) } ## build up a design with N=~40K locations x <- seq(-2, 2, by=0.02) X <- expand.grid(x, x) Z <- f2d(X) ## predictive grid with NN=400 locations, ## change NN to 10K (length=100) to mimic setup in Gramacy & Apley (2014) ## the low NN set here is for fast CRAN checks xx <- seq(-1.975, 1.975, length=10) XX <- expand.grid(xx, xx) ZZ <- f2d(XX) ## get the predictive equations, first based on Nearest Neighbor out <- aGP(X, Z, XX, method="nn", verb=0) ## RMSE sqrt(mean((out$mean - ZZ)^2)) # } # NOT RUN { ## refine with ALC out2 <- aGP(X, Z, XX, method="alc", d=out$mle$d) ## RMSE sqrt(mean((out2$mean - ZZ)^2)) ## visualize the results par(mfrow=c(1,3)) image(xx, xx, matrix(out2$mean, nrow=length(xx)), col=heat.colors(128), xlab="x1", ylab="x2", main="predictive mean") image(xx, xx, matrix(out2$mean-ZZ, nrow=length(xx)), col=heat.colors(128), xlab="x1", ylab="x2", main="bias") image(xx, xx, matrix(sqrt(out2$var), nrow=length(xx)), col=heat.colors(128), xlab="x1", ylab="x2", main="sd") ## refine with MSPE out3 <- aGP(X, Z, XX, method="mspe", d=out2$mle$d) ## RMSE sqrt(mean((out3$mean - ZZ)^2)) # } # NOT RUN { ## version with ALC-ray which is much faster than the ones not ## run above out2r <- aGP(X, Z, XX, method="alcray", d=out$mle$d, verb=0) sqrt(mean((out2r$mean - ZZ)^2)) ## a simple example with estimated nugget library(MASS) ## motorcycle data and predictive locations X <- matrix(mcycle[,1], ncol=1) Z <- mcycle[,2] XX <- matrix(seq(min(X), max(X), length=100), ncol=1) ## first stage out <- aGP(X=X, Z=Z, XX=XX, end=30, g=list(mle=TRUE), verb=0) ## plot smoothed versions of the estimated parameters par(mfrow=c(2,1)) df <- data.frame(y=log(out$mle$d), XX) lo <- loess(y~., data=df, span=0.25) plot(XX, log(out$mle$d), type="l") lines(XX, lo$fitted, col=2) dfnug <- data.frame(y=log(out$mle$g), XX) lonug <- loess(y~., data=dfnug, span=0.25) plot(XX, log(out$mle$g), type="l") lines(XX, lonug$fitted, col=2) ## second stage design out2 <- aGP(X=X, Z=Z, XX=XX, end=30, verb=0, d=list(start=exp(lo$fitted), mle=FALSE), g=list(start=exp(lonug$fitted))) ## plot the estimated surface par(mfrow=c(1,1)) plot(X,Z) df <- 20 s2 <- out2$var*(df-2)/df q1 <- qt(0.05, df)*sqrt(s2) + out2$mean q2 <- qt(0.95, df)*sqrt(s2) + out2$mean lines(XX, out2$mean) lines(XX, q1, col=1, lty=2) lines(XX, q2, col=1, lty=2) ## compare to the single-GP result provided in the mleGP documentation # }