Defines an objective function for performing blackbox optimization towards solving a modularized calibration of large computer model simulation to field data

```
fcalib(u, XU, Z, X, Y, da, d, g, uprior = NULL, methods = rep("alc", 2),
M = NULL, bias = TRUE, omp.threads = 1, save.global = FALSE, verb = 1)
```

u

a vector of length `ncol(XU) - ncol(X)`

containing a setting
of the calibration parameter

XU

a `matrix`

or `data.frame`

containing
the full (large) design matrix of input locations to a computer
simulator whose final `ncol(XU) - ncol(X)`

columns
contain settings of a calibration or tuning parameter like
`u`

Z

a vector of responses/dependent values with `length(Z) = ncol(XU)`

of computer model outputs at `XU`

X

a `matrix`

or `data.frame`

containing
the full (large) design matrix of input locations

Y

a vector of values with `length(Y) = ncol(X)`

containing the response from field data observations
at `X`

. A `Y`

-vector with `length(Y) = k*ncol(X)`

,
for positive integer `k`

, can be supplied in which case
the multiple `Y`

-values will be treated as replicates
at the `X`

-values

da

for emulating `Z`

at `XU`

:
a prior or initial setting for the (single/isotropic) lengthscale
parameter in a Gaussian correlation function; a (default)
`NULL`

value triggers a sensible regularization (prior) and
initial setting to be generated via `darg`

;
a scalar specifies an initial value, causing `darg`

to only generate the prior; otherwise,
a list or partial list matching the output
of `darg`

can be used to specify a custom prior. In the
case of a partial list, the only the missing entries will be
generated. Note that a default/generated list specifies MLE/MAP
inference for this parameter. When specifying initial values, a
vector of length `nrow(XX)`

can be provided, giving a
different initial value for each predictive location

d

for the discrepancy between emulations `Yhat`

at `X`

, based
on `Z`

at `XU`

, and the oputs `Y`

observed at `X`

.
Otherwise, same description as `da`

above

g

for the nugget in the GP model for the discrepancy between emulation
`Yhat`

at `X`

, based
on `Z`

at `XU`

, and the outputs `Y`

observed at `X`

:
a prior or initial setting for the nugget parameter; a
`NULL`

value causes a sensible regularization (prior) and
initial setting to be generated via `garg`

; a scalar (default
`g = 1/1000`

) specifies an initial value, causing `garg`

to only generate the prior; otherwise, a list or partial list matching the
output of `garg`

can be used to specify a custom prior. In
the case of a partial list, only the missing entries will be
generated. Note that a default/generated list specifies *no*
inference for this parameter; i.e., it is fixed at its starting value,
which may be appropriate for emulating deterministic computer code output.
At this time, estimating a nugget for the computer model emulator is not
supported by `fcalib`

uprior

an optional function taking `u`

arguments which returns a log
prior density value for the calibration parameter.

methods

M

a computer model “simulation” function taking two matrices as
inputs, to be used in lieu of emulation; see `aGP.seq`

for mode details

bias

a scalar logical indicating whether a GP discrepancy or bias term should
be estimated via `discrep.est`

, as opposed to
only a Gaussian (zero-mean) variance;
see `discrep.est`

for more details

omp.threads

a scalar positive integer indicating the number
of threads to use for SMP parallel processing; see `aGP`

for more details

save.global

an environment, e.g., `.GlobalEnv`

if each evaluation of `fcalib`

, say as
called by a wrapper or optimization routine, should be saved. The variable
used in that environment will be `fcalib.save`

. Otherwise `save.global = FALSE`

will skip saving the information

verb

a non-negative integer specifying the verbosity level; `verb = 0`

is quiet, whereas a larger value causes each evaluation to be printed
to the screen

Returns a scalar measuring the negative log likelihood or posterior density
of the calibration parameter `u`

given the other inputs, for
the purpose of optimization over `u`

Gramacy, et al. (2015) defined an objective function which, when optimized,
returns a setting of calibration parameters under a setup akin to
the modularized calibration method of Liu, et al., (2009). The `fcalib`

function returns a log density (likelihood or posterior probability) value
obtained by performing emulation at a set of inputs `X`

augmented
with a value of the calibration parameter, `u`

. The emulator
is trained on `XU`

and `Z`

, presumed to be very large
relative to the size of the field data set `X`

and `Y`

,
necessitating the use of approximate methods like `aGP`

,
via `aGP.seq`

. The
emulated values, call them `Yhat`

are fed along with `X`

and
`Y`

into the `discrep.est`

function, whose
likelihood or posterior calculation serves as a measure of merit for
the value `u`

.

The `fcalib`

function is deterministic but, as Gramacy, et al. (2015)
described, can result is a rugged objective surface for optimizing,
meaning that conventional methods, like those in `optim`

are unlikely to work well. They instead recommend using a blackbox
derivative-free method, like NOMAD (Le Digabel, 2011). In our example
below we use the implementation in the crs package, which provides
an R wrapper around the underlying C library.

Note that while `fcalib`

automates a call first to `aGP.seq`

and then to `discrep.est`

, it does not return enough
information to complete, say, an out-of-sample prediction exercise like
the one demonstrated in the `discrep.est`

documentation.
Therefore, after `fcalib`

is used in an optimization to
find the best setting of the calibration parameter, `u`

,
those functions must then be used in post-processing to complete a
prediction exercise. See `demo("calib")`

or `vignette("laGP")`

for more details

R.B. Gramacy (2016). *laGP: Large-Scale Spatial Modeling via
Local Approximate Gaussian Processes in R.*, Journal of Statistical
Software, 72(1), 1-46; or see `vignette("laGP")`

R.B. Gramacy, D. Bingham, JP. Holloway, M.J. Grosskopf, C.C. Kuranz, E. Rutter,
M. Trantham, and P.R. Drake (2015). *Calibrating a large computer
experiment simulating radiative shock hydrodynamics.*
Annals of Applied Statistics, 9(3) 1141-1168; preprint on arXiv:1410.3293
http://arxiv.org/abs/1410.3293

F. Liu, M. Bayarri, and J. Berger (2009).
*Modularization in Bayesian analysis, with emphasis on analysis of computer models.*
Bayesian Analysis, 4(1) 119-150.

S. Le Digabel (2011).
*Algorithm 909: NOMAD: Nonlinear Optimization with the MADS algorithm*.
ACM Transactions on Mathematical Software, 37, 4, 44:1-44:15.

J.S. Racine, Z. and Nie (2012). crs:
*Categorical regression splines*. R package version 0.15-18.

`vignette("laGP")`

,
`jmleGP`

, `newGP`

, `aGP.seq`

, `discrep.est`

,
`snomadr`

# NOT RUN { ## the example here illustrates how fcalib combines aGP.seq and ## discrep.est functions, duplicating the example in the discrep.est ## documentation file. It is comprised of snippets from demo("calib"), ## which contains code from the Calibration Section of vignette("laGP") ## Here we generate calibration data using a true calibration ## parameter, u, and then evaluate log posterior probabilities; ## the discrep.est documentation repeats this with by first calling ## aGP.seq and then discrep.est. The answers should be identical, however ## note that a call first to example("fcalib") and then ## example("discrep.est") will generate two random data sets, causing ## the results not to match ## begin data-generation code identical to aGP.seq, discrep.est, fcalib ## example sections and demo("calib") ## M: computer model test function used in Goh et al, 2013 (Technometrics) ## an elaboration of one from Bastos and O'Hagan, 2009 (Technometrics) M <- function(x,u) { x <- as.matrix(x) u <- as.matrix(u) out <- (1-exp(-1/(2*x[,2]))) out <- out * (1000*u[,1]*x[,1]^3+1900*x[,1]^2+2092*x[,1]+60) out <- out / (100*u[,2]*x[,1]^3+500*x[,1]^2+4*x[,1]+20) return(out) } ## bias: discrepancy function from Goh et al, 2013 bias <- function(x) { x<-as.matrix(x) out<- 2*(10*x[,1]^2+4*x[,2]^2) / (50*x[,1]*x[,2]+10) return(out) } ## beta.prior: marginal beta prior for u, ## defaults to a mode at 1/2 in hypercube beta.prior <- function(u, a=2, b=2, log=TRUE) { if(length(a) == 1) a <- rep(a, length(u)) else if(length(a) != length(u)) stop("length(a) must be 1 or length(u)") if(length(b) == 1) b <- rep(b, length(u)) else if(length(b) != length(u)) stop("length(b) must be 1 or length(u)") if(log) return(sum(dbeta(u, a, b, log=TRUE))) else return(prod(dbeta(u, a, b, log=FALSE))) } ## tgp for LHS sampling library(tgp) rect <- matrix(rep(0:1, 4), ncol=2, byrow=2) ## training inputs ny <- 50; X <- lhs(ny, rect[1:2,]) ## computer model train ## true (but unknown) setting of the calibration parameter ## for the computer model u <- c(0.2, 0.1) Zu <- M(X, matrix(u, nrow=1)) ## field data response, biased and replicated sd <- 0.5 ## Y <- computer output + bias + noise reps <- 2 ## example from paper uses reps <- 10 Y <- rep(Zu,reps) + rep(bias(X),reps) + rnorm(reps*length(Zu), sd=sd) ## variations: remove the bias or change 2 to 1 to remove replicates ## computer model design nz <- 10000 XU <- lhs(nz, rect) nth <- 1 ## number of threads to use in emulation, demo uses 8 ## augment with physical model design points ## with various u settings XU2 <- matrix(NA, nrow=10*ny, ncol=4) for(i in 1:10) { I <- ((i-1)*ny+1):(ny*i) XU2[I,1:2] <- X } XU2[,3:4] <- lhs(10*ny, rect[3:4,]) XU <- rbind(XU, XU2) ## evaluate the computer model Z <- M(XU[,1:2], XU[,3:4]) ## flag indicating if estimating bias/discrepancy or not bias.est <- TRUE ## two passes of ALC with MLE calculations for aGP.seq methods <- rep("alcray", 2) ## demo uses rep("alc", 2) ## set up priors da <- d <- darg(NULL, XU) g <- garg(list(mle=TRUE), Y) ## end identical data generation code ## now calculate log posterior for true calibration parameter ## value (u). You could repeat this for an estimate value ## from demo("calib"), for example u.hat <- c(0.8236673, 0.1406989) fcalib(u, XU, Z, X, Y, da, d, g, beta.prior, methods, M, bias.est, nth) # }