fit_one_layer: MCMC sampling for one layer GP

Description

Conducts MCMC sampling of hyperparameters for a one-layer Gaussian process (GP). The length scale parameter theta governs the strength of the correlation, and the nugget parameter g governs noise. Under the Matern covariance, v governs smoothness.
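To make the roles of theta, g, and v concrete, the following base R sketch builds textbook versions of the two kernels on squared distances. The helper names (sq_dist, k_exp2, k_matern52) are hypothetical, and the way theta scales the squared distance is an assumption for illustration; the package's internal parameterization may differ.

```r
# Pairwise squared Euclidean distances between rows of x
sq_dist <- function(x) {
  x <- as.matrix(x)
  as.matrix(dist(x))^2
}

# Squared exponential kernel with nugget g added to the diagonal
# (assumed form: exp(-d^2 / theta))
k_exp2 <- function(x, theta, g) {
  d2 <- sq_dist(x)
  exp(-d2 / theta) + diag(g, nrow(d2))
}

# Matern kernel with smoothness v = 2.5 and nugget g on the diagonal
# (assumed form: theta plays the role of a squared length scale)
k_matern52 <- function(x, theta, g) {
  r <- sqrt(5 * sq_dist(x) / theta)
  (1 + r + r^2 / 3) * exp(-r) + diag(g, nrow(r))
}

x <- seq(0, 1, length.out = 5)
K_short <- k_matern52(x, theta = 0.01, g = 1e-6)
K_long  <- k_matern52(x, theta = 1, g = 1e-6)

# A larger theta yields stronger correlation between neighboring inputs
K_short[1, 2] < K_long[1, 2]
```

In this sketch, increasing g inflates only the diagonal, which is how a nugget represents noise at the observed locations.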

Usage

fit_one_layer(
  x,
  y,
  dydx = NULL,
  nmcmc = 10000,
  sep = FALSE,
  verb = TRUE,
  theta_0 = 0.01,
  g_0 = 0.001,
  true_g = NULL,
  v = 2.5,
  settings = NULL,
  cov = c("matern", "exp2"),
  vecchia = FALSE,
  m = NULL,
  ord = NULL,
  cores = NULL
)

Value

A list of S3 class gp or gpvec with elements:

x: copy of input matrix
y: copy of response vector
nmcmc: number of MCMC iterations
settings: copy of proposal/prior settings
v: copy of Matern smoothness parameter (v = 999 indicates cov = "exp2")
dydx: copy of dydx (if not NULL)
grad_indx: stacked partial derivative indices (only if dydx is provided)
g: vector of MCMC samples for g
theta: vector of MCMC samples for theta
tau2: vector of MLE estimates for tau2 (scale parameter)
x_approx: Vecchia approximation object (vecchia = TRUE only)
ll: vector of MVN log likelihood for each Gibbs iteration
time: computation time in seconds

Arguments

x: vector or matrix of input locations
y: vector of response values
dydx: optional matrix of observed gradients; rows correspond to x locations and each column contains the partial derivatives with respect to the matching input dimension (dim(dydx) must match dim(x))
nmcmc: number of MCMC iterations
sep: logical indicating whether to use separable (sep = TRUE) or isotropic (sep = FALSE) lengthscales
verb: logical indicating whether to print iteration progress
theta_0: initial value for theta
g_0: initial value for g (only used if true_g = NULL)
true_g: if the true nugget is known, it may be specified here (set to a small value to make the fit deterministic). Note: values that are too small may cause numerical issues in matrix inversions.
v: Matern smoothness parameter (only used if cov = "matern")
settings: hyperparameters for proposals and priors (see details)
cov: covariance kernel, either Matern ("matern") or squared exponential ("exp2")
vecchia: logical indicating whether to use Vecchia approximation
m: size of Vecchia conditioning sets, defaults to the lower of 25 or the maximum available (only used if vecchia = TRUE)
ord: optional ordering for Vecchia approximation, must correspond to rows of x, defaults to random
cores: number of cores to use for OpenMP parallelization (vecchia = TRUE only). Defaults to min(4, maxcores - 1) where maxcores is the number of detectable available cores.
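As an illustration of the layout expected for dydx, the sketch below builds a gradient matrix for a hypothetical two-dimensional test function with a known analytic gradient. The functions f and grad_f are invented for this example and are not part of the package.

```r
# Hypothetical two-dimensional test function and its analytic gradient
f <- function(x) sin(2 * pi * x[, 1]) + x[, 2]^2
grad_f <- function(x) cbind(2 * pi * cos(2 * pi * x[, 1]), 2 * x[, 2])

set.seed(1)
x <- matrix(runif(20), ncol = 2)  # 10 input locations in [0, 1]^2
y <- f(x)                         # response vector, one value per row of x
dydx <- grad_f(x)                 # row i holds the gradient at x[i, ];
                                  # column j is the partial derivative in dimension j

# The gradient matrix must have the same dimensions as the input matrix
stopifnot(all(dim(dydx) == dim(x)))
```

A matrix built this way has one row per input location and one column per input dimension, which is the shape the dydx argument describes.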

Details

Uses Metropolis-Hastings sampling of the length scale and nugget parameters, with proposals and priors controlled by settings. When true_g is set to a specific value, the nugget is fixed at that value and not estimated. When vecchia = TRUE, all calculations use the Vecchia approximation with the specified conditioning set size m.

NOTE on OpenMP: The Vecchia implementation relies on OpenMP parallelization for efficient computation. This function will produce a warning message if the package was installed without OpenMP (this is the default for CRAN packages installed on Apple machines). To set up OpenMP parallelization, download the package source code and install using the gcc/g++ compiler.
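The default core count described for the cores argument, min(4, maxcores - 1), can be mimicked in base R as a rough sketch. Using parallel::detectCores() to find maxcores, and guarding against a missing or single-core result, are assumptions of this example rather than a description of the package internals.

```r
# Number of detectable cores; detectCores() may return NA on some platforms
max_cores <- parallel::detectCores()
if (is.na(max_cores)) max_cores <- 1

# Leave one core free, cap at 4, and never drop below 1 (the floor is this
# sketch's own guard, not part of the documented default)
cores <- max(1, min(4, max_cores - 1))
cores
```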

Proposals for g and theta follow a uniform sliding window scheme, e.g.,

g_star <- runif(1, l * g_t / u, u * g_t / l),

with defaults l = 1 and u = 2 provided in settings. To adjust these, set settings = list(l = new_l, u = new_u).
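The sliding window scheme can be sketched as a small Metropolis-Hastings sampler in base R. The functions propose and mh_step and the Gamma(3, 2) toy target are hypothetical stand-ins chosen for illustration; this is not the package's internal sampler. Because the window width scales with the current value, the proposal is not symmetric, so the sketch includes the corresponding Hastings correction.

```r
# Uniform sliding-window proposal around the current value
propose <- function(current, l = 1, u = 2) {
  runif(1, min = l * current / u, max = u * current / l)
}

# One Metropolis-Hastings update for a positive parameter.
# log_post is any unnormalized log posterior (a toy one is used below).
mh_step <- function(current, log_post, l = 1, u = 2) {
  star <- propose(current, l, u)
  # Proposal density is 1 / (current * (u / l - l / u)), so the ratio
  # q(current | star) / q(star | current) reduces to current / star
  log_ratio <- log_post(star) - log_post(current) + log(current) - log(star)
  if (log(runif(1)) < log_ratio) star else current
}

# Toy target standing in for a posterior: Gamma(shape = 3, rate = 2)
log_post <- function(g) dgamma(g, shape = 3, rate = 2, log = TRUE)

set.seed(1)
n_iter <- 5000
samples <- numeric(n_iter)
samples[1] <- 1
for (t in 2:n_iter) samples[t] <- mh_step(samples[t - 1], log_post)

mean(samples)  # should land near the target mean of 3 / 2
```

With the defaults l = 1 and u = 2, each proposal falls between half and twice the current value, so proposed values stay positive.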

Priors on g and theta follow Gamma distributions with shape parameters (alpha) and rate parameters (beta) controlled within the settings list object. Default priors differ for noisy/deterministic settings. All default values are visible in the internal deepgp:::check_settings function. These priors are designed for x scaled to [0, 1] and y scaled to have mean 0 and variance 1. These may be adjusted using the settings input.
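Since the default priors assume scaled data, inputs and responses on other scales would typically be transformed before fitting. The following is a minimal base R sketch of one way to do that, using made-up data; the variable names are hypothetical.

```r
set.seed(1)
# Made-up raw data: two inputs on [-5, 10] and a response far from mean 0
x_raw <- matrix(runif(40, min = -5, max = 10), ncol = 2)
y_raw <- 100 + 20 * rnorm(20)

# Min-max scale each input column to [0, 1]
x_scaled <- apply(x_raw, 2, function(col) (col - min(col)) / (max(col) - min(col)))

# Center the response and scale it to unit variance
y_mean <- mean(y_raw)
y_sd <- sd(y_raw)
y_scaled <- (y_raw - y_mean) / y_sd

range(x_scaled)  # each column now spans [0, 1]
```

Keeping y_mean and y_sd allows predictions on the scaled response to be mapped back to the original units via y_scaled * y_sd + y_mean.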

The output object of class gp is designed for use with continue, trim, plot, and predict.

References

Sauer, A. (2023). Deep Gaussian process surrogates for computer experiments. *Ph.D. Dissertation, Department of Statistics, Virginia Polytechnic Institute and State University.*

Sauer, A., Gramacy, R.B., & Higdon, D. (2023). Active learning for deep Gaussian process surrogates. *Technometrics, 65,* 4-18. arXiv:2012.08015

Booth, A. S. (2025). Deep Gaussian processes with gradients. arXiv:2512.18066

Sauer, A., Cooper, A., & Gramacy, R. B. (2023). Vecchia-approximated deep Gaussian processes for computer experiments. *Journal of Computational and Graphical Statistics, 32*(3), 824-837. arXiv:2204.02904

Examples

# Additional examples including real-world computer experiments are available at: 
# https://bitbucket.org/gramacylab/deepgp-ex/
# \donttest{
# Booth function (inspired by the Higdon function)
f <- function(x) {
  i <- which(x <= 0.58)
  x[i] <- sin(pi * x[i] * 6) + cos(pi * x[i] * 12)
  x[-i] <- 5 * x[-i] - 4.9
  return(x)
}

# Training data
x <- seq(0, 1, length.out = 25)
y <- f(x)

# Testing data
xx <- seq(0, 1, length.out = 100)
yy <- f(xx)

plot(xx, yy, type = "l")
points(x, y, col = 2)

# Example 1: nugget fixed, calculating EI
fit <- fit_one_layer(x, y, nmcmc = 2000, true_g = 1e-6)
plot(fit)
fit <- trim(fit, 1000, 2)
fit <- predict(fit, xx, cores = 1, EI = TRUE)
plot(fit)
par(new = TRUE) # overlay EI
plot(xx[order(xx)], fit$EI[order(xx)], type = 'l', lty = 2, 
   axes = FALSE, xlab = '', ylab = '')

# Example 2: convert fit to Vecchia object before predicting
# (this is faster if the training data set is large)
fit <- to_vec(fit)
fit <- predict(fit, xx, cores = 1)
plot(fit)

# Example 3: using Vecchia for training and testing
fit <- fit_one_layer(x, y, nmcmc = 2000, true_g = 1e-6, vecchia = TRUE, m = 10)
plot(fit)
fit <- trim(fit, 1000, 2)
fit <- predict(fit, xx, cores = 1)
plot(fit)
# }