# btgp

0th

Percentile

##### Bayesian Nonparametric \& Nonstationary Regression Models

The seven functions described below implement Bayesian regression models of varying complexity: linear model, linear CART, Gaussian process (GP), GP with jumps to the limiting linear model (LLM), treed GP, and treed GP LLM.

Keywords
regression, nonparametric, smooth, tree, models, spatial , nonlinear , optimize
##### Usage
blm(X, Z, XX = NULL, meanfn = "linear", bprior = "bflat",
BTE = c(1000, 4000, 3), R = 1, m0r1 = TRUE, itemps = NULL,
pred.n = TRUE, krige = TRUE, zcov = FALSE, Ds2x = FALSE,
improv = FALSE, sens.p = NULL, trace = FALSE, verb = 1, ...)
btlm(X, Z, XX = NULL, meanfn = "linear", bprior = "bflat",
tree = c(0.5, 2), BTE = c(2000, 7000, 2), R = 1, m0r1 = TRUE,
itemps = NULL, pred.n = TRUE, krige = TRUE, zcov = FALSE,
Ds2x = FALSE, improv = FALSE, sens.p = NULL, trace = FALSE,
verb = 1, ...)
bcart(X, Z, XX = NULL, bprior = "bflat", tree = c(0.5, 2),
BTE = c(2000, 7000, 2), R = 1, m0r1 = TRUE, itemps = NULL,
pred.n = TRUE, krige = TRUE, zcov = FALSE, Ds2x = FALSE,
improv=FALSE, sens.p = NULL, trace = FALSE, verb = 1, ...)
bgp(X, Z, XX = NULL, meanfn = "linear", bprior = "bflat",
corr = "expsep", BTE = c(1000, 4000, 2), R = 1, m0r1 = TRUE,
itemps = NULL, pred.n = TRUE, krige = TRUE, zcov = FALSE,
Ds2x = FALSE, improv = FALSE, sens.p = NULL, nu = 1.5,
trace = FALSE, verb = 1, ...)
bgpllm(X, Z, XX = NULL, meanfn = "linear", bprior = "bflat",
corr = "expsep", gamma=c(10,0.2,0.7), BTE = c(1000, 4000, 2),
R = 1, m0r1 = TRUE, itemps = NULL, pred.n = TRUE,
krige = TRUE, zcov = FALSE, Ds2x = FALSE, improv = FALSE,
sens.p = NULL, nu = 1.5, trace = FALSE, verb = 1, ...)
btgp(X, Z, XX = NULL, meanfn = "linear", bprior = "bflat",
corr = "expsep", tree = c(0.5, 2), BTE = c(2000, 7000, 2),
R = 1, m0r1 = TRUE, linburn = FALSE, itemps = NULL,
pred.n = TRUE, krige = TRUE, zcov = FALSE, Ds2x = FALSE,
improv = FALSE, sens.p = NULL, nu = 1.5, trace = FALSE,
verb = 1, ...)
btgpllm(X, Z, XX = NULL, meanfn = "linear", bprior = "bflat",
corr = "expsep", tree = c(0.5, 2), gamma=c(10,0.2,0.7),
BTE = c(2000, 7000, 2), R = 1, m0r1 = TRUE, linburn = FALSE,
itemps = NULL, pred.n = TRUE, krige = TRUE, zcov = FALSE,
Ds2x = FALSE, improv = FALSE, sens.p = NULL, nu = 1.5,
trace = FALSE, verb = 1, ...)
##### Arguments
X

data.frame, matrix, or vector of inputs X

Z

Vector of output responses Z of length equal to the leading dimension (rows) of X, i.e., length(Z) == nrow(X)

XX

Optional data.frame, matrix, or vector of predictive input locations with the same number of columns as X, i.e., ncol(XX) == ncol(X)

meanfn

A choice of mean function for the process. When meanfn = "linear" (default), then we have the process $$Z = (\mathbf{1} \;\; \mathbf{X}) \mbox{\boldmath \beta} + W(\mathbf{X})$$ where $$W(\mathbf{X})$$ represents the Gaussian process part of the model (if present). Otherwise, when meanfn = "constant", then $$Z = \beta_0 + W(\mathbf{X})$$

bprior

Linear (beta) prior, default is "bflat"; alternates include "b0" hierarchical Normal prior, "bmle" empirical Bayes Normal prior, "b0not" Bayesian treed LM-style prior from Chipman et al. (same as "b0" but without tau2), "bmzt" a independent Normal prior (mean zero) with inverse-gamma variance (tau2), and "bmznot" is the same as "bmznot" without tau2. The default "bflat" gives an “improper” prior which can perform badly when the signal-to-noise ratio is low. In these cases the “proper” hierarchical specification "b0" or independent "bmzt" or "bmznot" priors may perform better

tree

a 2-vector containing the tree process prior parameterization c(alpha, beta) specifying $$p_{\mbox{\tiny split}}(\eta, \mathcal{T}) = \alpha*(1+\eta)^\beta$$ automatically giving zero probability to trees with partitions containing less than min(c(10,nrow(X)+1)) data points. You may also specify a longer vector, writing over more of the components of the $tree output from tgp.default.params gamma Limiting linear model parameters c(g, t1, t2), with growth parameter g > 0 minimum parameter t1 >= 0 and maximum parameter t1 >= 0, where t1 + t2 <= 1 specifies $$p(b|d)=t_1 +\exp\left\{\frac{-g(t_2-t_1)}{d-0.5}\right\}$$ corr Gaussian process correlation model. Choose between the isotropic power exponential family ("exp") or the separable power exponential family ("expsep", default); the current version also supports the isotropic Matern ("matern") and single-index Model ("sim") as “beta” functionality. BTE 3-vector of Monte-carlo parameters (B)urn in, (T)otal, and (E)very. Predictive samples are saved every E MCMC rounds starting at round B, stopping at T. R Number of repeats or restarts of BTE MCMC rounds, default R=1 is no restarts m0r1 If TRUE (default) the responses Z will be scaled to have a mean of zero and a range of 1 linburn If TRUE initializes MCMC with B (additional) rounds of Bayesian Linear CART (btlm); default is FALSE itemps Importance tempering (IT) inverse temperature ladder, or powers to improve mixing. See default.itemps. The default is no IT itemps = NULL pred.n TRUE (default) value results in prediction at the inputs X; FALSE skips prediction at X resulting in a faster implementation krige TRUE (default) value results in collection of kriging means and variances at predictive (and/or data) locations; FALSE skips the gathering of kriging statistics giving a savings in storage zcov If TRUE then the predictive covariance matrix is calculated-- can be computationally (and memory) intensive if X or XX is large. Otherwise only the variances (diagonal of covariance matrices) are calculated (default). See outputs Zp.s2, ZZ.s2, etc., below Ds2x TRUE results in ALC (Active Learning--Cohn) computation of expected reduction in uncertainty calculations at the XX locations, which can be used for adaptive sampling; FALSE (default) skips this computation, resulting in a faster implementation improv TRUE results in samples from the improvement at locations XX with respect to the observed data minimum. These samples are used to calculate the expected improvement over XX, as well as to rank all of the points in XX in the order that they should be sampled to minimize the expected multivariate improvement (refer to Schonlau et al, 1998). Alternatively, improv can be set to any positive integer 'g', in which case the ranking is performed with respect to the expectation for improvement raised to the power 'g'. Increasing 'g' leads to rankings that are more oriented towards a global optimization. The option FALSE (default) skips these computations, resulting in a faster implementation. Optionally, a two-vector can be supplied where improv is interpreted as the (maximum) number of points to rank by improvement. See the note below. If not specified, the entire XX matrix is ranked. sens.p Either NULL or a vector of parameters for sensitivity analysis, built by the function sens. Refer there for details nu “beta” functionality: fixed smoothness parameter for the Matern correlation function; nu + 0.5 times differentiable predictive surfaces result trace TRUE results in a saving of samples from the posterior distribution for most of the parameters in the model. The default is FALSE for speed/storage reasons. See note below verb Level of verbosity of R-console print statements: from 0 (none); 1 (default) which shows the “progress meter”; 2 includes an echo of initialization parameters; up to 3 and 4 (max) with more info about successful tree operations ... These ellipses arguments are interpreted as augmentations to the prior specification generated by params <- tgp.default.params(ncol(X)+1). You may use these to specify a custom setting of any of default parameters in the output list params except those for which a specific argument is already provided (e.g., params$corr or params$bprior) or those which contradict the type of b* function being called (e.g., params$tree or params$gamma); these redundant or possibly conflicting specifications will be ignored. Refer to tgp.default.params for details on the prior specification ##### Details The functions and their arguments can be categorized by whether or not they use treed partitioning (T), GP models, and jumps to the LLM (or LM)  blm LM Linear Model btlm T, LM Treed Linear Model bcart T Treed Constant Model bgp GP GP Regression bgpllm GP, LLM GP with jumps to the LLM btgp T, GP treed GP Regression Each function implements a special case of the generic function tgp which is an interface to C/C++ code for treed Gaussian process modeling of varying parameterization. Documentation for tgp has been declared redundant, and has subsequently been removed. To see how the b* functions use tgp simply examine the function. In the latest version, with the addition of the ellipses “...” argument, there is nothing that can be done with the direct tgp function that cannot also be done with a b* function Only functions in the T (tree) category take the tree argument; GP category functions take the corr argument; and LLM category functions take the gamma argument. Non-tree class functions omit the parts output, see below bcart is the same as btlm except that only the intercept term in the LM is estimated; the others are zero, thereby implementing a Bayesian version of the original CART model The sens.p argument contains a vector of parameters for sensitivity analysis. It should be NULL unless created by the sens function. Refer to help(sens) for details. If itemps =! NULL then importance tempering (IT) is performed to get better mixing. After each restart (when R > 1) the observation counts are used to update the pseudo-prior. Stochastic approximation is performed in the first burn-in rounds (for B-T rounds, not B) when c0 and n0 are positive. Every subsequent burn-in after the first restart is for B rounds in order to settle-in after using the observation counts. See default.itemps for more details and an example Please see vignette("tgp") for a detailed illustration ##### Value bgp returns an object of class "tgp". The function plot.tgp can be used to help visualize results. An object of class "tgp" is a list containing at least the following components... The parts output is unique to the T (tree) category functions. Tree viewing is supported by tgp.trees X Input argument: data.frame of inputs X n Number of rows in X, i.e., nrow(X) d Number of cols in X, i.e., ncol(X) Z Vector of output responses Z XX Input argument: data.frame of predictive locations XX nn Number of rows in XX, i.e., nrow(XX) BTE Input argument: Monte-carlo parameters R Input argument: restarts linburn Input argument: initialize MCMC with linear CART params list of model parameters generated by tgp.default.params and subsequently modified according to the calling b* function and its arguments dparams Double-representation of model input parameters used by the C-code itemps data.frame containing the importance tempering ladders and pseudo-prior: $k has inverse inverse temperatures (from the input argument), $k has an updated pseudo-prior based on observation counts and (possibly) stochastic approximation during burn-in and (input) stochastic approximation parameters $$c_0$$ and $$n_0$$. See default.itemps for more info Zp.mean Vector of mean predictive estimates at X locations Zp.q1 Vector of 5% predictive quantiles at X locations Zp.q2 Vector of 95% predictive quantiles at X locations Zp.q Vector of quantile norms Zp.q2-Zp.q1 Zp.s2 If input zcov = TRUE, then this is a predictive covariance matrix for the inputs at locations X; otherwise then this is a vector of predictive variances at the X locations (diagonal of the predictive covariance matrix). Only appears when input pred.n = TRUE Zp.km Vector of (expected) kriging means at X locations Zp.vark Vector of posterior variance for kriging surface (no additive noise) at X locations Zp.ks2 Vector of (expected) predictive kriging variances at X locations ZZ.mean Vector of mean predictive estimates at XX locations ZZ.q1 Vector of 5% predictive quantiles at XX locations ZZ.q2 Vector of 95% predictive quantiles at XX locations ZZ.q Vector of quantile norms ZZ.q2-ZZ.q1, used by the ALM adaptive sampling algorithm ZZ.s2 If input zcov = TRUE, then this is a predictive covariance matrix for predictive locations XX; otherwise then this is a vector of predictive variances at the XX locations (diagonal of the predictive covariance matrix). Only appears when input XX != NULL ZpZZ.s2 If input zcov = TRUE, then this is a predictive n * nn covariance matrix between locations in X and XX; Only appears when zcov = TRUE and both pred.n = TRUE and XX != NULL ZZ.km Vector of (expected) kriging means at XX locations ZZ.vark Vector of posterior variance for kriging surface (no additive noise) at XX locations ZZ.ks2 Vector of (expected) predictive kriging variances at XX locations Ds2x If argument Ds2x=TRUE, this vector contains ALC statistics for XX locations improv If argument improv is TRUE or a positive integer, this is a 'matrix' with first column set to the expected improvement statistics for XX locations, and the second column set to a ranking in the order that they should be sampled to minimize the expected multivariate improvement raised to a power determined by the argument improv response Name of response Z if supplied by data.frame in argument, or "z" if none provided parts Internal representation of the regions depicted by partitions of the maximum a' posteriori (MAP) tree trees list of trees (maptree representation) which were MAP as a function of each tree height sampled between MCMC rounds B and T trace If trace==TRUE, this list contains traces of most of the model parameters and posterior predictive distributions at input locations XX. Otherwise the entry is FALSE. See note below ess Importance tempering effective sample size (ESS). If itemps==NULL this corresponds to the total number of samples collected, i.e.. R*(BTE-BTE)/BTE. Otherwise the ESS will be lower due to a non-zero coefficient of variation of the calculated importance tempering weights sens See sens documentation for more details ##### Note Inputs X, XX, Z containing NaN, NA, or Inf are discarded with non-fatal warnings Upon execution, MCMC reports are made every 1,000 rounds to indicate progress Stationary (non-treed) processes on larger inputs (e.g., X,Z) of size greater than 500, *might* be slow in execution, especially on older machines. Once the C code starts executing, it can be interrupted in the usual way: either via Ctrl-C (Unix-alikes) or pressing the Stop button in the R-GUI. When this happens, interrupt messages will indicate which required cleanup measures completed before returning control to R. Whereas most of the tgp models will work reasonably well with little or no change to the default prior specification, GP's with the "mrexpsep" correlation imply a very specific relationship between fine and coarse data, and a careful prior specification is usually required. The ranks provided in the second column of the improv field of a tgp object are based on the expectation of a multivariate improvement that may or may not be raised to a positive integer power. They can thus differ significantly from a simple ranking of the first column of expected univariate improvement values. Regarding trace=TRUE: Samples from the posterior will be collected for all parameters in the model. GP parameters are collected with reference to the locations in XX, resulting nn=nrow{XX} traces of d,g,s2,tau2, etc. Therefore, it is recommended that nn is chosen to be a small, representative, set of input locations. Besides GP parameters, traces are saved for the tree partitions, areas under the LLM, log posterior (as a function of tree height), and samples from the posterior predictive distributions. Note that since some traces are stored in files, multiple tgp/R sessions in the same working directory can clobber the trace files of other sessions ##### References Gramacy, R. B. (2020) Surrogates: Gaussian Process Modeling, Design and Optimization for the Applied Sciences. Boca Raton, Florida: Chapman Hall/CRC. (See Chapter 9.) https://bobby.gramacy.com/surrogates/ Gramacy, R. B. (2008). tgp: An R Package for Bayesian Nonstationary, Semiparametric Nonlinear Regression and Design by Treed Gaussian Process Models. Journal of Statistical Software, 19(9). https://www.jstatsoft.org/v19/i09 Robert B. Gramacy, Matthew Taddy (2010). Categorical Inputs, Sensitivity Analysis, Optimization and Importance Tempering with tgp Version 2, an R Package for Treed Gaussian Process Models. Journal of Statistical Software, 33(6), 1--48. https://www.jstatsoft.org/v33/i06/. Gramacy, R. B., Lee, H. K. H. (2007). Bayesian treed Gaussian process models with an application to computer modeling. Journal of the American Statistical Association, 103(483), pp. 1119-1130. Also available as ArXiv article 0710.4536 https://arxiv.org/abs/0710.4536 Gramacy, R. B. and Lee, K.H. (2008). Gaussian Processes and Limiting Linear Models. Computational Statistics and Data Analysis, 53, pp. 123-136. Also available as ArXiv article 0804.4685 https://arxiv.org/abs/0804.4685 Gramacy, R. B., Lee, H. K. H. (2009). Adaptive design and analysis of supercomputer experiments. Technometrics, 51(2), pp. 130-145. Also avaliable on ArXiv article 0805.4359 https://arxiv.org/abs/0805.4359 Robert B. Gramacy, Heng Lian (2011). Gaussian process single-index models as emulators for computer experiments. Available as ArXiv article 1009.4241 https://arxiv.org/abs/1009.4241 Chipman, H., George, E., \& McCulloch, R. (1998). Bayesian CART model search (with discussion). Journal of the American Statistical Association, 93, 935--960. Chipman, H., George, E., \& McCulloch, R. (2002). Bayesian treed models. Machine Learning, 48, 303--324. M. Schonlau and Jones, D.R. and Welch, W.J. (1998). Global versus local search in constrained optimization of computer models. In "New Developments and applications in experimental design", IMS Lecture Notes - Monograph Series 34. 11--25. https://bobby.gramacy.com/r_packages/tgp/ ##### See Also plot.tgp, tgp.trees, predict.tgp, sens, default.itemps ##### Aliases • blm • btlm • bcart • bgp • bgpllm • btgp • btgpllm ##### Examples # NOT RUN { ## ## Many of the examples below illustrate the above ## function(s) on random data. Thus it can be fun ## (and informative) to run them several times. ## # # simple linear response # # input and predictive data X <- seq(0,1,length=50) XX <- seq(0,1,length=99) Z <- 1 + 2*X + rnorm(length(X),sd=0.25) out <- blm(X=X, Z=Z, XX=XX) # try Linear Model plot(out) # plot the surface # # 1-d Example # # construct some 1-d nonstationary data X <- seq(0,20,length=100) XX <- seq(0,20,length=99) Z <- (sin(pi*X/5) + 0.2*cos(4*pi*X/5)) * (X <= 9.6) lin <- X>9.6; Z[lin] <- -1 + X[lin]/10 Z <- Z + rnorm(length(Z), sd=0.1) out <- btlm(X=X, Z=Z, XX=XX) # try Linear CART plot(out) # plot the surface tgp.trees(out) # plot the MAP trees out <- btgp(X=X, Z=Z, XX=XX) # use a treed GP plot(out) # plot the surface tgp.trees(out) # plot the MAP trees # # 2-d example # (using the isotropic correlation function) # # construct some 2-d nonstationary data exp2d.data <- exp2d.rand() X <- exp2d.data$X; Z <- exp2d.data$Z XX <- exp2d.data$XX

# try a GP
out <- bgp(X=X, Z=Z, XX=XX, corr="exp")
plot(out) 			# plot the surface

# try a treed GP LLM
out <- btgpllm(X=X, Z=Z, XX=XX, corr="exp")
plot(out) 			# plot the surface
tgp.trees(out) 		 	# plot the MAP trees

#
# Motorcycle Accident Data
#

# get the data
require(MASS)

# try a GP
out <- bgp(X=mcycle[,1], Z=mcycle[,2])
plot(out)			# plot the surface

# try a treed GP LLM
# best to use the "b0" beta linear prior to capture common
# common linear process throughout all regions (using the
# ellipses "...")
out <- btgpllm(X=mcycle[,1], Z=mcycle[,2], bprior="b0")
plot(out)			# plot the surface
tgp.trees(out)		 	# plot the MAP trees
# }

Documentation reproduced from package tgp, version 2.4-17, License: LGPL

### Community examples

Looks like there are no examples yet.