btgp: One of Six Bayesian Nonparametric & Nonstationary Regression Models

Description

The six functions described below implement Bayesian regression models of varying complexity: linear model, linear CART, Gaussian process (GP), GP with jumps to the limiting linear model (LLM), treed GP, and treed GP LLM. They provide a streamlined interface to the tgp function, of which each function here is a special case.

Usage

blm(X, Z, XX = NULL, bprior = "bflat", BTE = c(1000, 4000, 3), 
	R = 1, m0r1 = FALSE, pred.n = TRUE, ds2x = FALSE,
        ego = FALSE, traces = FALSE, verb = 1)
btlm(X, Z, XX = NULL, bprior = "bflat", tree = c(0.25, 2, 10), 
	BTE = c(2000, 7000, 2), R = 1, m0r1 = FALSE, 
	pred.n = TRUE, ds2x = FALSE, ego=FALSE, traces = FALSE,
        verb = 1)
bgp(X, Z, XX = NULL, bprior = "bflat", corr = "expsep", 
	BTE = c(1000, 4000, 2), R = 1, m0r1 = FALSE, 
	pred.n = TRUE, ds2x = FALSE, ego = FALSE, nu = 1.5,
        traces = FALSE, verb = 1)
bgpllm(X, Z, XX = NULL, bprior = "bflat", corr = "expsep", 
	gamma=c(10,0.2,0.7), BTE = c(1000, 4000, 2), R = 1, 
	m0r1 = FALSE, pred.n = TRUE, ds2x = FALSE,
        ego = FALSE, nu = 1.5, traces = FALSE, verb = 1)
btgp(X, Z, XX = NULL, bprior = "bflat", corr = "expsep", 
	tree = c(0.25, 2, 10), BTE = c(2000, 7000, 2), R = 1, 
	m0r1 = FALSE, linburn = FALSE, pred.n = TRUE, 
	ds2x = FALSE, ego = FALSE, nu = 1.5, traces = FALSE,
        verb = 1)
btgpllm(X, Z, XX = NULL, bprior = "bflat", corr = "expsep", 
	tree = c(0.25, 2, 10), gamma=c(10,0.2,0.7), 
	BTE = c(2000, 7000, 2), R = 1, m0r1 = FALSE, 
	linburn = FALSE, pred.n = TRUE, ds2x = FALSE,
        ego = FALSE, nu = 1.5, traces = FALSE, verb = 1)

Arguments

X

data.frame, matrix, or vector of inputs X

Z

Vector of output responses Z of length equal to the leading dimension (rows) of X, i.e., length(Z) == dim(X)[1]

XX

Optional data.frame, matrix, or vector of predictive input locations with the same number of columns as X, i.e., dim(XX)[2] == dim(X)[2]

bprior

Linear (beta) prior, default is "bflat"; alternates include "b0" hierarchical Normal prior, "bmle" empirical Bayes Normal prior, "bcart" Bayesian linear CART style prior from Chipman et a

tree

3-vector of tree process prior parameterization c(alpha, beta, nmin) specifying $$p_{\mbox{\tiny split}}(\eta, \mathcal{T}) = \alpha*(1+\eta)^\beta$$ giving zero probability to trees with partitions containing less than

gamma

Limiting linear model parameters c(g, t1, t2), with growth parameter g > 0, minimum parameter t1 >= 0, and maximum parameter t2 >= 0, where t1 + t2 <= 1 specifies
	$$p(b|d)=t_1 +
	  \e

corr

Gaussian process correlation model. Choose between the isotropic power exponential family ("exp") or the separable power exponential family ("expsep", default); the current version also supports the isotropic Matern (

BTE

3-vector of Monte Carlo parameters (B)urn-in, (T)otal, and (E)very. Predictive samples are saved every E MCMC rounds, starting at round B and stopping at round T.

R

Number of repeats or restarts of BTE MCMC rounds; the default R=1 means no restarts

m0r1

If TRUE the responses Z will be scaled to have a mean of zero and a range of 1; default is FALSE

linburn

If TRUE initializes MCMC with B (additional) rounds of Bayesian Linear CART (btlm); default is FALSE

pred.n

TRUE (default) value results in prediction at the inputs X; FALSE skips prediction at X resulting in a faster implementation

ds2x

TRUE results in ALC (Active Learning--Cohn) computation of expected reduction in uncertainty calculations at the X locations, which can be used for adaptive sampling; FALSE (default) skips this computatio

ego

TRUE results in EGO (Expected Global Optimization) computation of expected information about the location of the minimum reduction in uncertainty calculations at the XX locations, which can be used for adaptive sampli

nu

beta functionality: fixed smoothness parameter for the Matern correlation function; nu+0.5 times differentiable

traces

TRUE results in a saving of samples from the posterior distribution for most of the parameters in the model. The default is FALSE for speed/storage reasons. See note below

verb

Level of verbosity of R-console print statements: from 0 (none); 1 (default), which shows the progress meter; 2, which adds an echo of initialization parameters; up to 3 and 4 (max), which give more information about successful tree operations
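
The shape, thinning, and scaling conventions described above can be checked in base R before calling any of the six functions. The following is only a sketch: the helpers check_inputs, n_saved, and scale_m0r1 are hypothetical and not part of tgp, the sample count ignores whatever endpoint convention the underlying C code uses, and the exact form of the m0r1 scaling (subtract the mean, then divide by the range) is an assumption.

```r
# Hypothetical helpers (not part of tgp) illustrating the argument conventions.

# X and XX may be vectors, matrices, or data.frames; a vector is treated
# as a single input column.
check_inputs <- function(X, Z, XX = NULL) {
  X <- as.matrix(X)
  stopifnot(length(Z) == nrow(X))      # one response per row of X
  if (!is.null(XX)) {
    XX <- as.matrix(XX)
    stopifnot(ncol(XX) == ncol(X))     # same input dimension as X
  }
  invisible(TRUE)
}

# Approximate number of saved predictive samples for BTE = c(B, T, E),
# repeated over R restarts (assumes simple thinning by E between B and T).
n_saved <- function(BTE, R = 1) {
  R * floor((BTE[2] - BTE[1]) / BTE[3])
}

# Scale responses to mean zero and range one, as described for m0r1 = TRUE
# (assumed form: subtract the mean, then divide by the range).
scale_m0r1 <- function(Z) {
  (Z - mean(Z)) / diff(range(Z))
}

X  <- seq(0, 1, length.out = 50)
XX <- seq(0, 1, length.out = 99)
Z  <- 1 + 2 * X

check_inputs(X, Z, XX)
n_saved(c(2000, 7000, 2))   # 2500 under the assumption above
Zs <- scale_m0r1(Z)
c(mean = mean(Zs), range = diff(range(Zs)))
```

With the btgp default BTE = c(2000, 7000, 2), this gives roughly 2500 saved samples per restart, which is a convenient way to judge how much output a larger T or smaller E will produce.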

Value

bgp returns an object of class "tgp". The function plot.tgp can be used to help visualize results.
An object of class "tgp" is a list containing at least the following components. The final two (parts and trees) are tree-related outputs unique to the T (tree) category functions. Tree viewing is supported by tgp.trees.

state

unsigned short[3] random number seed to C

X

Input argument: data.frame of inputs X

n

Number of rows in X, i.e., dim(X)[1]

d

Number of columns in X, i.e., dim(X)[2]

Z

Vector of output responses Z

XX

Input argument: data.frame of predictive locations XX

nn

Number of rows in XX, i.e., dim(XX)[1]

BTE

Input argument: Monte Carlo parameters

R

Input argument: restarts

linburn

Input argument: initialize MCMC with linear CART

params

list of model parameters generated by tgp.default.params and passed to tgp

dparams

Double-representation of model input parameters used by the C code

Zp.mean

Vector of mean predictive estimates at X locations

Zp.q1

Vector of 5% predictive quantiles at X locations

Zp.q2

Vector of 95% predictive quantiles at X locations

Zp.q

Vector of quantile norms Zp.q2-Zp.q1

ZZ.q1

Vector of 5% predictive quantiles at XX locations

ZZ.q2

Vector of 95% predictive quantiles at XX locations

ZZ.q

Vector of quantile norms ZZ.q2-ZZ.q1, used by the Active Learning--MacKay (ALM) adaptive sampling algorithm

Ds2x

If argument ds2x=TRUE, this vector contains ALC statistics for XX locations

ego

If argument ego=TRUE, this vector contains EGO statistics for XX locations

response

Name of response Z if supplied by data.frame in argument, or "z" if none provided

parts

Internal representation of the regions depicted by partitions of the maximum a posteriori (MAP) tree

trees

list of trees (maptree representation) which were MAP as a function of each tree height sampled between MCMC rounds B and T

traces

If input argument traces=TRUE, this is a list containing traces of most of the model parameters and posterior predictive distributions at input locations XX. Otherwise the entry is FALSE. See note below

verb

Input argument: verbosity level
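
As a rough illustration of how ZZ.q can drive ALM adaptive sampling, the sketch below picks the predictive location with the widest gap between the 5% and 95% quantiles. The list fit is a hand-made stand-in with made-up numbers, not real output from any of the six functions, and the helper alm_pick is hypothetical.

```r
# Stand-in for the relevant components of a "tgp" object (values are made up).
fit <- list(
  XX    = data.frame(x = c(0.0, 0.5, 1.0)),
  ZZ.q1 = c(-0.2, -0.9, 0.1),
  ZZ.q2 = c( 0.3,  0.8, 0.4)
)
fit$ZZ.q <- fit$ZZ.q2 - fit$ZZ.q1   # quantile norm, as defined in the list above

# Hypothetical helper: return the candidate location with the largest ZZ.q,
# i.e. the one where the predictive distribution is widest.
alm_pick <- function(obj) {
  i <- which.max(obj$ZZ.q)
  obj$XX[i, , drop = FALSE]
}

alm_pick(fit)
```

With a real fit, the same indexing would be applied to the object returned by, for example, btgp, provided XX was supplied in the call.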

Details

The functions and their arguments can be categorized by whether or not they use treed partitioning (T), GP models, and jumps to the LLM:

Function   Category     Model
blm        -            Linear Model
btlm       T            Linear CART
bgp        GP           GP Regression
bgpllm     GP, LLM      GP with jumps to the LLM
btgp       T, GP        treed GP Regression
btgpllm    T, GP, LLM   treed GP with jumps to the LLM

Each function implements a special case of the generic function tgp, which is an interface to C/C++ code for treed Gaussian process modeling of varying parameterization. For each of the examples below, see help(tgp) for the direct tgp implementation. Only functions in the T (tree) category take the tree argument; GP category functions take the corr argument; and LLM category functions take the gamma argument. Non-tree functions omit the parts and trees outputs (see Value). Please see vignette("tgp") for a detailed illustration.
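
The rule about which category-specific arguments each function accepts can be restated as a small lookup. This is a sketch for orientation only: the objects categories and category_arg and the helper extra_args are hypothetical, are not part of tgp, and cover only the tree, corr, and gamma arguments.

```r
# Hypothetical lookup table restating the categories above (not part of tgp).
categories <- list(
  blm     = character(0),
  btlm    = "T",
  bgp     = "GP",
  bgpllm  = c("GP", "LLM"),
  btgp    = c("T", "GP"),
  btgpllm = c("T", "GP", "LLM")
)

# Category-specific argument associated with each category.
category_arg <- c(T = "tree", GP = "corr", LLM = "gamma")

# Which of tree, corr, and gamma does a given function accept?
extra_args <- function(fname) {
  unname(category_arg[categories[[fname]]])
}

extra_args("btgpllm")   # "tree" "corr" "gamma"
extra_args("blm")       # character(0)
```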

References

Gramacy, R. B., Lee, H. K. H. (2006). Bayesian treed Gaussian process models. Available as UCSC Technical Report ams2006-01.

Gramacy, R. B., Lee, H. K. H. (2006). Adaptive design of supercomputer experiments. Available as UCSC Technical Report ams2006-02.

Chipman, H., George, E., & McCulloch, R. (1998). Bayesian CART model search (with discussion). Journal of the American Statistical Association, 93, 935--960.

Chipman, H., George, E., & McCulloch, R. (2002). Bayesian treed models. Machine Learning, 48, 303--324.

http://www.ams.ucsc.edu/~rbgramacy/tgp.html

Examples

##
## Many of the examples below illustrate the above 
## function(s) on random data.  Thus it can be fun
## (and informative) to run them several times.
##

# 
# simple linear response
#

# input and predictive data
X <- seq(0,1,length=50)
XX <- seq(0,1,length=99)
Z <- 1 + 2*X + rnorm(length(X),sd=0.25)

out <- blm(X=X, Z=Z, XX=XX)	# try Linear Model
plot(out)			# plot the surface

#
# 1-d Example
# 

# construct some 1-d nonstationary data
X <- seq(0,20,length=100)
XX <- seq(0,20,length=99)
Z <- (sin(pi*X/5) + 0.2*cos(4*pi*X/5)) * (X <= 9.6)
lin <- X > 9.6			# indices of the linear region
Z[lin] <- -1 + X[lin]/10
Z <- Z + rnorm(length(Z), sd=0.1)

out <- btlm(X=X, Z=Z, XX=XX) 	# try Linear CART
plot(out) 			# plot the surface
tgp.trees(out) 		 	# plot the MAP trees

out <- btgp(X=X, Z=Z, XX=XX) 	# use a treed GP
plot(out) 			# plot the surface
tgp.trees(out) 		 	# plot the MAP trees


#
# 2-d example
# (using the isotropic correlation function)
#

# construct some 2-d nonstationary data
exp2d.data <- exp2d.rand()
X <- exp2d.data$X; Z <- exp2d.data$Z
XX <- exp2d.data$XX

# try a GP
out <- bgp(X=X, Z=Z, XX=XX, corr="exp") 	
plot(out) 			# plot the surface

# try a treed GP LLM
out <- btgpllm(X=X, Z=Z, XX=XX, corr="exp") 
plot(out) 			# plot the surface
tgp.trees(out) 		 	# plot the MAP trees


#
# Motorcycle Accident Data
#

# get the data
# and scale the response to zero mean and a range of 1 (m0r1)
require(MASS)

# try a GP 
out <- bgp(X=mcycle[,1], Z=mcycle[,2], m0r1=TRUE)
plot(out)			# plot the surface

# try a treed GP LLM
# best to use the "b0" beta linear prior to capture a
# common linear process throughout all regions
out <- btgpllm(X=mcycle[,1], Z=mcycle[,2], bprior="b0", 
	       m0r1=TRUE)
plot(out)			# plot the surface
tgp.trees(out)		 	# plot the MAP trees

# Actually, instead of using m0r1, the mcycle data is best fit
# using a mixture prior for the nugget, due to its
# input-dependent noise.  See the examples for the tgp function

# for other examples try the demos or the vignette