exp2d.rand: Random 2-d Exponential Data

Description

A random subsample of data(exp2d), or Latin Hypercube sampled data evaluated with exp2d.Z

Usage

exp2d.rand(n1 = 50, n2 = 30, lh = NULL, dopt = 1)

Arguments

n1

Number of samples from the first, interesting, quadrant

n2

Number of samples from the other three, uninteresting, quadrants

lh

If !is.null(lh) then Latin Hypercube (LH) sampling (lhs) is used instead of subsampling from data(exp2d). lh should be either a single nonnegative integer specifying the desired number of predictive locations, XX, or a vector of length 4 specifying the number of predictive locations desired from each of the four quadrants (interesting quadrant first, then counter-clockwise)

dopt

If dopt >= 2 then d-optimal subsampling from LH candidates of the multiple indicated by the value of dopt will be used. This argument only makes sense when !is.null(lh)

Value

Output is a list with entries:

X

2-d data.frame with n1 + n2 input locations

Z

Numeric vector describing the responses (with noise) at the X input locations

Ztrue

Numeric vector describing the true responses (without noise) at the X input locations

XX

2-d data.frame containing the remaining 441 - (n1 + n2) input locations

ZZ

Numeric vector describing the responses (with noise) at the XX predictive locations

ZZtrue

Numeric vector describing the responses (without noise) at the XX predictive locations

Details

When is.null(lh), data are subsampled without replacement from data(exp2d). Of the n1 + n2 <= 441 input/response pairs X,Z, n1 are taken from the first quadrant, i.e., where the response is interesting, and the remaining n2 are taken from the other three quadrants. The remaining 441 - (n1 + n2) are treated as predictive locations
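The subsampling step described above can be sketched in base R. This is only an illustration, not the package's code: it assumes that a regular 21 x 21 grid on [-2, 6]^2 can stand in for data(exp2d), and that the first quadrant is the region where both inputs are at most 2. The names grid, first, other and s are hypothetical.

```r
# Sketch only: a 21 x 21 grid on [-2, 6]^2 stands in for data(exp2d) (assumption)
set.seed(1)
g <- seq(-2, 6, length.out = 21)
grid <- expand.grid(X1 = g, X2 = g)   # 441 candidate locations

n1 <- 50
n2 <- 30

# assumed definition of the first (interesting) quadrant: both inputs <= 2
# (a small tolerance guards against floating-point error in the grid values)
first <- which(grid$X1 <= 2 + 1e-8 & grid$X2 <= 2 + 1e-8)
other <- setdiff(seq_len(nrow(grid)), first)

# sample without replacement: n1 from the first quadrant, n2 from the rest
s <- c(sample(first, n1), sample(other, n2))

X  <- grid[s, ]    # n1 + n2 input locations
XX <- grid[-s, ]   # remaining 441 - (n1 + n2) predictive locations
```

With the defaults n1 = 50 and n2 = 30, this leaves 441 - 80 = 361 rows in XX.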

Otherwise, when !is.null(lh), Latin Hypercube Sampling (lhs) is used
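For readers unfamiliar with Latin Hypercube sampling, the idea can be sketched in a few lines of base R. The helper lhs_sketch below is hypothetical and is not the lhs function referred to above; it assumes a rectangle given as a matrix with one row per dimension and columns (lower, upper), and places exactly one point in each of n equal-width strata along every dimension.

```r
# Hypothetical helper, not part of tgp: a minimal Latin Hypercube sampler
lhs_sketch <- function(n, rect) {
  d <- nrow(rect)
  # per dimension: a random permutation of the n strata, jittered within each stratum
  u <- vapply(seq_len(d), function(j) (sample(n) - runif(n)) / n, numeric(n))
  u <- matrix(u, nrow = n, ncol = d)   # keeps the matrix shape when n == 1
  # rescale from the unit cube to the requested rectangle
  u <- sweep(u, 2, rect[, 2] - rect[, 1], "*")
  sweep(u, 2, rect[, 1], "+")
}

set.seed(1)
rect <- rbind(c(-2, 2), c(-2, 2))   # assumed bounds of the first quadrant
X <- lhs_sketch(20, rect)           # 20 x 2 matrix of LH-sampled inputs
```

Sampling each quadrant with its own rectangle, as in this sketch, is one way to obtain a different number of points per quadrant.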

If dopt >= 2 then n1*dopt LH candidates are used to get a D-optimal subsample of size n1 from the first (interesting) quadrant. Similarly, n2*dopt candidates are used in the rest of the uninteresting region. A total of lh*dopt candidates will be used for sequential D-optimal subsampling for predictive locations XX in all four quadrants, assuming the already-sampled X locations will be in the design.
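The candidate-then-subsample idea can be illustrated as follows. This is a rough sketch under several assumptions, not the package's algorithm: it swaps in a simple greedy search, uses a Gaussian correlation with an assumed range parameter ell, and draws uniform candidates rather than LH candidates to stay self-contained. The helper greedy_dopt is hypothetical.

```r
# Hypothetical helper: greedily pick n rows of Xcand that maximize the
# log-determinant of an assumed Gaussian correlation matrix
greedy_dopt <- function(Xcand, n, ell = 1, nugget = 1e-6) {
  D2 <- as.matrix(dist(Xcand))^2
  K <- exp(-D2 / ell) + diag(nugget, nrow(Xcand))
  chosen <- integer(0)
  for (k in seq_len(n)) {
    rest <- setdiff(seq_len(nrow(Xcand)), chosen)
    # log-determinant of the design after adding each remaining candidate
    logdet <- vapply(rest, function(i) {
      idx <- c(chosen, i)
      as.numeric(determinant(K[idx, idx, drop = FALSE], logarithm = TRUE)$modulus)
    }, numeric(1))
    chosen <- c(chosen, rest[which.max(logdet)])
  }
  chosen
}

set.seed(1)
n1 <- 10
dopt <- 5
# n1 * dopt candidates in the (assumed) first quadrant [-2, 2]^2
Xcand <- matrix(runif(2 * n1 * dopt, min = -2, max = 2), ncol = 2)
idx <- greedy_dopt(Xcand, n1)
Xd <- Xcand[idx, ]   # subsample of size n1
```

A larger dopt gives the search more candidates to choose from, at the cost of more determinant evaluations.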

In all three cases, the response is evaluated as $$Z(X)=x_1 * \exp(-x_1^2-x_2^2),$$ thus creating the outputs Ztrue and ZZtrue. Zero-mean normal noise with sd=0.001 is added to the responses Z and ZZ
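The response evaluation can be sketched in base R as a stand-in for exp2d.Z. The function name exp2d_true is hypothetical, and the sketch assumes the response x1 * exp(-x1^2 - x2^2) with independent normal noise of sd = 0.001.

```r
# Hypothetical stand-in for the true (noise-free) response
exp2d_true <- function(X) {
  X <- as.matrix(X)
  X[, 1] * exp(-X[, 1]^2 - X[, 2]^2)
}

set.seed(1)
X <- cbind(c(0, 1, -1), c(0, 0, 0))   # three illustrative input locations
Ztrue <- exp2d_true(X)                # 0, exp(-1), -exp(-1)

# noisy responses: zero-mean normal noise with sd = 0.001
Z <- Ztrue + rnorm(length(Ztrue), mean = 0, sd = 0.001)
```

The response is odd in x1, so it has one peak and one trough near the origin and decays to zero elsewhere, which is why only one quadrant is described as interesting.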

References

Gramacy, R. B. (2007). tgp: An R Package for Bayesian Nonstationary, Semiparametric Nonlinear Regression and Design by Treed Gaussian Process Models. Journal of Statistical Software, 19(9). https://www.jstatsoft.org/v19/i09

Gramacy, R. B., Lee, H. K. H. (2008). Bayesian treed Gaussian process models with an application to computer modeling. Journal of the American Statistical Association, 103(483), pp. 1119-1130. Also available as ArXiv article 0710.4536 https://arxiv.org/abs/0710.4536

https://bobby.gramacy.com/r_packages/tgp/

Examples

# NOT RUN {
library(tgp)  # provides exp2d.rand and interp.loess

## randomly subsampled data
## ------------------------

eds <- exp2d.rand()

# higher span = 0.5 required because the data is sparse
# and was generated randomly
eds.g <- interp.loess(eds$X[,1], eds$X[,2], eds$Z, span=0.5)

# perspective plot, and plot of the input (X & XX) locations
par(mfrow=c(1,2), bty="n")
persp(eds.g, main="loess surface", theta=-30, phi=20,
      xlab="X[,1]", ylab="X[,2]", zlab="Z")
plot(eds$X, main="Randomly Subsampled Inputs")
points(eds$XX, pch=19, cex=0.5)

## Latin Hypercube sampled data
## ----------------------------

edlh <- exp2d.rand(lh=c(20, 15, 10, 5))

# higher span = 0.5 required because the data is sparse
# and was generated randomly
edlh.g <- interp.loess(edlh$X[,1], edlh$X[,2], edlh$Z, span=0.5)

# perspective plot, and plot of the input (X & XX) locations
par(mfrow=c(1,2), bty="n")
persp(edlh.g, main="loess surface", theta=-30, phi=20,
      xlab="X[,1]", ylab="X[,2]", zlab="Z")
plot(edlh$X, main="Latin Hypercube Sampled Inputs")
points(edlh$XX, pch=19, cex=0.5)

# show the quadrants
abline(h=2, col=2, lty=2, lwd=2)
abline(v=2, col=2, lty=2, lwd=2)


# }
# NOT RUN {
## D-optimal subsample with a factor of 10 (more) candidates
## ---------------------------------------------------------

edlhd <- exp2d.rand(lh=c(20, 15, 10, 5), dopt=10)

# higher span = 0.5 required because the data is sparse
# and was generated randomly
edlhd.g <- interp.loess(edlhd$X[,1], edlhd$X[,2], edlhd$Z, span=0.5)

# perspective plot, and plot of the input (X & XX) locations
par(mfrow=c(1,2), bty="n")
persp(edlhd.g, main="loess surface", theta=-30, phi=20,
      xlab="X[,1]", ylab="X[,2]", zlab="Z")
plot(edlhd$X, main="D-optimally Sampled Inputs")
points(edlhd$XX, pch=19, cex=0.5)

# show the quadrants
abline(h=2, col=2, lty=2, lwd=2)
abline(v=2, col=2, lty=2, lwd=2)
# }
