laGP (version 1.0)

mleGP: Inference for GP correlation parameters

Description

Maximum likelihood/a posteriori inference for (isotropic) lengthscale and nugget parameters, marginally or jointly.

Usage

mleGP(gpi, param = c("d", "g"), tmin = 0, tmax = -1, ab = c(0, 0),
      verb = 0)
jmleGP(gpi, drange=c(0,10), grange=c(sqrt(.Machine$double.eps), 1),
      dab=c(0,0), gab=c(0,0), verb=0)
jmleGP.R(gpi, N=100, drange=c(0,10), grange=c(sqrt(.Machine$double.eps), 1),
      dab=c(0,0), gab=c(0,0), verb=0)

Arguments

gpi
a C-side GP object identifier (positive integer); e.g., as returned by newGP
N
for jmleGP.R, the maximum number of times the pair of margins should be iterated over before declaring failed convergence; note that (at this time) jmleGP uses a hard-coded N=100 in its C implementation
param
for mleGP, indicating whether to work on the lengthscale (d) or nugget (g) margin
tmin
for mleGP, smallest value considered for the parameter (param)
tmax
for mleGP, largest value considered for the parameter (param)
drange
for jmleGP, these are c(tmin, tmax) values for the lengthscale parameter; the default values are reasonable for 1-d inputs in the unit interval
grange
for jmleGP, these are c(tmin, tmax) values for the nugget parameter; the default values are reasonable for responses with a range of one
ab
for mleGP, a non-negative 2-vector describing shape and rate parameters to a Gamma prior for the parameter (param); a zero-setting for either value results in no prior being used (MLE inference); otherwise MAP inference is performed
dab
for jmleGP, this is ab for the lengthscale parameter
gab
for jmleGP, this is ab for the nugget parameter
verb
a verbosity argument indicating how much information about the optimization steps should be printed to the screen; verb <= 0 is quiet; for jmleGP, a verb - 1 value is passed to each mleGP subroutine call
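As a rough illustration of how a non-zero ab = c(shape, rate) setting shifts inference from MLE to MAP, the sketch below (not part of laGP; the function name is invented for illustration) computes the Gamma log-prior term that would be added to the log likelihood, using base R's dgamma:

```r
## Sketch of the log-prior contribution implied by ab = c(shape, rate);
## a zero-setting for either value means no prior (plain MLE)
log_gamma_prior <- function(theta, ab) {
  if (any(ab <= 0)) return(0)    ## no prior: MLE inference
  dgamma(theta, shape = ab[1], rate = ab[2], log = TRUE)
}

## e.g., prior contribution for a lengthscale d = 0.5 under Gamma(3/2, 1)
log_gamma_prior(0.5, ab = c(3/2, 1))
```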

Value

  • A self-explanatory data.frame is returned containing the values inferred and the number of iterations used. The jmleGP.R function will also show progress details (the values obtained after each iteration over the marginals).

    However, the most important output is the modified GP object, which retains the parameter settings reported on output as a side effect.

Details

mleGP performs marginal (or profile) inference for the specified param, either the lengthscale or the nugget; it uses a hard-coded maximum of 100 iterations and a sqrt(.Machine$double.eps) tolerance for determining convergence.

jmleGP provides joint inference by iterating over the marginals. The jmleGP.R function is an R-only wrapper around mleGP (which is primarily in C), whereas jmleGP is primarily in C but should give the same output.
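The marginal-iteration scheme behind jmleGP.R can be sketched in pure R on a toy objective (a stand-in for the actual joint log likelihood, which laGP evaluates in C): alternate 1-d maximizations over d and g until neither margin moves.

```r
## Toy sketch of iterating over the margins; jll stands in for the
## joint log likelihood over lengthscale d and nugget g
jll <- function(d, g) -(d - 3)^2 - 10 * (g - 0.1)^2

d <- 1; g <- 1; tol <- sqrt(.Machine$double.eps)
for (i in 1:100) {
  dnew <- optimize(function(x) jll(x, g), c(0, 10), maximum = TRUE)$maximum
  gnew <- optimize(function(x) jll(d, x), c(tol, 1), maximum = TRUE)$maximum
  if (abs(dnew - d) < tol && abs(gnew - g) < tol) break  ## both margins fixed
  d <- dnew; g <- gnew
}
c(d = d, g = g)   ## near the joint optimum (3, 0.1)
```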

By default, the algorithm is a modified Newton method initialized at the previous setting of the parameter(s), and convergence is usually very fast. Sometimes, however, the Newton method fails (for any of the usual reasons). mleGP can detect this, and in such cases it invokes a Brent_fmin call instead; this is the method behind the optimize function.
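The fallback idea can be illustrated in base R on a toy 1-d objective (not the actual profile likelihood, and the function names below are invented for illustration): take Newton steps, and if one leaves the search bounds or misbehaves, fall back on Brent's method via optimize.

```r
## Toy objective maximized at x = exp(1), with hand-coded derivatives
f   <- function(x) -(log(x) - 1)^2
df  <- function(x) -2 * (log(x) - 1) / x
d2f <- function(x) (2 * (log(x) - 1) - 2) / x^2

newton_or_brent <- function(x0, lower, upper, maxit = 100) {
  x <- x0
  for (i in 1:maxit) {
    xnew <- x - df(x) / d2f(x)                      ## Newton step
    if (!is.finite(xnew) || xnew <= lower || xnew >= upper)
      return(optimize(f, c(lower, upper), maximum = TRUE)$maximum)
    if (abs(xnew - x) < sqrt(.Machine$double.eps)) return(xnew)
    x <- xnew
  }
  optimize(f, c(lower, upper), maximum = TRUE)$maximum  ## no convergence
}

newton_or_brent(1, 0.1, 10)   ## approximately exp(1)
```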

Note that the gpi object must have been allocated with dK=TRUE; alternatively, one can call buildKGP. However, that function is not exported in the package namespace at this time.

References

For standard GP inference, refer to any graduate text, e.g., Rasmussen & Williams, Gaussian Processes for Machine Learning.

See Also

newGP, laGP, llikGP, optimize

Examples

## a simple example with estimated nugget
library(MASS)   ## for the mcycle data
library(laGP)

## motorcycle data and predictive locations
X <- matrix(mcycle[,1], ncol=1)
Z <- mcycle[,2]

## get sensible ranges
d <- darg(NULL, X)
g <- garg(list(mle=TRUE), Z)

## initialize the model
gpi <- newGP(X, Z, d=d$start, g=g$start, dK=TRUE)

## separate marginal inference (not necessary - for illustration only)
print(mleGP(gpi, "d", d$min, d$max))
print(mleGP(gpi, "g", g$min, g$max))

## joint inference (could skip straight to here)
print(jmleGP(gpi, c(d$min, d$max), c(g$min, g$max)))

## plot the resulting predictive surface
N <- 100
XX <- matrix(seq(min(X), max(X), length=N), ncol=1)
p <- predGP(gpi, XX, lite=TRUE)
plot(X, Z, main="stationary GP fit to motorcycle data")
lines(XX, p$mean, lwd=2)
lines(XX, p$mean+1.96*sqrt(p$s2*p$df/(p$df-2)), col=2, lty=2)
lines(XX, p$mean-1.96*sqrt(p$s2*p$df/(p$df-2)), col=2, lty=2)

## clean up the C-side GP object
deleteGP(gpi)
