mlegp (version 3.1.8)

mlegp: maximum likelihood estimation of Gaussian process parameters

Description

Finds maximum likelihood estimates of Gaussian process parameters for a vector (or matrix) of one (or more) responses. For multiple responses, the user chooses between fitting independent Gaussian processes to the separate responses or fitting independent Gaussian processes to principal component weights obtained through singular value decomposition of the output. The latter is useful for functional output or data-rich situations.

Usage

mlegp(X, Z, constantMean = 1, nugget = NULL, nugget.known = 0,
      min.nugget = 0, param.names = NULL, gp.names = NULL,
      PC.UD = NULL, PC.num = NULL, PC.percent = NULL,
      simplex.ntries = 5, simplex.maxiter = 500, simplex.reltol = 1e-8,
      BFGS.maxiter = 500, BFGS.tol = 0.01, BFGS.h = 1e-10, seed = 0,
      verbose = 1, parallel = FALSE)

Arguments

X

the design matrix

Z

vector or matrix of observations corresponding to the rows of X

constantMean

a value of 1 indicates that each Gaussian process will have a constant mean; otherwise the mean function will be a linear regression in X, plus an intercept term

nugget

if nugget.known is 1, a fixed value to use for the nugget or a vector corresponding to the fixed diagonal nugget matrix; otherwise, either a positive initial value for the nugget term to be estimated, or a vector corresponding to the diagonal nugget matrix up to a multiplicative constant. If NULL (the default), mlegp estimates a nugget term only if there are replicates in the design matrix; see Details

nugget.known

1 if a plug-in estimate of the nugget will be used; 0 otherwise

min.nugget

minimum value of the nugget term; 0 by default

param.names

a vector of parameter names, corresponding to the columns of X; parameter names are ‘p1’, ‘p2’, ... by default

gp.names

a vector of GP names, corresponding to the GPs fit to each column of Z or each PC weight

PC.UD

the UD matrix if Z is a matrix of principal component weights; see mlegp-svd-functions

PC.num

the number of principal component weights to keep in the singular value decomposition of Z

PC.percent

if not NULL, the number of principal component weights kept is the minimum number that accounts for PC.percent of the total variance of the matrix Z

simplex.ntries

the number of simplexes to run

simplex.maxiter

maximum number of evaluations per simplex

simplex.reltol

relative tolerance for the simplex method; 1e-8 by default

BFGS.maxiter

maximum number of iterations for BFGS method

BFGS.tol

stopping condition for the BFGS method is when norm(gradient) < BFGS.tol * max(1, norm(x)), where x is the parameter vector and norm is the Euclidean norm

BFGS.h

derivatives are approximated as [f(x + BFGS.h) - f(x)] / BFGS.h

seed

the random number seed

verbose

a value of '1' or '2' will result in status updates being printed; a value of '2' results in more information

parallel

if TRUE, GPs are fit in parallel to each column of Z (or each set of PC weights); see Details
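
For orientation, the following minimal sketch (toy data and made-up parameter names, not taken from the package documentation) shows how several of these arguments combine; see the Examples section for complete worked examples.

library(mlegp)

## toy data: two inputs, one response
X = matrix(runif(40), ncol = 2)
Z = sin(X[,1]) + 2*X[,2] + rnorm(20, sd = .1)

## linear regression mean (constantMean = 0) with arbitrary input names
fit1 = mlegp(X, Z, constantMean = 0, param.names = c("rate", "dose"))

## plug in a fixed nugget value instead of estimating it
fit2 = mlegp(X, Z, nugget = .01, nugget.known = 1)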

Value

an object of class gp.list if Z has more than 1 column, otherwise an object of class gp
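
For instance (a brief sketch on toy data; extracting individual fits from a gp.list with [[ also appears in the Examples below):

library(mlegp)
x = -5:5
y1 = sin(x) + rnorm(length(x), sd = .1)
y2 = cos(x) + rnorm(length(x), sd = .1)

fit = mlegp(x, y1)               ## single response: returns a 'gp' object
fits = mlegp(x, cbind(y1, y2))   ## two responses: returns a 'gp.list' object

class(fit)    ## "gp"
class(fits)   ## "gp.list"
fits[[2]]     ## the GP fit to the second response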

Details

This function calls the C function fitGPFromR which in turn calls fitGP (both in the file fit_gp.h) to fit each Gaussian process.

Separate Gaussian processes are fit to the observations in each column of Z. Maximum likelihood estimates for correlation and nugget parameters are found through numerical methods (i.e., the Nelder-Mead simplex method and the L-BFGS method), while maximum likelihood estimates of the mean regression parameters and overall variance are calculated in closed form (given the correlation and (scaled) nugget parameters). Multiple simplexes are run, and estimates from the best simplex are used as initial values for the gradient (L-BFGS) method.
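
This two-stage search can be illustrated with a small pure-R sketch (this is not the package's C code; the Gaussian correlation form, starting values, single simplex run, and the use of optim's BFGS in place of L-BFGS are all illustrative assumptions):

set.seed(0)
x = matrix(seq(0, 1, length.out = 15))
z = sin(2*pi*x[,1]) + rnorm(15, sd = .05)

## negative log likelihood in (log correlation parameter, log nugget), with the
## constant mean and overall variance computed in closed form given those parameters
negLogLik = function(p) {
	beta = exp(p[1]); nug = exp(p[2])
	R = exp(-beta * as.matrix(dist(x))^2) + diag(nug, nrow(x))
	Ri = solve(R)
	one = rep(1, nrow(x))
	mu = drop(crossprod(one, Ri %*% z)) / drop(crossprod(one, Ri %*% one))
	s2 = drop(crossprod(z - mu, Ri %*% (z - mu))) / nrow(x)
	0.5 * (nrow(x) * log(s2) + as.numeric(determinant(R)$modulus))
}

## stage 1: Nelder-Mead simplex (repeating this would mimic simplex.ntries)
s1 = optim(c(0, log(.01)), negLogLik, method = "Nelder-Mead")
## stage 2: gradient method started from the best simplex solution
s2 = optim(s1$par, negLogLik, method = "BFGS")
exp(s2$par)   ## correlation and nugget parameters at the (local) optimum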

Gaussian processes are fit to principal component weights by utilizing the singular value decomposition (SVD) of Z, Z = UDV'. Each column of Z should correspond to a single k-dimensional observation (e.g., the functional output of a computer model evaluated at a particular input).

In the complete SVD, Z is k x m, r = min(k, m), U is k x r, D is r x r with the singular values along the diagonal, and V' is r x m. The output Z is approximated by keeping l < r singular values, which leaves a UD matrix of dimension k x l and a V' matrix of dimension l x m. Each column of V' then contains l principal component weights, which can be used to reconstruct the functional output.
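
This truncation can be sketched in base R (the dimensions and functional output below are made up; the package provides its own helpers, see mlegp-svd-functions):

## functional output: k = 51 points per curve, m = 50 curves (one per column)
xGrid = seq(-5, 5, by = .2)
Zfull = sapply(1:50, function(i) sin(xGrid) + i)   ## 51 x 50

s = svd(Zfull)
l = 2                                ## number of principal component weights to keep
UD = s$u[, 1:l] %*% diag(s$d[1:l])   ## k x l
Vprime = t(s$v[, 1:l])               ## l x m; each column holds l PC weights
Zapprox = UD %*% Vprime              ## rank-l reconstruction of the output
max(abs(Zfull - Zapprox))            ## reconstruction error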

If nugget.known = 1, nugget = NULL, and replicate observations are present, the nugget will be fixed at its best linear unbiased estimate (a weighted average of sample variances). For each column of Z, a GP will be fit to a collection of sample means rather than all observations. This is the recommended approach as it is more accurate and computationally more efficient.
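
The quantities involved can be illustrated in base R (a sketch of one common pooled-variance estimator; the package computes its estimate internally):

x = kronecker(-5:5, rep(1, 3))   ## each design point replicated 3 times
y = x + rnorm(length(x))

means = tapply(y, x, mean)       ## the GP is fit to these sample means
vars = tapply(y, x, var)
ns = tapply(y, x, length)
nugget.hat = sum((ns - 1) * vars) / sum(ns - 1)   ## weighted average of sample variances

## mlegp(x, y, nugget.known = 1) carries out the analogous steps internally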

Parallel support is provided through the package snowfall, which allows multiple GPs to be fit in parallel. The user must set up the cluster using sfInit and call sfLibrary(mlegp) to load the library onto the slave nodes. Note: parallel fitting is not recommended when the number of observations is large (> 100), in which case sequential GP fitting is faster.

References

Santner, T.J., Williams, B.J., Notz, W., 2003. The Design and Analysis of Computer Experiments (New York: Springer).

Heitmann, K., Higdon, D., Nakhleh, C., Habib, S., 2006. Cosmic Calibration. The Astrophysical Journal, 646, 2, L1-L4.

Dancik, G.M., Dorman, K.S., 2008. mlegp: statistical analysis for computer models of biological systems using R. Bioinformatics, 24(17), 1966-1967.

https://github.com/gdancik/mlegp/

See Also

createGP for details of the gp object; gp.list for details of the gp.list object; mlegp-svd-functions for details on fitting Gaussian processes to high-dimensional data using principal component weights; the L-BFGS method uses C code written by Naoaki Okazaki (http://www.chokkan.org/software/liblbfgs)

Examples

###### fit a single Gaussian process ######
x = -5:5; y1 = sin(x) + rnorm(length(x),sd=.1)
fit1 = mlegp(x, y1)

## summary and diagnostic plots ##
summary(fit1)
plot(fit1)

###### fit a single Gaussian process when replicates are present ######
x = kronecker(-5:5, rep(1,3))
y = x + rnorm(length(x))

## recommended approach: GP fit to sample means; nugget calculated from sample variances ##
fit1 = mlegp(x,y, nugget.known = 1)

## original approach: GP fit to all observations; look for MLE of nugget ##
fit2 = mlegp(x,y)


###### fit multiple Gaussian processes to multiple observations ######
x = -5:5 
y1 = sin(x) + rnorm(length(x),sd=.1)
y2 = sin(x) + 2*x + rnorm(length(x), sd = .1)
fitMulti = mlegp(x, cbind(y1,y2))

## summary and diagnostic plots ##
summary(fitMulti)
plot(fitMulti)


###### fit multiple Gaussian processes using principal component weights ######

## generate functional output ##
x = seq(-5,5,by=.2)
p = 1:50
y = matrix(0,length(p), length(x))
for (i in p) {
	y[i,] = sin(x) + i + rnorm(length(x), sd  = .01)
}

## we now have 50 functional observations (each of length 51) ##
for (i in p) {
	plot(x,y[i,], type = "l", col = i, ylim = c(min(y), max(y)))
	par(new=TRUE)
}

## fit GPs to the two most important principal component weights ##
numPCs = 2
fitPC = mlegp(p, t(y), PC.num = numPCs)
plot(fitPC) ## diagnostics

## reconstruct the output Y = UDV'
Vprime = matrix(0,numPCs,length(p))
Vprime[1,] = predict(fitPC[[1]])
Vprime[2,] = predict(fitPC[[2]])

predY = fitPC$UD%*%Vprime
m1 = min(y[39,], predY[,39])
m2 = max(y[39,], predY[,39])

plot(x, y[39,], type="l", lty = 1, ylim = c(m1,m2), ylab = "original y" )
par(new=TRUE)
plot(x, predY[,39], type = "p", col = "red", ylim = c(m1,m2), ylab = "predicted y" )

### fit GPs in parallel (not run by default; requires a snowfall cluster) ###
library(snowfall)
sfInit(parallel = TRUE, cpus = 2, slaveOutfile = "slave.out")
sfLibrary(mlegp)
fitPC = mlegp(p, t(y), PC.num = 2, parallel = TRUE)