spNNGP: Function for fitting univariate Bayesian spatial regression models

Description

The function spNNGP fits Gaussian univariate Bayesian spatial regression models using Nearest Neighbor Gaussian Processes (NNGP).

Usage

spNNGP(formula, data = parent.frame(), coords, method = "response", n.neighbors = 15,
      starting, tuning, priors, cov.model = "exponential",
      n.samples, n.omp.threads = 1, search.type = "tree",
      return.neighbors = FALSE, verbose = TRUE, n.report = 100, ...)

Arguments

formula

a symbolic description of the regression model to be fit. See example below.

data

an optional data frame containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which spNNGP is called.

coords

an \(n \times 2\) matrix of the observation coordinates in \(R^2\) (e.g., easting and northing).

method

a quoted keyword that specifies the NNGP sampling algorithm. Supported method keywords are: "response" and "sequential". When \(n\) is large, e.g., greater than 100k, the "response" algorithm should be faster. In general, unless estimates of spatial random effects are needed, the "response" algorithm should be used. See below for details.

n.neighbors

number of neighbors used in the NNGP.

starting

a list with each tag corresponding to a parameter name. Valid tags are beta, sigma.sq, tau.sq, phi, and nu. nu is only specified if cov.model="matern". The value portion of each tag is the parameter's startingvalue.

tuning

a list with each tag corresponding to a parameter name. Valid tags are sigma.sq, tau.sq, phi, and nu. If method="sequential" then only phi and nu need to be specified. The value portion of each tag defines the variance of the Metropolis sampler Normal proposal distribution.

priors

a list with each tag corresponding to a parameter name. Valid tags are sigma.sq.ig, tau.sq.ig, phi.unif, and nu.unif. Variance parameters, simga.sq and tau.sq, are assumed to follow an inverse-Gamma distribution, whereas the spatial decay phi and smoothness nu parameters are assumed to follow Uniform distributions. The hyperparameters of the inverse-Gamma are passed as a vector of length two, with the first and second elements corresponding to the shape and scale, respectively. The hyperparameters of the Uniform are also passed as a vector of length two with the first and second elements corresponding to the lower and upper support, respectively.

cov.model

a quoted keyword that specifies the covariance function used to model the spatial dependence structure among the observations. Supported covariance model key words are: "exponential", "matern", "spherical", and "gaussian". See below for details.

n.samples

the number of posterior samples to collect.

n.omp.threads

a positive integer indicating the number of threads to use for SMP parallel processing. The package must be compiled for OpenMP support. For most Intel-based machines, we recommend setting n.omp.threads up to the number of hyperthreaded cores.

verbose

if TRUE, model specification and progress of the sampler is printed to the screen. Otherwise, nothing is printed to the screen.

search.type

a quoted keyword that specifies type of nearest neighbor search algorithm. Supported method key words are: "tree" and "brute" both will yield the same solution but "tree" should be much faster.

return.neighbors

if TRUE, a list containing the indices for each location's nearest neighbors will be returned along with ordered data used to fit a NNGP model. This argument should typically be FALSE. See n.indx below for more details.

n.report

the interval to report Metropolis sampler acceptance and MCMC progress.

...

currently no additional arguments.

Value

An object of class rNNGP or sNNGP depending on the method, which is a list comprising:

p.beta.samples

a coda object of posterior samples for the regression coefficients.

p.theta.samples

a coda object of posterior samples for covariance parameters.

p.w.samples

is a matrix of posterior samples for the spatial random effects, where rows correspond to locations in coords and columns hold the n.samples posterior samples. This is only returned if method="sequential".

n.indx

if return.neighbors=TRUE then n.indx will be a list of length \(n\). The i-th element in the list corresponds to the i-th row in coords.ord matrix and the elements are the nearest neighbor indices for the given location.

ord

the vector order(coords[,1]), which is the vector of indices used to order data necessary for fitting the NNGP model.

coords.ord

the matrix coords[ord,].

y.ord

the vector y[ord].

X.ord

the matrix X[ord,,drop=FALSE].

run.time

execution time for building the nearest neighbor index and MCMC sampler reported using proc.time().

The return object will include additional objects used for subsequent prediction and/or model fit evaluation.

Details

Model parameters can be fixed at their starting values by setting their tuning values to zero.

The no nugget model is specified by setting tau.sq to zero in the starting and tuning lists.

References

Datta, A., S. Banerjee, A.O. Finley, and A.E. Gelfand. (2016) Hierarchical Nearest-Neighbor Gaussian process models for large geostatistical datasets. Journal of the American Statistical Association, 111:800-812

Finley, A.O., A. Datta, B.C. Cook, D.C. Morton, H.E. Andersen, and S. Banerjee (2017) Applying Nearest Neighbor Gaussian Processes to massive spatial data sets: Forest canopy height prediction across Tanana Valley Alaska, https://arxiv.org/abs/1702.00434v2.

Examples

Run this code

# NOT RUN {
rmvn <- function(n, mu=0, V = matrix(1)){
  p <- length(mu)
  if(any(is.na(match(dim(V),p))))
    stop("Dimension problem!")
  D <- chol(V)
  t(matrix(rnorm(n*p), ncol=p)%*%D + rep(mu,rep(n,p)))
}

##Make some data
set.seed(1)
n <- 100
coords <- cbind(runif(n,0,1), runif(n,0,1))

x <- cbind(1, rnorm(n))

B <- as.matrix(c(1,5))

sigma.sq <- 5
tau.sq <- 1
phi <- 3/0.5

D <- as.matrix(dist(coords))
R <- exp(-phi*D)
w <- rmvn(1, rep(0,n), sigma.sq*R)
y <- rnorm(n, x%*%B + w, sqrt(tau.sq))

##Fit a Response and Sequential NNGP model
n.samples <- 500

starting <- list("phi"=phi, "sigma.sq"=5, "tau.sq"=1)

tuning <- list("phi"=0.5, "sigma.sq"=0.5, "tau.sq"=0.5)

priors <- list("phi.Unif"=c(3/1, 3/0.01), "sigma.sq.IG"=c(2, 5), "tau.sq.IG"=c(2, 1))

cov.model <- "exponential"

m.s <- spNNGP(y~x-1, coords=coords, starting=starting, method="sequential", n.neighbors=10,
              tuning=tuning, priors=priors, cov.model=cov.model,
              n.samples=n.samples, n.omp.threads=2)

round(summary(m.s$p.beta.samples)$quantiles[,c(3,1,5)],2)
round(summary(m.s$p.theta.samples)$quantiles[,c(3,1,5)],2)
plot(apply(m.s$p.w.samples, 1, median), w)

m.r <- spNNGP(y~x-1, coords=coords, starting=starting, method="response", n.neighbors=10,
              tuning=tuning, priors=priors, cov.model=cov.model,
              n.samples=n.samples, n.omp.threads=2)

round(summary(m.r$p.beta.samples)$quantiles[,c(3,1,5)],2)
round(summary(m.r$p.theta.samples)$quantiles[,c(3,1,5)],2)

# }

Run the code above in your browser using DataLab