BRISC_decorrelation: Function for decorrelating data with BRISC

Description

The function BRISC_decorrelation decorrelates data with a known covariance structure using Nearest Neighbor Gaussian Processes (NNGP). BRISC_decorrelation uses the sparse Cholesky representation of Vecchia’s likelihood developed in Datta et al. (2016). Some code blocks are borrowed from the R package spNNGP: Spatial Regression Models for Large Datasets using Nearest Neighbor Gaussian Processes, https://CRAN.R-project.org/package=spNNGP.

Usage

BRISC_decorrelation(coords, sim, sigma.sq = 1, tau.sq = 0,
                    phi = 1, nu = 1.5, n.neighbors = NULL,
                    n_omp = 1, cov.model = "exponential",
                    search.type = "tree",
                    stabilization = NULL, verbose = TRUE,
                    tol = 12)

Value

A list comprising the following:

coords: the matrix coords.
n.neighbors: the used value of n.neighbors.
cov.model: the used covariance model.
Theta: parameters of the covariance model; accounts for stabilization.
input.data: if stabilization = FALSE, the matrix sim; if stabilization = TRUE, sim plus the white noise added during the stabilization process.
output.data: the output matrix $g$ in Details.
time: time (in seconds) required after preprocessing data in R, reported using proc.time().

Arguments

coords: an $n \times 2$ matrix of the observation coordinates in $R^2$ (e.g., easting and northing).
sim: an $n \times k$ matrix of the $k$ many $n \times 1$ vectors from which the decorrelated data are calculated (see Details below).
sigma.sq: value of sigma square. Default value is 1.
tau.sq: value of tau square. Default value is 0.
phi: value of phi. Default value is 1.
nu: value of nu, only required for Matern covariance model. Default value is 1.5.
n.neighbors: number of neighbors used in the NNGP. Default value is $\max(100, n - 1)$. We suggest a higher value of n.neighbors for lower values of phi.
n_omp: number of threads to be used, value can be more than 1 if source code is compiled with OpenMP support. Default is 1.
cov.model: keyword that specifies the covariance function to be used in modelling the spatial dependence structure among the observations. Supported keywords are: "exponential", "matern", "spherical", and "gaussian" for exponential, Matern, spherical and Gaussian covariance function respectively. Default value is "exponential".
search.type: keyword that specifies the type of nearest neighbor search algorithm to be used. Supported keywords are: "brute", "tree" and "cb". "brute" and "tree" provide the same result, though "tree" should be faster. "cb" implements the fast code book search described in Ra and Kim (1993), modified for NNGP. If locations do not have identical coordinate values on the axis used for the nearest neighbor determination, then "cb" and "brute" should produce identical neighbor sets. However, if there are identical coordinate values on that axis (e.g., if data are on a grid), then "cb" and "brute" might produce different, but equally valid, neighbor sets. Default value is "tree".
stabilization: when the correlated data are generated from a very smooth covariance model (lower values of phi for spherical and Gaussian covariance, and low phi with high nu for Matern covariance), the decorrelation process may fail due to computational instability. If stabilization = TRUE, the function stabilizes the computation by adding white noise to the data, with nugget tau.sq = sigma.sq * 1e-06. Default value is TRUE for cov.model = "exponential" and FALSE otherwise.
verbose: if TRUE, model specifications along with information regarding OpenMP support and progress of the algorithm are printed to the screen. Otherwise, nothing is printed to the screen. Default value is TRUE.
tol: the input observation coordinates are rounded to this many places after the decimal. The default value is 12.
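
As a rough illustration of the stabilization argument above, the following base R sketch adds white noise with variance sigma.sq * 1e-06 to a data matrix. It only mirrors the description given here and is not the package's internal code; the names noise and sim_stab are hypothetical, and the random matrix sim is a stand-in for real correlated data.

```r
set.seed(1)
n <- 100
k <- 3
sigma.sq <- 1

# Stand-in for the n x k matrix of correlated data (assumption for illustration)
sim <- matrix(rnorm(n * k), nrow = n, ncol = k)

# Nugget used for stabilization, as described for the stabilization argument
tau.sq <- sigma.sq * 1e-06

# White noise with variance tau.sq, one independent draw per entry of sim
noise <- matrix(rnorm(n * k, sd = sqrt(tau.sq)), nrow = n, ncol = k)

# Stabilized data: sim plus the white noise
sim_stab <- sim + noise

# The perturbation is tiny relative to the scale of the data
max(abs(sim_stab - sim))
```

Under this reading, input.data in the returned list would correspond to sim_stab when stabilization = TRUE and to sim otherwise.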

Author

Arkajyoti Saha arkajyotisaha93@gmail.com,
Abhirup Datta abhidatta@jhu.edu

Details

Let $h$ denote the input sim, and let $\Sigma$ be the covariance matrix determined by cov.model and the model parameters. BRISC_decorrelation then calculates $g$, given by $$ S^{-0.5} h = g, $$ where $S^{-0.5}$ is a sparse approximation, obtained from the NNGP, of the Cholesky factor $\Sigma^{-0.5}$ of the precision matrix $\Sigma^{-1}$.
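
To make the role of the Cholesky factor concrete, the sketch below carries out the dense (exact) analogue of this computation in base R for a small exponential covariance model. It is only an illustration under stated assumptions: it does not use the NNGP sparse approximation $S^{-0.5}$, and the variable names (L, z, Linv, prec_err, recover_err) are hypothetical.

```r
set.seed(1)
n <- 200
k <- 3
coords <- cbind(runif(n), runif(n))
sigma.sq <- 1
phi <- 1

# Dense exponential covariance matrix Sigma
Sigma <- sigma.sq * exp(-phi * as.matrix(dist(coords)))

# Cholesky factorization: Sigma = L %*% t(L), with L lower triangular
L <- t(chol(Sigma))

# Correlated data h = L z, where the columns of z are independent N(0, I)
z <- matrix(rnorm(n * k), nrow = n, ncol = k)
h <- L %*% z

# Decorrelate: g = L^{-1} h, computed by forward substitution
g <- forwardsolve(L, h)

# L^{-1} acts as a square root of the precision matrix:
# t(L^{-1}) %*% L^{-1} equals Sigma^{-1} up to rounding error
Linv <- forwardsolve(L, diag(n))
prec_err <- max(abs(crossprod(Linv) - solve(Sigma)))

# In this dense setting, decorrelation recovers the white noise z
recover_err <- max(abs(g - z))
```

The dense factorization costs on the order of $n^3$ operations, which is the expense the sparse NNGP approximation is intended to avoid for large $n$.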

References

Datta, A., S. Banerjee, A.O. Finley, and A.E. Gelfand. (2016) Hierarchical Nearest-Neighbor Gaussian process models for large geostatistical datasets. Journal of the American Statistical Association, 111:800-812.

Finley, A.O., A. Datta, and S. Banerjee. (2017) spNNGP: Spatial Regression Models for Large Datasets using Nearest Neighbor Gaussian Processes. R package version 0.1.1. https://CRAN.R-project.org/package=spNNGP

Examples

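library(BRISC)

# Draw n samples from N(mu, V); returns a length(mu) x n matrix
rmvn <- function(n, mu = 0, V = matrix(1)){
  p <- length(mu)
  if(any(is.na(match(dim(V),p))))
    stop("Dimension not right!")
  D <- chol(V)  # upper-triangular factor, V = t(D) %*% D
  t(matrix(rnorm(n*p), ncol=p)%*%D + rep(mu,rep(n,p)))
}

# Simulate n locations uniformly on the unit square
set.seed(1)
n <- 1000
coords <- cbind(runif(n,0,1), runif(n,0,1))

sigma.sq <- 1
phi <- 1

# Exponential covariance; generate k = 3 correlated columns (sim is n x 3)
set.seed(1)
D <- as.matrix(dist(coords))
R <- exp(-phi*D)
sim <- rmvn(3, rep(0,n), sigma.sq*R)

# Decorrelate using the default parameter values of BRISC_decorrelation
decorrelation_result <- BRISC_decorrelation(coords, sim = sim)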
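
One possible way to check the result is to compare the correlation between each location and its nearest neighbor before and after decorrelation. The sketch below is self-contained and runs only if the BRISC package is installed; the diagnostic itself (names nn, cor_before, cor_after) and the choice n.neighbors = 15 are assumptions made for illustration, not part of the package.

```r
if (requireNamespace("BRISC", quietly = TRUE)) {
  set.seed(1)
  n <- 500
  coords <- cbind(runif(n), runif(n))

  # Exponential covariance with sigma.sq = 1 and phi = 1
  D <- as.matrix(dist(coords))
  R <- exp(-D)

  # Three correlated columns: sim = L z with R = L %*% t(L)
  sim <- t(chol(R)) %*% matrix(rnorm(n * 3), nrow = n, ncol = 3)

  res <- BRISC::BRISC_decorrelation(coords, sim = sim, n.neighbors = 15,
                                    verbose = FALSE)
  g <- res$output.data

  # Index of the nearest neighbor of each location
  diag(D) <- Inf
  nn <- apply(D, 1, which.min)

  # Correlation between each value and the value at its nearest neighbor
  cor_before <- cor(sim[, 1], sim[nn, 1])
  cor_after <- cor(g[, 1], g[nn, 1])
  print(c(before = cor_before, after = cor_after))
}
```

If the decorrelation behaves as described in Details, the correlation after decorrelation would be expected to be much closer to zero than the correlation before.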
