georobSimulation: Simulating Realizations of Gaussian Processes from Object of Class `georob`

Description

This page documents the function condsim that simulates (un)conditional realizations of Gaussian processes from the parameters of a spatial linear model estimated by the function georob.

Usage

condsim(object, newdata, nsim, seed, 
    type =  c("response", "signal"), locations, trend.coef = NULL, 
    variogram.model = NULL, param = NULL, aniso = NULL, variogram.object = NULL,  
    control = control.condsim(), verbose = 0)
    
  control.condsim(use.grid = FALSE, grid.refinement = 2.,
    condsim = TRUE, include.data.sites = FALSE, means = FALSE,
    trend.covariates = FALSE, covariances = FALSE,
    ncores = detectCores(), pcmp = control.pcmp())

Arguments

object

an object of class georob (mandatory argument), see georobObject.

newdata

a mandatory data frame, SpatialPointsDataFrame, SpatialPixelsDataFrame, SpatialGridDataFrame, SpatialPoints, SpatialPixels or SpatialGrid object, with the coordinates of points for which simulations are computed and in which to look for variables required for computing fitted values or Kriging predictions, see predict.georob.

nsim

number of (condititional) realizations to compute (mandatory argument).

seed

integer seed to initialize random number generation, see set.seed (mandatory argument).

type

character keyword defining what target quantity should be simulated. Possible values are

"signal": the “signal” $Z(\mbox{\boldmath$s$\unboldmath}) = \mbox{\boldmath$x$\unboldmath}(\mbox{\boldmath$s$\unboldmath})^\mathrm{T} \mbox{\boldmath$\beta$\unboldmath} + B(\mbox{\boldmath$s$\unboldmath})$ of the process,
"response": the observations $Y(\mbox{\boldmath$s$\unboldmath}) = Z(\mbox{\boldmath$s$\unboldmath}) + \varepsilon(\mbox{\boldmath$s$\unboldmath}),$ (default),

see georobIntro for details on the model specification.

locations

an optional one-sided formula specifying what variables of newdata are the coordinates of the points for which simulated values are computed (default: object[["locations.objects"]][["locations"]]).

trend.coef

an optional numeric vector with the coefficients of the trend model to be used for computing the (conditional) mean function of the random processes.

variogram.model

an optional character keyword defining the variogram model to be used for the simulations, see georob and Details.

param

an optional named numeric vector with values of the variogram parameters used for the simulations, see georob and Details.

aniso

an optional named numeric vector with values of anisotropy parameters of a variogram used for the simulations, see georob and Details.

variogram.object

an optional list that defines a possibly nested variogram model used for the simulations, see georob and Details.

control

a list with the components use.grid, grid.refinement, condsim, include.data.sites, means, trend.covariates, covariances, ncores, and pcmp or a function such as control.condsim that generates such a list, see arguments of control.consim.georob for details.

verbose

positive integer controlling logging of diagnostic messages to the console. verbose = 0 (default) suppresses such messages.

use.grid

logical scalar (default FALSE) to control whether (conditional) realizations are computed for a rectangular grid instead of the coordinates of points contained in newdata, see Details.

grid.refinement

numeric scalar that defines a factor by which the minimum differences of the coordinates between any pair of points in newdata are divided to setup the simulation grid, should be > 1 (default 2), see Details.

condsim

logical scalar (default TRUE) to control whether conditional (TRUE) or unconditional simulations (FALSE) are computed.

include.data.sites

logical scalar, to control whether (conditionally) simulated values are also computed for the points of the original data set used to estimated the model parameters.

means

logical scalar, to control whether the (un)conditional means are included in the output.

trend.covariates

logical scalar, to control whether the covariates required for the trend model are included in the output.

covariances

logical scalar, to control whether the covariances between the points of the original data set used to estimate the model parameters (attr gcvmat.d.d, compressed matrix) and the covariances between the simulation and the original data points (attr gcvmat.s.d, matrix) are returned as attributes of the output. Note that these covariances are only returned if use.grid == TRUE & condsim == TRUE.

ncores

positive integer controlling how many cores are used for parallelized computations, defaults to all cores.

pcmp

a list of arguments, passed e.g. to pmm or a function such as control.pcmp that generates such a list (see control.pcmp for allowed arguments).

Value

The output generated by condsim is an object of a ``similar'' class as newdata (data frame, SpatialPointsDataFrame, SpatialPixelsDataFrame, SpatialGridDataFrame, SpatialPolygonsDataFrame).

The data frame or the data slot of the Spatial...DataFrame objects have the following components:

the coordinates of the prediction points (only present if newdata is a data frame).
expct: optionally the (un)conditional means.
optionally the covariates required for the trend model.
sim.1, sim.2, ...: the (un)conditionally simulated realizations.

Details

condsim (conditionally) simulates from a Gaussian process that has a linear mean function with parameters $\mbox{\boldmath$\beta$\unboldmath}$ and an auto-correlation structure characterized by a parametric variogram model and variogram parameters $\tau^2$ and $\mbox{\boldmath$\theta$\unboldmath}$ (see georobIntro for the employed parametrization of the spatial linear model). The parameters of the mean and auto-correlation function are either taken from the the spatial linear model estimated by georob and passed by the argument object to condsim or from the optional arguments trend.coef ($\mbox{\boldmath$\beta$\unboldmath}$) and variogram.model, param, aniso or variogram.object ($\tau^2$, $\mbox{\boldmath$\theta$\unboldmath}$). Note that in the former case the uncertainty in the estimated mean and auto-correlation parameters is not taken into account.

Simulated values are computed for the points in newdata by the function RFsimulate of the package RandomFields. Both unconditional and conditional simulations can be computed. In the latter cases, the simulated values are always conditioned to the response data used to fit the spatial linear model by georob and contained in object.

Unconditional simulation

Unconditional realizations are either computed for the exact locations of the points in newdata (use.grid == FALSE), irrespective of the fact whether these are arranged on a regular grid, or for the (approximate) locations of the points in newdata matched to a rectangular simulation grid (use.grid == TRUE). The latter approach may be substantially faster for large problems because the simulation algorithm implemented in RFsimulate for grids is faster than for arbitrary geometries of the simulation points.

For use.grid == TRUE, a rectangular grid is constructed from the coordinates of the points in newdata and object. The spacing of the grid is equal to the minimum differences of the coordinates between any pair of points in newdata, divided by grid.refinement. The data related to the points in newdata (covariates for the trend model) and of the data in object (response values, covariates) are then assigned to the nodes of the grid that are closest to the respective points. If the same grid node is assigned to several points in newdata (or in object) then the data of the respective points are averaged. If the same node is assigned to a point in object and newdata then the point in object is kept and the concerned point in newdata is omitted.

Conditional simulation

Simulations are conditioned to data either by exploiting the respective built-in functionality of RFsimulate (use.grid == FALSE) or by the Kriging method (use.grid == TRUE, see Chil<U+008F>s and Delfiner, 1999, sec. 7.3). The latter approach may again be faster for large problems because it exploits the larger speed of unconditional simulations for rectangular grids.

Parallelized computations

condsim uses the packages parallel, snow and snowfall for parallelized computation of simulations. If there are $m$ realizations to simulate, the task is split into ceiling(m/ncores) sub-tasks that are then distributed to ncores CPUs. Evidently, ncores = 1 suppresses parallel execution. By default, the function uses all available CPUs as returned by detectCores.

References

Chil<U+008F>s, J.-P. and Delfiner, P. (1999) Geostatistics: Modeling Spatial Uncertainty, John Wiley & Sons, New York.

Examples

Run this code

# NOT RUN {
  
data(meuse)
data(meuse.grid)

## convert to SpatialGridDataFrame
meuse.grid.sgdf <- meuse.grid
coordinates(meuse.grid.sgdf) <- ~ x + y
gridded(meuse.grid.sgdf) <- TRUE
fullgrid(meuse.grid.sgdf) <- TRUE

## Gaussian REML fit
r.logzn.reml <- georob(log(zinc) ~ sqrt(dist), data = meuse, locations = ~ x + y,
    variogram.model = "RMexp",
    param = c(variance = 0.15, nugget = 0.05, scale = 200),
    tuning.psi = 1000,
    control = control.georob(cov.bhat = TRUE, cov.ehat.p.bhat = TRUE))
    
## Conditional simulations
r.sim <- condsim(r.logzn.reml, newdata = meuse.grid.sgdf, nsim = 100, seed = 1)
str(r.sim, max=2)

## Display
spplot(r.sim, zcol = "sim.1", at = seq(3.5, 8.5, by = 0.5))
spplot(r.sim, zcol = "sim.2", at = seq(3.5, 8.5, by = 0.5))

library(lattice)
levelplot(sim.1 ~ x + y, as.data.frame(r.sim), aspect = "iso", at = seq(3.5, 8.5, by = 0.5),
  panel = function(x, y, z, subscripts, data.points, ... ){
    panel.levelplot( x, y, z, subscripts, ...)
    panel.xyplot(data.points$x, data.points$y, col = 1)
  }, data.points = meuse[, c("x", "y")]
)
levelplot(sim.2 ~ x + y, as.data.frame(r.sim), aspect = "iso", at = seq(3.5, 8.5, by = 0.5),
  panel = function(x, y, z, subscripts, data.points, ... ){
    panel.levelplot( x, y, z, subscripts, ...)
    panel.xyplot(data.points$x, data.points$y, col = 1)
  }, data.points = meuse[, c("x", "y")]
)

# }

Run the code above in your browser using DataLab