The krigeProblem class provides functionality for kriging using
distributed calculations, based on maximum likelihood estimation. The class includes methods for standard kriging calculations and metadata necessary for carrying out the methods in a distributed fashion. To carry out kriging calculations, one must first initialize an object
of the krigeProblem class. This is done using
krigeProblem$new; help on initialization can be obtained via
krigeProblem$help('initialize') (note that the call itself is
krigeProblem$new, not krigeProblem$initialize).
Note that in what follows I refer to observation and prediction
'locations'. This is natural for spatial problems, but for non-spatial
problems, 'locations' refers to the points within the
relevant domain at which observations are available and predictions
are to be made.
The user must provide functions that create the subsets of the mean
vector(s) and the covariance matrix/matrices. Functions for the mean
vector and covariance matrix at the observation locations are always
required. Functions for the mean vector at the prediction locations,
the cross-covariance matrix (for which the first column of the index
matrix indexes the observation locations and the second the prediction
locations), and the covariance matrix at the prediction locations are
required only when doing prediction or posterior simulation. These
functions should
follow the form of SN2011fe_meanfunc,
SN2011fe_predmeanfunc, SN2011fe_covfunc,
SN2011fe_predcovfunc, and SN2011fe_crosscovfunc. Namely,
they should take three arguments, the first a vector of all the
parameters for the Gaussian process (both mean and covariance),
the second an arbitrary list of inputs (in general this would include
the observation and prediction locations), and the third being
indices, which will be provided by the package and will differ between
slave processes. For the
mean functions, the indices will be a vector, indicating which of the
vector elements are stored on a given process. For the covariance
functions, the indices will be a two column matrix, with each row
a pair of indices (row, column), indicating the elements of the matrix
stored on a given process. Thus, the user-provided functions should use the second
and third arguments to construct the elements of the vectors/matrices
that belong on the given slave process. Note that the elements of the
matrices are stored as vectors (vectorizing matrices column-wise, as
natural for column-major matrices). Users can simply have their
functions operate on the rows of the index matrix without worrying
about ordering. An optional fourth argument contains cached values that need
not be recomputed at every call to the user-provided function. To
use caching and avoid expensive recomputation, the user function
should mimic
SN2011fe_covfunc. That is, when the user wishes to change the cached
values (including on first use of the function), the function should return
a two-element list, with the first element being the covariance matrix
elements and the second containing whatever object is to be
cached. This cached object will be provided to the function on
subsequent calls as the fourth argument.
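
The argument convention described above can be sketched in plain R as follows. This is only an illustration, not the package's own code: the names my_meanfunc and my_covfunc, the parameter layout c(mu, sigma2, rho), the exponential covariance, and the inputs element x are all hypothetical. I also assume that the fourth argument is NULL on first use and that a plain vector is returned when the cache is left unchanged; check SN2011fe_covfunc for the behavior the package actually expects.

```r
# Hypothetical user-supplied functions with the signature
# (params, inputs, indices, cache). Assumed, not taken from the package:
#   params = c(mu, sigma2, rho), inputs = list(x = observation locations).

my_meanfunc <- function(params, inputs, indices, cache = NULL) {
  # indices: vector giving the elements of the mean vector on this process
  rep(params[1], length(indices))
}

my_covfunc <- function(params, inputs, indices, cache = NULL) {
  # indices: two-column matrix of (row, column) pairs on this process
  if (is.null(cache)) {
    # Distances depend only on the locations, so compute them once.
    d <- abs(inputs$x[indices[, 1]] - inputs$x[indices[, 2]])
    # First use: return the matrix elements and the object to cache.
    return(list(params[2] * exp(-d / params[3]), d))
  }
  # Later calls (assumption): reuse the cached distances and return
  # only the matrix elements.
  params[2] * exp(-cache / params[3])
}

# Local check without any distributed machinery.
inputs <- list(x = c(0, 1, 3))
params <- c(0, 2, 1.5)
# All (row, column) pairs of a 3 x 3 matrix, in column-major order.
idx <- as.matrix(expand.grid(row = 1:3, col = 1:3))
first <- my_covfunc(params, inputs, idx)
C <- matrix(first[[1]], nrow = 3)
again <- my_covfunc(params, inputs, idx, cache = first[[2]])
```

Because each function works row by row on whatever indices it receives, the same code serves any subset of elements that a process is asked to construct.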
Note that all packages required for calculating the mean vector(s)
and covariance matrix/matrices must be installed on every machine
used, and the names of these packages should be passed
as the packages argument to the krigeProblem initialization.
Help for the various methods of the class can be obtained with
krigeProblem$help('methodName') and a list of fields and
methods in the class with krigeProblem$help().
In general, n (or n1 and n2) refers to the length
or number of rows/columns of vectors and matrices, and h (or
h1 and h2) to the block replication factor for these
vectors and matrices. More details on block replication factors can be
found in the references; these factors are set at
reasonable values automatically, and for simplicity, one can set them
at one, in which case the number of blocks into which the primary
covariance matrix is split is P, the number of slave
processes. Cross-covariance matrices returned to the user will have
one row per observation location and one column per
prediction location. Matrices of realizations will have
each realized field as a single column.
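
The shape conventions above can be illustrated with a small base-R sketch. It does not use the package: the locations, the exponential covariance, and all object names are hypothetical, and the simulation shown is a plain unconditional draw (not the package's posterior simulation), included only to show the column layout.

```r
set.seed(1)
x_obs  <- c(0, 1, 3)   # n = 3 observation locations (hypothetical)
x_pred <- c(0.5, 2)    # m = 2 prediction locations (hypothetical)
n <- length(x_obs)
m <- length(x_pred)

# Index pairs: first column indexes observation locations, second
# column prediction locations, listed in column-major order.
idx <- as.matrix(expand.grid(obs = seq_len(n), pred = seq_len(m)))
cross_vals <- exp(-abs(x_obs[idx[, 1]] - x_pred[idx[, 2]]))

# Cross-covariance matrix: one row per observation location,
# one column per prediction location.
crossC <- matrix(cross_vals, nrow = n, ncol = m)

# Realizations: each simulated field occupies a single column.
predC <- exp(-abs(outer(x_pred, x_pred, "-")))
L <- t(chol(predC))
n_sims <- 4
sims <- L %*% matrix(rnorm(m * n_sims), nrow = m)

dim(crossC)
dim(sims)
```

Here crossC is n by m and sims is m by n_sims, matching the layout described for the objects returned to the user.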