georob fits a linear model with spatially correlated
errors to geostatistical data that are possibly contaminated by
independent outliers. The regression coefficients and the parameters of
the variogram model are estimated by robust or Gaussian restricted
maximum likelihood (REML) or by Gaussian maximum likelihood (ML).georob(formula, data, subset, weights, na.action, model = TRUE, x = FALSE, y = FALSE, contrasts = NULL, offset, locations, variogram.model = c("RMexp", "RMaskey", "RMbessel", "RMcauchy", "RMcircular", "RMcubic", "RMdagum", "RMdampedcos", "RMdewijsian", "RMfbm", "RMgauss", "RMgencauchy", "RMgenfbm", "RMgengneiting", "RMgneiting", "RMlgd", "RMmatern", "RMpenta", "RMqexp", "RMspheric", "RMstable", "RMwave", "RMwhittle"), param, fit.param = default.fit.param()[names(param)],
aniso = default.aniso(), fit.aniso = default.fit.aniso(), tuning.psi = 2, control = control.georob(), verbose = 0, ...)SpatialPointsDataFrame, list or environment (or another
object coercible by as.data.frame to a data frame)
containing the variables in the model and the coordinates where the data
was recorded. If not found in data, the variables are taken from
environment(formula), typically the environment from which
georob is called.NAs. The default is set by the na.action
argument of options, and is na.fail if that is
unset. The factory-fresh default is na.omit.
Another possible value is NULL, no action. Value
na.exclude can be useful.TRUE the corresponding components
of the fit (the model frame, the model matrix, the response) are
returned. The model frame is augmented by the coordinates.contrasts.arg of
model.matrix.default.offset term can be included in the formula
instead or as well, and if both are specified their sum is used.RMmodel).param
are matched against the following names (see Details and
georobIntro for information about the parametrization of
variogram models):
variance: variance (sill $\sigma^2$) of the
auto-correlated component of the Gaussian random field
$B(s)$.
snugget: variance
(spatial nugget $\sigma^2_n$)
of the seemingly spatially uncorrelated component of
$B(s)$
(micro-scale spatial variation; default value snugget = 0).
nugget: variance (nugget $\tau^2$) of the
independent errors
$\epsilon(s)$.
scale: range parameter ($\alpha$) of the variogram.
RMmodel and
param.names).
default.fit.param that creates this vector) with the same
names as used for param, defining which parameters are adjusted
(TRUE) and which are kept fixed at their initial values
(FALSE) when fitting the model.default.aniso that creates this vector) for fitting
geometrically anisotropic variogram models. The names of aniso
are matched against the following names (see Details and
georobIntro for information about the parametrization of
variogram models):
f1: ratio $f_1$ of lengths of second and first
semi-principal axes of an ellipsoidal surface with constant
semivariance in $R^3$ (default f1 = 1).
f2: ratio $f_2$ of lengths of third and first
semi-principal axes of the semivariance ellipsoid (default f2 = 1).
omega: azimuth in degrees of the first semi-principal axis
of the semivariance ellipsoid (default omega = 90).
phi: 90 degrees minus altitude of the first semi-principal axis
of the semivariance ellipsoid (default phi = 90).
zeta: angle in degrees between the second semi-principal
axis and the direction of the line defined by the intersection
between the $x$-$y$-plane and the plane orthogonal to the
first semi-principal axis of the semivariance ellipsoid through the
origin (default zeta = 0).
default.fit.aniso that creates this vector) with the same
names as used for aniso, defining which parameters are adjusted
(TRUE) and which are kept fixed at their initial values
(FALSE) when fitting the model.georob. Use the function control.georob and see its
help page for the components of control.verbose = 0 largely
suppresses such messages and verbose = 4 asks for most verbose
output (see control arguments of nleqslv,
nlminb and optim and
control.georob for information how to fine tuning
diagnostic output generated by nleqslv, nlminb and
optim).object.
used internally for updating georob objects). georob representing a robust (or Gaussian) (RE)ML
fit of a spatial linear model. See
georobObject for the components of the fit.georob fits a spatial linear model by robust or Gaussian RE(ML)
(Künsch et al., 2011, Künsch
et al., in preparation). georobIntro describes the
employed model and briefly sketches the robust REML estimation and the
robust external-drift kriging method. Here, we describe further details
of georob.
Implemented variogram models
Currently, most basic variogram models provided by the
package RandomFields can be fitted by
georob (see argument variogram.model for a list of
implemented models). Some of these models have in addition to
variance, snugget, nugget and scale further
parameters. Initial values of these parameters (param) and
fitting flags (fit.param) must be passed to georob by the
same names as used by the functions RM... of the package
RandomFields (see RMmodel. Use the
function param.names to list additional parameters of a
given variogram.model.
Estimation of variance of micro-scale variation
Simultaneous estimation of the variance of the micro-scale variation
(snugget, $sigma^2_n$), which appears
as seemingly uncorrelated with a given sampling design, and of the
variance (nugget, $\tau^2$) of the independent errors
requires that for some locations
$s_i$ replicated observations are
available. Locations less or equal than zero.dist apart are
thereby considered as being coincient (see
control.georob).
Fitting intrinsic variogram models
The intrinsic variogram model RMfbm is overparametrized when
both the variance (plus possibly snugget) and the
scale are fitted. Therefore, to estimate the parameters of this
model scale must be kept fixed at an arbitray value by using
fit.param["scale"] = FALSE.
Fitting geometrically anisotropic variogram models
The subsection Model of georobIntro describes
how such models are parametrized and gives definitions the various
elements of aniso. Some additional remarks might be helpful:
scale ($\alpha$).
aniso ($f_1=1$, $f_2=1$)
define an isotropic variogram model.
param.bounds and RMmodel
for allowed ranges). georob uses three mechanisms to constrain
parameter estimates to permissible ranges:
variance, snugget, nugget), the range
scale and the anisotropy parameters f1 and
f2 are log-transformed before solving the estimating
equations or maximizing the restricted loglikelihood and this
warrants that the estimates are always positive (see
control.georob for controlling parameter
transformations).
nleqslv or optim if
the current trial values are invalid. These functions then
graciously update the trial values of the parameters and carry
their task on. However, it is clear that such a procedure likely
gets stuck at a point on the boundary of the parameter space and is
therefore just a workaround for avoiding runtime errors due to
invalid parameter values.
nlminb and
optim: If a spatial model is fitted non-robustly, then the
arguments lower, upper (and method of
optim) can be used to constrain the parameters
(see control.optim how to pass them to optim).
For optim one has to use the arguments method =
"L-BFGS-B", lower = l, upper = u, where
l and u are numeric vectors with the lower and upper
bounds of the transformed parameters in the order as they
appear in c(variance, snugget, nugget, scale,
...)[fit.param], aniso[fit.aniso]), where ... are
additional parameters of isotropic variogram models (use
param.names(variogram.model) to display the names and the
order of the additional parameters for variogram.model).
georob by the
argument bhat of control.georob.
lmrob, rq or
non-robustly by lm (see argument
initial.fixef of control.georob).
Finding the roots of the robustified estimating equations of the
variogram and anisotropy parameters is more sensitive to a good choice
of initial values than maximizing the Gaussian (restricted)
loglikelihood with respect to the same parameters. If the initial
values for param and aniso are not sufficiently close to
the roots of the system of nonlinear equations, then
nleqslv may fail to find them.
Setting
initial.param = TRUE allows one to find initial values that are
often sufficiently close to the roots so that
nleqslv converges. This is achieved by:
lmrob irrespective of the choice for
initial.fixef (see control.georob).
lmrob fit, satisfying
$\psi_c(r_i/s)/(r_i/s)
<=min.rweight$, are="" discarded="" (see="" control.georob).
=min.rweight$,>optim.
param, aniso) are used as initial estimates for the
subsequent robust fit of the model by nleqslv.
Note that for step 3 above, initial values of param and
aniso must be provided to georob.
Estimating variance parameters by Gaussian (RE)ML
Unlike robust REML, where robustified estimating equations are solved
for the variance parameters nugget ($\tau^2$),
variance ($\sigma^2$), and possibly snugget
($\sigma_n^2$), for Gaussian (RE)ML the
variances can be reparametrized to
georob maximizes then a (restricted) profile
loglikelihood that depends only on $\eta$, $\xi$,
$\alpha$, ..., and $\sigma_Z^2$ is estimated by an explicit
expression that depends on these parameters (e.g. Diggle and Ribeiro,
2006, p. 113). This is ususally more efficient than maximizing the
(restricted) loglikelihood with respect to the original variance
parameters $\tau^2$, $\sigma_n^2$ and
$\sigma^2$. georob chooses the parametrization
automatically but the user can control it by the argument
reparam of the function control.georob.
Künsch, H. R., Papritz, A., Schwierz, C. and Stahel, W. A. (in preparation) Robust Geostatistics.
Künsch, H. R., Papritz, A., Schwierz, C. and Stahel, W. A. (2011) Robust estimation of the external drift and the variogram of spatial data. Proceedings of the ISI 58th World Statistics Congress of the International Statistical Institute. http://e-collection.library.ethz.ch/eserv/eth:7080/eth-7080-01.pdf
georobIntro for a description of the model and a brief summary of the algorithms;
georobObject for a description of the class georob;
plot.georob for display of RE(ML) variogram estimates;
control.georob for controlling the behaviour of georob;
cv.georob for assessing the goodness of a fit by georob;
predict.georob for computing robust kriging predictions; and finally
georobModelBuilding for stepwise building models of class georob;
georobMethods for further methods for the class georob.
## Not run:
# ################
# ## meuse data ##
# ################
# data(meuse)
#
# ## Gaussian REML fit
# r.logzn.reml <- georob(log(zinc) ~ sqrt(dist), data = meuse, locations = ~ x + y,
# variogram.model = "RMexp",
# param = c(variance = 0.15, nugget = 0.05, scale = 200),
# tuning.psi = 1000)
# summary(r.logzn.reml, correlation = TRUE)
#
# ## robust REML fit
# r.logzn.rob <- update(r.logzn.reml, tuning.psi = 1)
#
# summary(r.logzn.rob, correlation = TRUE)
#
# plot(r.logzn.reml, lag.dist.def = seq(0, 2000, by = 100))
# lines(r.logzn.rob, col = "red")
#
#
# ###################
# ## wolfcamp data ##
# ###################
# data(wolfcamp, package = "geoR")
# d.wolfcamp <- data.frame(x = wolfcamp[[1]][,1], y = wolfcamp[[1]][,2],
# pressure = wolfcamp[[2]])
#
# ## fitting isotropic IRF(0) model
#
# r.irf0.iso <- georob(pressure ~ 1, data = d.wolfcamp, locations = ~ x + y,
# variogram.model = "RMfbm",
# param = c(variance = 10, nugget = 1500, scale = 1, alpha = 1.5),
# fit.param = default.fit.param(scale = FALSE, alpha = TRUE),
# tuning.psi = 1000)
#
# summary(r.irf0.iso)
#
# ## fitting anisotropic IRF(0) model
#
# r.irf0.aniso <- georob(pressure ~ 1, data = d.wolfcamp, locations = ~ x + y,
# variogram.model = "RMfbm",
# param = c(variance = 5.9, nugget = 1450, scale = 1, alpha = 1),
# fit.param = default.fit.param(scale = FALSE, alpha = TRUE),
# aniso = default.aniso(f1 = 0.51, omega = 148.),
# fit.aniso = default.fit.aniso(f1 = TRUE, omega = TRUE),
# tuning.psi = 1000)
# summary(r.irf0.aniso)
#
# plot(r.irf0.iso, lag.dist.def = seq(0, 200, by = 7.5))
# plot(r.irf0.aniso, lag.dist.def = seq(0, 200, by = 7.5),
# xy.angle.def = c(0, 22.5, 67.5, 112.5, 157.5, 180.),
# add = TRUE, col = 2:5)
#
# pchisq(2*(r.irf0.aniso[["loglik"]] - r.irf0.iso[["loglik"]]), 2, lower = FALSE)## End(Not run)
Run the code above in your browser using DataLab