Empirical Bayes estimation for SGLMM
ebsglmm(formula, family = c("gaussian", "binomial", "poisson", "Gamma",
        "GEV.binomial", "GEVD.binomial", "Wallace.binomial"), data, weights, subset,
        atsample, parskel, paroptim, corrfcn = c("matern", "spherical",
        "powerexponential"), Nout, Nthin = 1, Nbi = 0, Npro, Nprt = 1,
        Nprb = 0, betm0, betQ0, ssqdf, ssqsc, zstart, dispersion = 1,
        bfsize1 = 0.8, reference = 1, bfmethod = c("RL", "MW"),
        transf = FALSE, useCV = TRUE, longlat = FALSE, control = list(),
        verbose = TRUE)
A representation of the model in the form response ~ terms. The response must be set to NA at the prediction locations (see the example in mcsglmm for how to do this using stackdata). At the observed locations the response is assumed to be a total of replicated measurements. The number of replications is supplied through the argument weights.
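As a rough illustration of the layout described above, the following base R sketch appends prediction locations to the observed data with the response set to NA. It does not use the package's stackdata helper. The data frames obs and pred, their column names, and the choice to leave the weights as NA at the prediction rows are assumptions made for the example.

```r
# Hypothetical observed data: counts y out of n trials at three locations
obs <- data.frame(y = c(3, 5, 2), n = c(10, 12, 8),
                  x1 = c(0, 1, 2), x2 = c(0, 1, 0))
# Hypothetical prediction locations: nothing is observed there, so the
# response (and, as an assumption, the weights) are left as NA
pred <- data.frame(y = NA_real_, n = NA_real_,
                   x1 = c(0.5, 1.5), x2 = c(0.5, 0.5))
# Stack observed and prediction rows into a single data frame
stacked <- rbind(obs, pred)
# Rows with a missing response mark the prediction locations
is_pred <- is.na(stacked$y)
```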
The distribution of the data. The "GEV.binomial" family is the binomial family with the GEV link (see Details).
An optional data frame containing the variables in the model.
An optional vector of weights. Number of replicated samples for Gaussian and gamma, number of trials for binomial, time length for Poisson.
An optional vector specifying a subset of observations to be used in the fitting process.
A formula in the form ~ x1 + x2 + ... + xd with the coordinates of the sampled locations.
A data frame with the components "linkp", "phi", "omg", and "kappa", corresponding to the link function, the spatial range, the relative nugget, and the spatial smoothness parameters. The latter can be omitted if not used in the correlation function. Let k denote the number of rows. Then, k different MCMC samples will be taken from the models with parameters fixed at those values. For a square grid the output from the function expand.grid can be used here.
A named list with the components "linkp", "phi", "omg", "kappa". Each component must be numeric of length 1, 2, or 3, with elements in increasing order; for the binomial family, linkp may also be the character string "logit" or "probit". If the length is 1, the corresponding parameter is fixed at that value. If 2, the two numbers are the lower and upper bounds for the optimisation of that parameter (infinities are allowed). If 3, they are the lower bound, starting value, and upper bound for the estimation of that parameter.
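The length rules above can be sketched with a made-up specification. The list paroptim_example and the labels produced by the classification step are purely illustrative and are not part of the package.

```r
# Hypothetical specification, for illustration only
paroptim_example <- list(
  linkp = "logit",           # binomial family: link fixed at a named link
  phi   = c(100, 150, 200),  # length 3: lower bound, starting value, upper bound
  omg   = c(0, 2),           # length 2: lower and upper bounds
  kappa = 0.5                # length 1: fixed at this value
)
# Label each component by its length, following the rules described above
role <- vapply(paroptim_example, function(p) {
  switch(length(p), "fixed", "bounds", "bounds with start")
}, character(1))
```

Here role is a named character vector with one label per component.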
Spatial correlation function. See Details.
A scalar or vector of size k. Number of MCMC samples to take for each run of the MCMC algorithm for the estimation of the Bayes factors. See argument parskel.
A scalar or vector of size k. The thinning of the MCMC algorithm for the estimation of the Bayes factors.
A scalar or vector of size k. The burn-in of the MCMC algorithm for the estimation of the Bayes factors.
A scalar. The number of Gibbs samples to take for estimation of the conjugate parameters and for prediction at the unsampled locations while the other parameters are fixed at their empirical Bayes estimates.
The thinning of the Gibbs algorithm for the estimation of the conjugate parameters and for prediction.
The burn-in of the Gibbs algorithm for the estimation of the conjugate parameters and for prediction.
Prior mean for beta (a vector or scalar).
Prior standardised precision (inverse variance) matrix. Can be a scalar, vector or matrix. The first two imply a diagonal with those elements. Set this to 0 to indicate a flat improper prior.
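A minimal sketch of how a scalar or vector could be expanded into a diagonal precision matrix, as described above. The helper as_precision and its argument p (the number of regression coefficients) are hypothetical and only illustrate the convention; they are not the package's internal code.

```r
# Hypothetical helper: turn a scalar, vector or matrix into a p x p
# prior precision matrix for the regression coefficients
as_precision <- function(Q0, p) {
  if (is.matrix(Q0)) return(Q0)           # a matrix is used as given
  stopifnot(length(Q0) %in% c(1, p))      # scalar or one value per coefficient
  diag(rep_len(Q0, p), nrow = p)          # scalar/vector imply a diagonal matrix
}

Q_scalar <- as_precision(0.01, p = 3)     # 0.01 on the diagonal
Q_vector <- as_precision(c(1, 2), p = 2)  # diagonal elements 1 and 2
Q_flat   <- as_precision(0, p = 2)        # all zeros: flat improper prior
```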
Degrees of freedom for the scaled inverse chi-square prior for the partial sill parameter.
Scale for the scaled inverse chi-square prior for the partial sill parameter.
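To get a feel for what a given choice of degrees of freedom and scale implies, one can simulate from a scaled inverse chi-square distribution. The sketch below assumes the common parameterisation in which a draw equals df * scale / X with X chi-square on df degrees of freedom; whether the package uses exactly this parameterisation is an assumption. The helper r_scaled_inv_chisq is hypothetical.

```r
set.seed(1)
# Hypothetical sampler under the assumed parameterisation
r_scaled_inv_chisq <- function(n, df, scale) df * scale / rchisq(n, df)

draws <- r_scaled_inv_chisq(1e5, df = 10, scale = 2)
# Under this parameterisation the mean is df * scale / (df - 2) for df > 2
prior_mean <- 10 * 2 / (10 - 2)
```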
Optional starting value for the MCMC for the GRF. This can be either a scalar, a vector of size n where n is the number of sampled locations, or a matrix with dimensions n by k where k is the number of the skeleton points in parskel.
The fixed dispersion parameter.
A scalar or vector of length k with all integer values or all values in (0, 1]. How many samples (or what proportion of the sample) to use for estimating the Bayes factors at the first stage. The remaining sample will be used for estimating the Bayes factors in the second stage. Setting it to 1 will perform only the first stage.
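The split between the two stages can be sketched as follows. The helper stage_sizes is hypothetical; in particular, treating values up to 1 as proportions and rounding to the nearest integer are assumptions, not documented behaviour.

```r
# Hypothetical helper: number of samples used in each stage
stage_sizes <- function(Nout, bfsize1) {
  n1 <- ifelse(bfsize1 <= 1,
               round(Nout * bfsize1),  # value in (0, 1]: proportion of the sample
               pmin(bfsize1, Nout))    # larger value: number of samples
  cbind(stage1 = n1, stage2 = Nout - n1)
}

split_default <- stage_sizes(100, 0.8)  # the default proportion 0.8
split_all     <- stage_sizes(100, 1)    # only the first stage is performed
split_count   <- stage_sizes(100, 60)   # an explicit number of samples
```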
An integer between 1 and k. Which model to be used as a reference, i.e. the one that goes in the denominator of the Bayes factors.
Which method to use to calculate the Bayes factors: Reverse logistic or Meng-Wong.
Whether to use the transformed sample mu for the computations. Otherwise it uses z.
Whether to use control variates for finer corrections.
How to compute the distance between locations. If FALSE, Euclidean distance is used; if TRUE, great-circle distance. See spDists.
A list of control parameters for the optimisation. See optim.
Whether to print messages when completing each stage on screen.
A list with components
parest
The parameter estimates
skeleton
The skeleton points used with the corresponding logarithm of the Bayes factors at those points.
optim
The output from the optim function.
mcmcsample
The MCMC samples for the remaining parameters and the random field. These samples correspond to the Gibbs and Metropolis-Hastings samples after fixing the parameters estimated by empirical Bayes at their empirical Bayes estimates.
sys_time
The time taken to complete the MCMC sampling, calculation of the importance weights, the optimisation and the final MCMC sampling.
Currently the following spatial correlation functions are implemented. Below, \(h\) denotes the distance between locations, \(d\) is the dimensionality of the locations, \(\phi\) is the spatial range parameter and \(\kappa\) is an additional parameter. The correlation \(r(h)\) between locations a distance \(h\) apart is
matern
$$r(h) = \frac{1}{2^{\kappa-1}\Gamma(\kappa)}\left(\frac{h}{\phi}\right)^\kappa K_{\kappa}\left(\frac{h}{\phi}\right)$$
spherical
$$r(h) = \left\{ \begin{array}{ll} 1 - 1.5\frac{h}{\phi} + 0.5\left(\frac{h}{\phi}\right)^3 & \mbox{if } h < \phi \cr 0 & \mbox{otherwise} \end{array} \right.$$ Note that this is a valid correlation only for \(d \leq 3\).
powerexponential
$$r(h) = \exp\left\{-\left(\frac{h}{\phi}\right)^\kappa\right\} $$ Note that this is a valid correlation only for \(0 < \kappa \leq 2\).
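The three formulas can be transcribed directly into base R, which may help when checking how a choice of range or smoothness shapes the correlation. The function names below are hypothetical and this is a sketch of the formulas as written here, not the package's internal implementation.

```r
# Matern correlation; besselK is the modified Bessel function K in base R
matern_corr <- function(h, phi, kappa) {
  u <- h / phi
  r <- u^kappa * besselK(u, kappa) / (2^(kappa - 1) * gamma(kappa))
  r[h == 0] <- 1  # limiting value at zero distance (the formula gives 0 * Inf)
  r
}

# Spherical correlation; exactly zero at distances of phi or more
spherical_corr <- function(h, phi) {
  u <- h / phi
  ifelse(u < 1, 1 - 1.5 * u + 0.5 * u^3, 0)
}

# Power exponential correlation; requires 0 < kappa <= 2
powexp_corr <- function(h, phi, kappa) exp(-(h / phi)^kappa)

h <- c(0, 1, 2)
r_matern <- matern_corr(h, phi = 1.5, kappa = 0.5)
r_spher  <- spherical_corr(h, phi = 1.5)
r_powexp <- powexp_corr(h, phi = 1.5, kappa = 1)
```

With kappa = 0.5 the Matern correlation coincides with the exponential correlation exp(-h / phi), which is also the power exponential with kappa = 1.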
The GEV (Generalised Extreme Value) link is defined by $$\mu = 1 - \exp\{-\max(0, 1 + \nu x)^{\frac{1}{\nu}}\}$$ for any real \(\nu \neq 0\). In the limit \(\nu \to 0\) it reduces to the complementary log-log link, \(\mu = 1 - \exp\{-\exp(x)\}\).
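A short sketch of the mean function implied by the GEV link, written directly from the formula above. The function name gev_inv_link is hypothetical, and the case \(\nu = 0\) is handled explicitly as the complementary log-log limit.

```r
# Map the linear predictor x to the mean mu under the GEV link
gev_inv_link <- function(x, nu) {
  if (nu == 0) return(1 - exp(-exp(x)))   # complementary log-log limit
  1 - exp(-pmax(0, 1 + nu * x)^(1 / nu))  # general case
}

mu_pos   <- gev_inv_link(0.5, nu = 1)     # 1 - exp(-1.5)
mu_floor <- gev_inv_link(-2, nu = 1)      # 1 + nu * x < 0 with nu > 0 gives 0
mu_ceil  <- gev_inv_link(2, nu = -1)      # 1 + nu * x < 0 with nu < 0 gives 1
mu_near0 <- gev_inv_link(0.3, nu = 1e-6)  # close to the cloglog value
```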
Roy, V., Evangelou, E., and Zhu, Z. (2015). Efficient estimation and prediction for the Bayesian spatial generalized linear mixed model with flexible link functions. Biometrics. http://dx.doi.org/10.1111/biom.12371
# NOT RUN {
data(rhizoctonia)
### Define the model
corrf <- "spherical"
kappa <- 0
ssqdf <- 1
ssqsc <- 1
betm0 <- 0
betQ0 <- .01
### Skeleton points
philist <- c(100,140,180)
linkp <- "logit"
omglist <- c(0,.5,1)
parlist <- expand.grid(phi = philist, linkp = linkp, omg = omglist,
                       kappa = kappa)
paroptim <- list(linkp = linkp, phi = c(100, 200), omg = c(0, 2),
                 kappa = kappa)
### MCMC sizes
Nout <- Npro <- 100
Nthin <- Nprt <- 1
Nbi <- Nprb <- 0
est <- ebsglmm(Infected ~ 1, 'binomial', rhizoctonia, weights = Total,
               atsample = ~ Xcoord + Ycoord, parskel = parlist,
               paroptim = paroptim, corrfcn = corrf,
               Nout = Nout, Nthin = Nthin, Nbi = Nbi,
               Npro = Npro, Nprt = Nprt, Nprb = Nprb,
               betm0 = betm0, betQ0 = betQ0, ssqdf = ssqdf, ssqsc = ssqsc,
               dispersion = 1, useCV = TRUE)
# }