autologistic: Fit a centered autologistic model using maximum pseudolikelihood estimation or MCMC for Bayesian inference.

Description

Fit a centered autologistic model using maximum pseudolikelihood estimation or MCMC for Bayesian inference.

Usage

autologistic(formula, data, A, method = c("pl", "bayes"),
    optit = 1000, model = TRUE, x = FALSE, y = FALSE,
    type = c("SOCK", "PVM", "MPI", "NWS"), bootit = 1000,
    parallel = TRUE, nodes, trainit = 1e+05, tol = 0.01,
    minit = 10000, maxit = 1e+06, sigma = 1e+06,
    eta.max = 2)

Arguments

formula

an object of class formula: a symbolic description of the model to be fitted.

data

an optional data frame, list, or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are ta

A

the adjacency matrix for the underlying graph, which is assumed to be undirected and free of loops and parallel edges.

method

the method to use for inference. pl (the default) enables maximum pseudolikelihood estimation, and bayes enables Bayesian inference.

optit

the maximum number of iterations to be used by optim in obtaining the maximum pseudolikelihood estimate (MPLE) of $\theta$. Defaults to 1,000.

parallel

(PL) a boolean variable indicating whether to use parallel bootstrapping, which requires the snow package. Defaults to TRUE, in which case the number of nodes must be supplied.

nodes

(PL) the number of nodes to use for parallel bootstrapping.

type

(PL) the type of cluster to use for parallel bootstrapping. The available types are SOCK, PVM, MPI, and NWS. The default type is SOCK.

bootit

(PL) the desired size of the bootstrap sample. Defaults to 1,000.

trainit

(Bayes) the number of iterations to use for estimating the posterior covariance matrix. Defaults to 100,000.

tol

(Bayes) a tolerance. If all Monte Carlo standard errors are smaller than tol, no more samples are drawn from the posterior. Defaults to 0.01.

minit

(Bayes) the minimum sample size. This should be large enough to permit accurate estimation of Monte Carlo standard errors. Defaults to 10,000.

maxit

(Bayes) the maximum sample size. Sampling from the posterior terminates when all Monte Carlo standard errors are smaller than tol or when maxit samples have been drawn, whichever happens first. Defaults to 1,000,000.

sigma

(Bayes) a scalar or a $(p-1)$-vector providing the variance(s) of the spherical normal prior for $\beta$. Defaults to 1,000,000.

eta.max

(Bayes) the upper limit for $\eta$. Defaults to 2. The lower limit is 0.

model

a logical value indicating whether the model frame should be included as a component of the returned value.

x

a logical value indicating whether the model matrix used in the fitting process should be returned as a component of the returned value.

y

a logical value indicating whether the response vector used in the fitting process should be returned as a component of the returned value.
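The requirements on A (undirected, no loops, no parallel edges) amount to a symmetric 0/1 matrix with a zero diagonal. The following is a minimal base R sketch of building and checking such a matrix for a small square lattice. The helpers lattice_adjacency and is_valid_adjacency are hypothetical names introduced for illustration and are not part of the package.

```r
# Hypothetical helper (not part of the package): adjacency matrix for an
# m x n square lattice with first-order (rook) neighbours.
lattice_adjacency <- function(m, n = m) {
  N <- m * n
  A <- matrix(0, N, N)
  idx <- function(i, j) (j - 1) * m + i  # column-major site index
  for (i in seq_len(m)) {
    for (j in seq_len(n)) {
      if (i < m) A[idx(i, j), idx(i + 1, j)] <- 1  # neighbour below
      if (j < n) A[idx(i, j), idx(i, j + 1)] <- 1  # neighbour to the right
    }
  }
  A + t(A)  # symmetrise, since the graph is undirected
}

# Hypothetical check mirroring the stated assumptions on A.
is_valid_adjacency <- function(A) {
  is.matrix(A) && nrow(A) == ncol(A) &&
    isSymmetric(A) &&       # undirected
    all(diag(A) == 0) &&    # no loops
    all(A %in% c(0, 1))     # no parallel edges
}

A <- lattice_adjacency(3, 3)
is_valid_adjacency(A)
sum(A) / 2  # number of edges: 12 on a 3 x 3 lattice
```

Any symmetric binary matrix with a zero diagonal would pass the same check, so the lattice is only one possible choice of graph.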

Value

autologistic returns an object of class autologistic, which is a list containing the following components.

coefficients

the point estimate of $\theta$.

fitted.values

the fitted mean values, obtained by transforming the linear predictors by the inverse of the link function.

linear.predictors

the linear fit on the link scale.

residuals

the response residuals.

iter

the size of the bootstrap/posterior sample.

sample

an iter by $p$ matrix containing the bootstrap/posterior samples.

mcse

a $p$-vector of Monte Carlo standard errors.

V

(Bayes) the estimated posterior covariance matrix from the training run.

accept

(Bayes) the acceptance rate for the MCMC sampler.

y

if requested, the response vector used.

X

if requested, the model matrix.

model

if requested (the default), the model frame.

call

the matched call.

formula

the formula supplied.

method

the method used for inference.

convergence

an integer code. The code has value 0 if optim succeeded in optimizing the pseudolikelihood. Possible error codes are 1 and 10. The former indicates that the iteration limit was reached before optimization completed. The latter indicates that the Nelder-Mead simplex degenerated.

message

a character string accompanying a convergence code of 1 or 10.

terms

the terms object used.

data

the data argument.

xlevels

(where relevant) a record of the levels of the factors used in fitting.
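The mcse component and the arguments tol, minit, and maxit interact through the stopping rule for the Bayesian sampler. As a rough illustration, the sketch below computes batch-means Monte Carlo standard errors from a sample matrix and applies that rule. The batch-means estimator is an assumption (this page does not state which estimator the function uses), and batch_means_mcse and should_stop are hypothetical helpers.

```r
# Batch-means Monte Carlo standard error for each column of a matrix of draws.
# Assumption: a common estimator, not necessarily the one used by autologistic.
batch_means_mcse <- function(draws) {
  draws <- as.matrix(draws)
  n <- nrow(draws)
  b <- floor(sqrt(n))  # batch size
  a <- floor(n / b)    # number of batches
  apply(draws, 2, function(col) {
    batches <- matrix(col[seq_len(a * b)], nrow = b)  # one batch per column
    sqrt(var(colMeans(batches)) / a)
  })
}

# Hypothetical stopping rule: never stop before minit draws, always stop at
# maxit draws, and otherwise stop once every MCSE is below tol.
should_stop <- function(draws, tol = 0.01, minit = 10000, maxit = 1e6) {
  n <- nrow(draws)
  if (n < minit) return(FALSE)
  if (n >= maxit) return(TRUE)
  all(batch_means_mcse(draws) < tol)
}

set.seed(1)
draws <- matrix(rnorm(2 * 20000), ncol = 2)  # stand-in for posterior draws
should_stop(draws[1:5000, ])  # FALSE, because fewer than minit draws
should_stop(draws)            # MCSE is roughly 1 / sqrt(20000), about 0.007
```

Independent draws are used here only to keep the example short. Correlated MCMC output would typically give larger standard errors and therefore a longer run.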

Details

This function fits the centered autologistic model of Caragea and Kaiser (2009) using maximum pseudolikelihood estimation or MCMC for Bayesian inference. The joint distribution for the centered autologistic model is

$$\pi(Z\mid\theta)=c(\theta)^{-1}\exp\left(Z^\prime X\beta - \eta Z^\prime A\mu + \frac{\eta}{2}Z^\prime AZ\right),$$

where $\theta = (\beta^\prime, \eta)^\prime$ is the parameter vector, $c(\theta)$ is an intractable normalizing function, $Z$ is the response vector, $X$ is the design matrix, $\beta$ is a $(p-1)$-vector of regression coefficients, $A$ is the adjacency matrix for the underlying graph, $\mu$ is the vector of independence expectations, and $\eta$ is the spatial dependence parameter.

Maximum pseudolikelihood estimation sidesteps the intractability of $c(\theta)$ by maximizing the product of the conditional likelihoods. Confidence intervals are then obtained using a parametric bootstrap. The bootstrap datasets are generated by perfect sampling (rautologistic). The bootstrap samples can be generated in parallel using the snow package.

Bayesian inference is obtained using the auxiliary variable algorithm of Moller et al. (2006). The auxiliary variables are generated by perfect sampling. The prior distributions are (1) zero-mean normal with independent coordinates for $\beta$, and (2) uniform for $\eta$. The variance(s) for the normal prior can be supplied by the user. The default is a common variance of 1,000,000. The uniform prior has support [0, 2] by default, but the right endpoint can be supplied (as eta.max) by the user.

The posterior covariance matrix of $\theta$ is estimated using samples obtained during a training run. The default number of iterations for the training run is 100,000, but this can be controlled by the user (via argument trainit). The estimated covariance matrix is then used as the proposal variance for a Metropolis-Hastings random walk. The proposal distribution is normal.

The posterior samples obtained during the second run are used for inference. The length of the run can be controlled by the user via arguments minit, maxit, and tol. The first determines the minimum number of iterations. If minit has been reached, the sampler will terminate when maxit is reached or all Monte Carlo standard errors are smaller than tol, whichever happens first.
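To make the pseudolikelihood idea concrete: under the joint distribution above, with $A$ symmetric and zero on the diagonal, the conditional log-odds at site $i$ are $x_i^\prime\beta + \eta\sum_j A_{ij}(Z_j - \mu_j)$, where $\mu_j$ is the inverse logit of $x_j^\prime\beta$. The following base R sketch maximizes the resulting product of conditional likelihoods with optim. It is an illustration under stated assumptions, not the package's implementation: neg_log_pl is a hypothetical helper, the design has no intercept, and the data come from a short Gibbs run rather than from perfect sampling.

```r
# Negative log pseudolikelihood for the centered autologistic model.
# theta = c(beta, eta); X is n x (p - 1); A is the n x n adjacency matrix.
neg_log_pl <- function(theta, X, A, Z) {
  p <- length(theta)
  beta <- theta[-p]
  eta <- theta[p]
  xb <- drop(X %*% beta)
  mu <- plogis(xb)                       # independence expectations
  lp <- xb + eta * drop(A %*% (Z - mu))  # conditional log-odds at each site
  -sum(Z * lp - log1p(exp(lp)))
}

set.seed(42)
m <- 15
coords <- expand.grid(i = seq_len(m), j = seq_len(m))
# Sites are neighbours when their Manhattan distance on the lattice is 1.
A <- unname((as.matrix(dist(coords, method = "manhattan")) == 1) * 1)
X <- cbind(coords$i / m, coords$j / m)

# Rough stand-in for rautologistic: a short Gibbs run (an assumption made for
# illustration; it is not a perfect sampler).
beta_true <- c(1, -1)
eta_true <- 0.5
mu <- plogis(drop(X %*% beta_true))
Z <- rbinom(nrow(X), 1, mu)
for (sweep in 1:200) {
  for (i in seq_along(Z)) {
    lp_i <- sum(X[i, ] * beta_true) + eta_true * sum(A[i, ] * (Z - mu))
    Z[i] <- rbinom(1, 1, plogis(lp_i))
  }
}

# Nelder-Mead is optim's default method; maxit plays the role of optit.
fit <- optim(c(0, 0, 0), neg_log_pl, X = X, A = A, Z = Z,
             control = list(maxit = 1000))
fit$convergence  # 0 on success; 1 and 10 are the error codes described above
fit$par          # estimates of (beta_1, beta_2, eta)
```

With $\eta = 0$ the objective reduces to the ordinary logistic regression negative log-likelihood, which gives a quick sanity check on the helper.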

References

Caragea, P. and Kaiser, M. (2009) Autologistic models with interpretable parameters. Journal of Agricultural, Biological, and Environmental Statistics, 14(3), 281–300.

Hughes, J., Haran, M. and Caragea, P. C. (2011) Autologistic models for binary data on a lattice. Environmetrics, 22(7), 857–871.

Moller, J., Pettitt, A., Berthelsen, K. and Reeves, R. (2006) An efficient Markov chain Monte Carlo method for distributions with intractable normalising constants. Biometrika, 93(2), 451–458.