lagsarlm: Spatial simultaneous autoregressive lag model estimation

Description

Maximum likelihood estimation of spatial simultaneous autoregressive lag and mixed models of the form:

$$y = \rho W y + X \beta + \varepsilon$$

where $\rho$ is found first by optimize(), and $\beta$ and the other parameters are then found by generalized least squares (a one-dimensional search using optim() performs badly on some platforms). In the mixed model, the spatially lagged independent variables are added to X.
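The two-step procedure described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's internal code: the weights matrix W, the simulated data, and the helper name conc_loglik are all hypothetical, and the log-determinant is computed from the eigenvalues of W as in method="eigen".

```r
# Sketch only: simulated data and a hand-built weights matrix (assumptions).
set.seed(1)
n <- 25
W <- matrix(0, n, n)
for (i in seq_len(n - 1)) {          # zones on a line, each linked to its neighbours
  W[i, i + 1] <- 1
  W[i + 1, i] <- 1
}
W <- W / rowSums(W)                  # row-standardise
X <- cbind(1, rnorm(n))
y <- drop(solve(diag(n) - 0.4 * W, X %*% c(1, 2) + rnorm(n)))
Wy <- drop(W %*% y)

# Eigenvalues of W; Re() guards against a complex return type for a
# non-symmetric matrix whose eigenvalues are real.
lambda <- Re(eigen(W, only.values = TRUE)$values)

# Log likelihood concentrated in rho: beta and sigma^2 are profiled out
# by regressing (y - rho * Wy) on X.
conc_loglik <- function(rho) {
  e <- lm.fit(X, y - rho * Wy)$residuals
  sse <- sum(e^2)
  sum(log(1 - rho * lambda)) - (n / 2) * log(2 * pi * sse / n) - n / 2
}

# Step 1: one-dimensional search for rho
opt <- optimize(conc_loglik, interval = c(-0.99, 0.99), maximum = TRUE,
                tol = .Machine$double.eps^0.5)
rho_hat <- opt$maximum

# Step 2: coefficients given rho
beta_hat <- lm.fit(X, y - rho_hat * Wy)$coefficients
```

Because only rho enters the search, each evaluation costs one least squares fit plus a sum over the precomputed eigenvalues.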

Usage

lagsarlm(formula, data=list(), listw, na.action=na.fail,
  type="lag", method="eigen", quiet=TRUE,
  zero.policy=FALSE, interval = c(-1, 0.999), tol.solve=1.0e-10, 
  tol.opt=.Machine$double.eps^0.5)

Arguments

formula

a symbolic description of the model to be fit. The details of model specification are given for lm()

data

an optional data frame containing the variables in the model. By default the variables are taken from the environment from which the function is called.

listw

a listw object created for example by nb2listw

na.action

a function (default na.fail), can also be na.omit or na.exclude with consequences for residuals and fitted values - in these cases the weights list will be subsetted to remove NAs in the data. It may be necessary to

type

default "lag", may be set to "mixed"; when "mixed", the lagged intercept is dropped for spatial weights style "W", that is row-standardised weights, but otherwise included
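Why the lagged intercept is dropped for row-standardised weights can be seen in a small base R sketch. The four-zone weights and the variable x1 are made up for illustration: with style "W" every row of the weights matrix sums to one, so the lagged intercept reproduces the intercept column exactly, whereas with binary weights it holds the neighbour counts.

```r
# Hypothetical 4-zone example; B, W and X are illustrative assumptions.
B <- matrix(c(0, 1, 0, 0,
              1, 0, 1, 0,
              0, 1, 0, 1,
              0, 0, 1, 0), nrow = 4, byrow = TRUE)  # binary weights (style "B")
W <- B / rowSums(B)                                  # row-standardised (style "W")
X <- cbind(Intercept = 1, x1 = c(2, 4, 6, 9))

WX_W <- W %*% X   # lagged intercept is a column of ones
WX_B <- B %*% X   # lagged intercept is the number of neighbours

# Style "W": keep only the lagged x1, since the lagged intercept is redundant
X_mixed_W <- cbind(X, lag.x1 = WX_W[, "x1"])

# Style "B": the lagged intercept carries information and is kept
X_mixed_B <- cbind(X, lag.Intercept = WX_B[, "Intercept"],
                   lag.x1 = WX_B[, "x1"])

qr(cbind(X, WX_W))$rank   # 3 of 4 columns are linearly independent
qr(X_mixed_B)$rank        # all 4 columns are linearly independent
```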

method

"eigen" (default) - the Jacobian is computed as the product of (1 - rho*eigenvalue) using eigenw, and "spam" or "Matrix" for strictly symmetric weights lists of styles "B" and "C", or made symmetric by similarity (Ord, 1975, Appendix C) if p

quiet

default=TRUE; if FALSE, reports function values during optimization.

zero.policy

if TRUE assign zero to the lagged value of zones without neighbours, if FALSE (default) assign NA - causing lagsarlm() to terminate with an error
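The effect of zero.policy on the lagged values can be mimicked in base R. The sketch below uses a hypothetical helper, lag_mean(), and a made-up neighbour list in which zone 4 has no neighbours; it only illustrates the rule described above and is not the package's implementation.

```r
# Hypothetical neighbour list: zone 4 is an isolate.
nbrs <- list(2L, c(1L, 3L), 2L, integer(0))
y <- c(10, 12, 9, 20)

# Mean of neighbouring values, mirroring row-standardised weights.
lag_mean <- function(y, nbrs, zero.policy = FALSE) {
  vapply(nbrs, function(idx) {
    if (length(idx) == 0L) {
      if (zero.policy) 0 else NA_real_   # the choice zero.policy controls
    } else {
      mean(y[idx])
    }
  }, numeric(1))
}

lag_mean(y, nbrs)                       # NA for zone 4; a model fit would stop here
lag_mean(y, nbrs, zero.policy = TRUE)   # zone 4 gets a lag of zero
```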

interval

search interval for the autoregressive parameter when not using method="eigen"; the default is c(-1, 0.999), as given in the usage above; method="Matrix" will attempt to search for an appropriate interval
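When the eigenvalues of the weights matrix are available, a common way to bound the search (assumed here for illustration, not stated in the text above) is to use the reciprocals of the smallest and largest eigenvalues, since every factor (1 - rho*eigenvalue) of the Jacobian must stay positive. A base R sketch with a made-up binary weights matrix B:

```r
# Hypothetical symmetric binary weights for four zones on a line.
B <- matrix(c(0, 1, 0, 0,
              1, 0, 1, 0,
              0, 1, 0, 1,
              0, 0, 1, 0), nrow = 4, byrow = TRUE)

lambda <- eigen(B, symmetric = TRUE, only.values = TRUE)$values
bounds <- 1 / range(lambda)   # lower and upper limits for rho

# Inside the bounds every Jacobian factor is positive ...
all(1 - 0.5 * lambda > 0)
# ... outside them at least one factor is negative, so its log is undefined
any(1 - 0.7 * lambda < 0)
```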

tol.solve

the tolerance for detecting linear dependencies in the columns of matrices to be inverted - passed to solve() (default=1.0e-10). This may be used if necessary to extract coefficient standard errors (for instance lowering to 1e-12), but errors

tol.opt

the desired accuracy of the optimization - passed to optimize() (default=square root of double precision machine tolerance)

Value

A list object of class sarlm

type

"lag" or "mixed"

rho

simultaneous autoregressive lag coefficient

coefficients

GLS coefficient estimates

rest.se

asymptotic standard errors if ase=TRUE

LL

log likelihood value at computed optimum

s2

GLS residual variance

SSE

sum of squared GLS errors

parameters

number of parameters estimated

lm.model

the lm object returned when estimating for $\rho=0$

method

the method used to calculate the Jacobian

call

the call used to create this object

residuals

GLS residuals

lm.target

the lm object returned for the GLS fit

fitted.values

Difference between residuals and response variable

se.fit

Not used yet

formula

model formula

ase

TRUE if method="eigen"

LLs

if ase=FALSE (for method="spam" or "Matrix"), the log likelihood values of models estimated dropping each of the independent variables in turn, used in the summary function as a substitute for variable coefficient significance tests

rho.se

if ase=TRUE, the asymptotic standard error of $\rho$

LMtest

if ase=TRUE, the Lagrange Multiplier test for the absence of spatial autocorrelation in the lag model residuals

zero.policy

zero.policy for this model

na.action

(possibly) named vector of excluded or omitted observations if non-default na.action argument used

The internal sar.lag.mixed.* functions return the value of the log likelihood function at $\rho$.

Details

The asymptotic standard error of $\rho$ is only computed when method="eigen", because the full matrix operations involved would be costly for the large n typically associated with the choice of method="spam" or "Matrix". The same applies to the coefficient covariance matrix. Taken as the asymptotic matrix from the literature, it is typically badly scaled: the elements involving $\rho$ are very small, while other parts of the matrix can be very large (often differing by many orders of magnitude). It is often necessary to set the tol.solve argument to a smaller value than the default, or to centre the RHS variables or reduce their range.
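The scaling problem can be reproduced with base R's solve(), which is where tol.solve is passed. The matrix A below is a made-up stand-in for a badly scaled matrix with one very small and one very large diagonal element; it is not an actual information matrix from a fitted model.

```r
# Hypothetical badly scaled symmetric matrix (illustration only).
A <- matrix(c(1e-6, 1e-4,
              1e-4, 1e6), nrow = 2, byrow = TRUE)

# With a tolerance of 1e-10 the matrix is treated as singular ...
strict <- tryCatch(solve(A, tol = 1e-10),
                   error = function(e) conditionMessage(e))

# ... while a smaller tolerance lets the inversion go through.
loose <- solve(A, tol = 1e-14)

# Rescaling rows and columns to unit diagonal removes the problem,
# which is the same idea as bringing the variables to a similar range.
d <- 1 / sqrt(diag(A))
A_scaled <- A * outer(d, d)
A_inv <- solve(A_scaled) * outer(d, d)   # inverse of A recovered from the scaled system
```

Here strict holds the error message from solve(), and A_inv agrees with loose up to rounding error.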

Note that the fitted() function for the output object assumes that the response variable may be reconstructed as the sum of the trend, the signal, and the noise (residuals). Since the values of the response variable are known, their spatial lags are used to calculate signal components (Cressie 1993, p. 564). This differs from other software, including GeoDa, which does not use knowledge of the response variable in making predictions for the fitting data.
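The decomposition into trend, signal, and noise can be written out in base R. In this sketch the weights, data, and parameter values are simulated, and the true values of rho and beta stand in for estimates; pred_reduced is a hypothetical name for a prediction that does not use the observed response.

```r
# Simulated example (assumption): six zones on a ring, row-standardised weights.
set.seed(42)
n <- 6
W <- matrix(0, n, n)
for (i in seq_len(n)) {
  W[i, c(i %% n + 1, (i - 2) %% n + 1)] <- 0.5   # next and previous zone
}
X <- cbind(1, rnorm(n))
beta <- c(1, 0.5)
rho <- 0.3
y <- drop(solve(diag(n) - rho * W, X %*% beta + rnorm(n, sd = 0.1)))

trend  <- drop(X %*% beta)        # X beta
signal <- rho * drop(W %*% y)     # uses the observed spatial lag of y
fitted_lag <- trend + signal
noise  <- y - fitted_lag          # residuals

# Alternative that ignores the observed response when predicting
pred_reduced <- drop(solve(diag(n) - rho * W, trend))

all.equal(trend + signal + noise, y)   # the response is reconstructed exactly
```

The two sets of fitted values differ because the first conditions on the known neighbouring responses and the second does not.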

References

Cliff, A. D., Ord, J. K. 1981 Spatial processes. Pion.

Ord, J. K. 1975 Estimation methods for models of spatial interaction. Journal of the American Statistical Association, 70, 120-126.

Anselin, L. 1988 Spatial econometrics: methods and models. Kluwer, Dordrecht.

Anselin, L. 1995 SpaceStat, a software program for the analysis of spatial data, version 1.80. Regional Research Institute, West Virginia University, Morgantown, WV (www.spacestat.com).

Anselin, L., Bera, A. K. 1998 Spatial dependence in linear regression models with an introduction to spatial econometrics. In: Ullah, A., Giles, D. E. A. (eds) Handbook of applied economic statistics. Marcel Dekker, New York, pp. 237-289.

Cressie, N. A. C. 1993 Statistics for spatial data. Wiley, New York.

Examples

# Assumes a version of spdep that provides lagsarlm() and the oldcol data
library(spdep)
data(oldcol)
# Lag model with row-standardised weights and the eigenvalue Jacobian
COL.lag.eig <- lagsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 nb2listw(COL.nb, style="W"), method="eigen", quiet=FALSE)
summary(COL.lag.eig, correlation=TRUE)
# Same model with method="Matrix", timed
system.time(COL.lag.M <- lagsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 nb2listw(COL.nb), method="Matrix", quiet=FALSE))
summary(COL.lag.M, correlation=TRUE)
# Same model with method="spam", timed; stored under its own name
system.time(COL.lag.spam <- lagsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 nb2listw(COL.nb), method="spam", quiet=FALSE))
summary(COL.lag.spam, correlation=TRUE)
# Lag model with binary weights
COL.lag.B <- lagsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 nb2listw(COL.nb, style="B"))
summary(COL.lag.B, correlation=TRUE)
# Mixed model with binary weights: the lagged intercept is included
COL.mixed.B <- lagsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 nb2listw(COL.nb, style="B"), type="mixed", tol.solve=1e-9)
summary(COL.mixed.B, correlation=TRUE)
# Mixed model with row-standardised weights: the lagged intercept is dropped
COL.mixed.W <- lagsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 nb2listw(COL.nb, style="W"), type="mixed")
summary(COL.mixed.W, correlation=TRUE)
# Missing responses: na.exclude subsets the weights list to match the data
NA.COL.OLD <- COL.OLD
NA.COL.OLD$CRIME[20:25] <- NA
COL.lag.NA <- lagsarlm(CRIME ~ INC + HOVAL, data=NA.COL.OLD,
 nb2listw(COL.nb), na.action=na.exclude, tol.opt=.Machine$double.eps^0.4)
COL.lag.NA$na.action
COL.lag.NA
resid(COL.lag.NA)
