lagsarlm
Spatial simultaneous autoregressive lag model estimation
The lagsarlm function provides maximum likelihood estimation of spatial simultaneous autoregressive lag and spatial Durbin (mixed) models of the form:
$$y = \rho W y + X \beta + \varepsilon$$
where $\rho$ is found by optimize() first, and $\beta$ and other parameters by generalized least squares subsequently (one-dimensional search using optim performs badly on some platforms). In the spatial Durbin (mixed) model, the spatially lagged independent variables are added to X. Note that interpretation of the fitted coefficients should use impact measures, because of the feedback loops induced by the data generation process for this model. With one of the sparse matrix methods, larger numbers of observations can be handled, but the interval= argument may need to be set when the weights are not row-standardised.
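The two-step scheme described above can be sketched in base R on simulated data. This is a minimal illustration under stated assumptions, not the lagsarlm internals: the grid, the weights, and the names `llik`, `rho_hat` and `beta_hat` are invented for the example, and ordinary least squares stands in for the generalized least squares step.

```r
# Toy data on a 5 x 5 grid (all names and values here are illustrative)
set.seed(42)
n <- 25
coords <- expand.grid(x = 1:5, y = 1:5)
# Rook contiguity: cells at Manhattan distance 1 are neighbours
B <- (as.matrix(dist(coords, method = "manhattan")) == 1) * 1
d <- rowSums(B)
W <- B / d                                  # row-standardised weights

X <- cbind(1, rnorm(n))                     # intercept plus one covariate
beta_true <- c(1, 2)
rho_true <- 0.5
# Simulate from the reduced form y = (I - rho W)^{-1} (X beta + eps)
y <- drop(solve(diag(n) - rho_true * W,
                X %*% beta_true + rnorm(n, sd = 0.5)))
Wy <- drop(W %*% y)

# Eigenvalues of W via the symmetric matrix D^{-1/2} B D^{-1/2},
# which is similar to W and so has the same (real) eigenvalues
ev <- eigen(B / sqrt(outer(d, d)), symmetric = TRUE,
            only.values = TRUE)$values

# Concentrated log-likelihood in rho (additive constants dropped)
e0 <- lm.fit(X, y)$residuals                # residuals of y on X
e1 <- lm.fit(X, Wy)$residuals               # residuals of Wy on X
llik <- function(rho) {
  sse <- sum((e0 - rho * e1)^2)
  sum(log(1 - rho * ev)) - (n / 2) * log(sse / n)
}

# Step 1: one-dimensional search for rho, kept just inside 1/range(ev)
opt <- optimize(llik, interval = 0.999 / range(ev), maximum = TRUE)
rho_hat <- opt$maximum

# Step 2: given rho, the remaining coefficients follow by least squares
beta_hat <- lm.fit(X, y - rho_hat * Wy)$coefficients
# For a Durbin (mixed) variant, X would become cbind(X, W %*% X[, -1])
```

The search interval is derived from the eigenvalues of W, which is why the interval= argument matters when the weights are not row-standardised and the eigenvalue range is no longer bounded by one in absolute value.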
The spBreg_lag function is an early-release version of the Matlab Spatial Econometrics Toolbox function sar_g.m, using drawing by inversion, and not accommodating heteroskedastic disturbances.
 Keywords
 spatial
Usage
lagsarlm(formula, data = list(), listw,
 na.action, type="lag", method="eigen", quiet=NULL,
 zero.policy=NULL, interval=NULL, tol.solve=1.0e-10, trs=NULL,
 control=list())
spBreg_lag(formula, data = list(), listw, na.action, type="lag", zero.policy=NULL, control=list())
Arguments
 formula
 a symbolic description of the model to be fit. The details of model specification are given for lm()
 data
 an optional data frame containing the variables in the model. By default the variables are taken from the environment from which the function is called.
 listw
 a listw object created for example by nb2listw
 na.action
 a function (default options("na.action")), can also be na.omit or na.exclude, with consequences for residuals and fitted values; in these cases the weights list will be subsetted to remove NAs in the data. It may be necessary to set zero.policy to TRUE because this subsetting may create no-neighbour observations. Note that only weights lists created without using the glist argument to nb2listw may be subsetted.
 type
 default "lag", may be set to "mixed"; when "mixed", the lagged intercept is dropped for spatial weights style "W", that is row-standardised weights, but otherwise included; "Durbin" may be used instead of "mixed"
 method
 "eigen" (default): the Jacobian is computed as the product of (1 - rho*eigenvalue) using eigenw; "spam" or "Matrix_J" for strictly symmetric weights lists of styles "B" and "C", or made symmetric by similarity (Ord, 1975, Appendix C) if possible for styles "W" and "S", using code from the spam or Matrix packages to calculate the determinant; "Matrix" and "spam_update" provide updating Cholesky decomposition methods; "LU" provides an alternative sparse matrix decomposition approach. In addition, there are "Chebyshev" and Monte Carlo "MC" approximate log-determinant methods; the Smirnov/Anselin (2009) trace approximation is available as "moments". Three methods, "SE_classic", "SE_whichMin", and "SE_interp", are provided experimentally, the first to attempt to emulate the behaviour of Spatial Econometrics toolbox ML fitting functions. All use grids of log determinant values, and the latter two attempt to ameliorate some features of "SE_classic".
 quiet
 default NULL, use !verbose global option value; if FALSE, reports function values during optimization.
 zero.policy
 default NULL, use global option value; if TRUE assign zero to the lagged value of zones without neighbours, if FALSE (default) assign NA, causing lagsarlm() to terminate with an error
 interval
 default is NULL, search interval for autoregressive parameter
 tol.solve
 the tolerance for detecting linear dependencies in the columns of matrices to be inverted, passed to solve() (default=1.0e-10). This may be used if necessary to extract coefficient standard errors (for instance lowering to 1e-12), but errors in solve() may constitute indications of poorly scaled variables: if the variables have scales differing much from the autoregressive coefficient, the values in this matrix may be very different in scale, and inverting such a matrix is analytically possible by definition, but numerically unstable; rescaling the RHS variables alleviates this better than setting tol.solve to a very small value
 trs
 default NULL, if given, a vector of powered spatial weights matrix traces output by trW; when given, insert the asymptotic analytical values into the numerical Hessian instead of the approximated values; may be used to get around some problems raised when the numerical Hessian is poorly conditioned, generating NaNs in subsequent operations; the use of trs is recommended
 control
 list of extra control arguments; see section below
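The Jacobian term used by method="eigen" can be checked numerically in base R. The sketch below is illustrative only: the four-unit chain and the value of `rho` are assumptions, and the names `ldet_eigen` and `ldet_direct` are invented for the example. It compares the sum of log(1 - rho*eigenvalue) with the log-determinant of (I - rho W) computed directly.

```r
# Row-standardised weights for a chain of four units (toy example)
W <- matrix(c(0,   1,   0,   0,
              0.5, 0,   0.5, 0,
              0,   0.5, 0,   0.5,
              0,   0,   1,   0), nrow = 4, byrow = TRUE)
rho <- 0.3                                   # assumed value, not estimated

# Eigenvalue route: log|I - rho W| = sum(log(1 - rho * eigenvalue))
ev <- Re(eigen(W, only.values = TRUE)$values)
ldet_eigen <- sum(log(1 - rho * ev))

# Direct route: log-determinant of the dense matrix
ldet_direct <- as.numeric(
  determinant(diag(4) - rho * W, logarithm = TRUE)$modulus)

isTRUE(all.equal(ldet_eigen, ldet_direct))
```

The eigenvalues are computed once and reused for every trial value of rho, which is what makes the eigenvalue route cheap inside a one-dimensional search; the sparse and approximate methods listed above address the cost of obtaining the log-determinant when n is large.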
Details
The asymptotic standard error of $\rho$ is only computed when method="eigen", because the full matrix operations involved would be costly for large n typically associated with the choice of method="spam" or "Matrix". The same applies to the coefficient covariance matrix. Taken as the asymptotic matrix from the literature, it is typically badly scaled, with the elements involving $\rho$ being very small, while other parts of the matrix can be very large (often many orders of magnitude in difference). It often happens that the tol.solve argument needs to be set to a smaller value than the default, or the RHS variables can be centred or reduced in range.
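The effect of scaling on inversion can be reproduced with solve() alone. In this sketch the matrix `V` is a hypothetical stand-in for a badly scaled covariance-type matrix, not output from lagsarlm; it shows the two remedies mentioned above, a smaller tolerance and rescaling.

```r
# A badly scaled diagonal matrix: reciprocal condition number about 1e-12
V <- diag(c(1e-6, 1e6))

# With the tolerance 1e-10, solve() refuses to invert it
strict <- tryCatch(solve(V, tol = 1e-10),
                   error = function(e) "singular")

# Remedy 1: lower the tolerance
loose <- solve(V, tol = 1e-14)

# Remedy 2: rescale to unit diagonal, invert, then undo the scaling
s <- 1 / sqrt(diag(V))
V_scaled <- V * outer(s, s)                 # S V S with S = diag(s)
V_inv <- solve(V_scaled, tol = 1e-10) * outer(s, s)

isTRUE(all.equal(V_inv, loose))
```

Rescaling leaves the tolerance check meaningful, whereas lowering the tolerance only silences it, which is consistent with the advice to rescale the RHS variables first.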
Versions of the package from 0.4-38 include numerical Hessian values where asymptotic standard errors are not available. This change has been introduced to permit the simulation of distributions for impact measures. The warnings made above with regard to variable scaling also apply in this case.
Note that the fitted() function for the output object assumes that the response variable may be reconstructed as the sum of the trend, the signal, and the noise (residuals). Since the values of the response variable are known, their spatial lags are used to calculate signal components (Cressie 1993, p. 564). This differs from other software, including GeoDa, which does not use knowledge of the response variable in making predictions for the fitting data.
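The decomposition into trend, signal and noise can be written out for a toy case. The values of `rho` and `beta` below are assumed for illustration rather than estimated, and the variable names are invented for the example.

```r
# Four units on a chain with row-standardised weights (illustrative values)
W <- matrix(c(0,   1,   0,   0,
              0.5, 0,   0.5, 0,
              0,   0.5, 0,   0.5,
              0,   0,   1,   0), nrow = 4, byrow = TRUE)
X <- cbind(1, c(2, 4, 6, 8))
y <- c(3, 5, 8, 9)
rho  <- 0.4                      # assumed coefficient, not a fitted value
beta <- c(0.5, 0.6)              # assumed coefficients, not fitted values

trend  <- drop(X %*% beta)       # X beta
signal <- rho * drop(W %*% y)    # rho W y, built from the observed response
fit    <- trend + signal         # fitted values in this decomposition
noise  <- y - fit                # residuals

# The response is recovered as trend + signal + noise
isTRUE(all.equal(trend + signal + noise, y))
```

Because the signal uses the observed spatial lag W y, these fitted values are only available where the response is known.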
Value

A list object of class sarlm.
The internal sar.lag.mixed.* functions return the value of the log likelihood function at $\rho$.
Control arguments
Extra Bayesian control arguments
References
Cliff, A. D., Ord, J. K. 1981 Spatial processes, Pion.
Ord, J. K. 1975 Estimation methods for models of spatial interaction, Journal of the American Statistical Association, 70, 120-126.
Anselin, L. 1988 Spatial econometrics: methods and models. (Dordrecht: Kluwer).
Anselin, L. 1995 SpaceStat, a software program for the analysis of spatial data, version 1.80. Regional Research Institute, West Virginia University, Morgantown, WV.
Anselin L, Bera AK (1998) Spatial dependence in linear regression models with an introduction to spatial econometrics. In: Ullah A, Giles DEA (eds) Handbook of applied economic statistics. Marcel Dekker, New York, pp. 237-289.
Cressie, N. A. C. 1993 Statistics for spatial data, Wiley, New York.
LeSage J and RK Pace (2009) Introduction to Spatial Econometrics. CRC Press, Boca Raton.
Roger Bivand, Gianfranco Piras (2015). Comparing Implementations of Estimation Methods for Spatial Econometrics. Journal of Statistical Software, 63(18), 1-36. http://www.jstatsoft.org/v63/i18/.
Bivand, R. S., Hauke, J., and Kossowski, T. (2013). Computing the Jacobian in Gaussian spatial autoregressive models: An illustrated comparison of available methods. Geographical Analysis, 45(2), 150-179.
See Also
lm, errorsarlm, summary.sarlm, eigenw, predict.sarlm, impacts.sarlm, residuals.sarlm, do_ldet
Examples
library(spdep)
data(oldcol)
# Lag model with the default eigenvalue-based Jacobian
COL.lag.eig <- lagsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 nb2listw(COL.nb, style="W"), method="eigen", quiet=FALSE)
summary(COL.lag.eig, correlation=TRUE)
COL.lag.eig$fdHess
COL.lag.eig$resvar
# Traces of powers of W, passed as trs= for the analytical Hessian terms
W <- as(nb2listw(COL.nb), "CsparseMatrix")
trMatc <- trW(W, type="mult")
COL.lag.eig1 <- lagsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 nb2listw(COL.nb, style="W"), control=list(fdHess=TRUE), trs=trMatc)
COL.lag.eig1$fdHess
system.time(COL.lag.M <- lagsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 nb2listw(COL.nb), method="Matrix", quiet=FALSE))
summary(COL.lag.M)
impacts(COL.lag.M, listw=nb2listw(COL.nb))
## Not run:
# system.time(COL.lag.sp <- lagsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
#  nb2listw(COL.nb), method="spam", quiet=FALSE))
# summary(COL.lag.sp)
# ## End(Not run)
# Binary weights, lag and Durbin (mixed) variants
COL.lag.B <- lagsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 nb2listw(COL.nb, style="B"))
summary(COL.lag.B, correlation=TRUE)
COL.mixed.B <- lagsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 nb2listw(COL.nb, style="B"), type="mixed", tol.solve=1e-9)
summary(COL.mixed.B, correlation=TRUE)
COL.mixed.W <- lagsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 nb2listw(COL.nb, style="W"), type="mixed")
summary(COL.mixed.W, correlation=TRUE)
# Missing values in the response: the weights list is subsetted
NA.COL.OLD <- COL.OLD
NA.COL.OLD$CRIME[20:25] <- NA
COL.lag.NA <- lagsarlm(CRIME ~ INC + HOVAL, data=NA.COL.OLD,
 nb2listw(COL.nb), na.action=na.exclude,
 control=list(tol.opt=.Machine$double.eps^0.4))
COL.lag.NA$na.action
COL.lag.NA
resid(COL.lag.NA)
## Not run:
# data(boston)
# gp2mM <- lagsarlm(log(CMEDV) ~ CRIM + ZN + INDUS + CHAS + I(NOX^2) +
#  I(RM^2) + AGE + log(DIS) + log(RAD) + TAX + PTRATIO + B + log(LSTAT),
#  data=boston.c, nb2listw(boston.soi), type="mixed", method="Matrix")
# summary(gp2mM)
# W <- as(nb2listw(boston.soi), "CsparseMatrix")
# trMatb <- trW(W, type="mult")
# gp2mMi <- lagsarlm(log(CMEDV) ~ CRIM + ZN + INDUS + CHAS + I(NOX^2) +
#  I(RM^2) + AGE + log(DIS) + log(RAD) + TAX + PTRATIO + B + log(LSTAT),
#  data=boston.c, nb2listw(boston.soi), type="mixed", method="Matrix",
#  trs=trMatb)
# summary(gp2mMi)
# ## End(Not run)
summary(COL.lag.eig)
COL.lag.Bayes <- spBreg_lag(CRIME ~ INC + HOVAL, data=COL.OLD,
 nb2listw(COL.nb, style="W"))
summary(COL.lag.Bayes)
set.seed(1)
summary(impacts(COL.lag.Bayes, tr=trMatc), short=TRUE, zstats=TRUE)
## Not run:
# data(elect80)
# lw <- nb2listw(e80_queen, zero.policy=TRUE)
# el_ml <- lagsarlm(log(pc_turnout) ~ log(pc_college) + log(pc_homeownership)
#  + log(pc_income), data=elect80, listw=lw, zero.policy=TRUE, method="LU")
# summary(el_ml)
# set.seed(1)
# el_B <- spBreg_lag(log(pc_turnout) ~ log(pc_college) + log(pc_homeownership)
#  + log(pc_income), data=elect80, listw=lw, zero.policy=TRUE)
# summary(el_B)
# el_ml$timings
# attr(el_B, "timings")
# ## End(Not run)