predict.sarlm() calculates predictions, as far as is at present possible, for spatial simultaneous autoregressive linear model objects, using Haining's terminology for decomposition into trend, signal, and noise (see reference).
predict.sarlm(object, newdata = NULL, listw = NULL, zero.policy = FALSE, ...)
print.sarlm.pred(x, ...)
object: sarlm object returned by lagsarlm or errorsarlm
listw: listw object created for example by nb2listw
zero.policy: ... lagsarlm() to terminate with an error
predict.sarlm()
returns a vector of predictions with two attribute vectors of trend and signal values, with class sarlm.pred. print.sarlm.pred is a print function for this class, printing and returning a data frame with columns: "fit", "trend" and "signal".
The function approaches prediction first by dividing invocations between those with and those without newdata. When no newdata is present, the response variable may be reconstructed as the sum of the trend, the signal, and the noise (residuals). Since the values of the response variable are known, their spatial lags are used to calculate signal components (Cressie 1993, p. 564). For the error model, trend = $X \beta$, and signal = $\lambda W y - \lambda W X \beta$. For the lag and mixed models, trend = $X \beta$, and signal = $\rho W y$.
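As a rough illustration of the in-sample decomposition, the following base R sketch rebuilds a response from trend, signal, and noise for a lag model and for an error model. The weights matrix W, the data, and the coefficient values (beta, rho, lambda) are made-up assumptions for a four-unit toy example, not estimates produced by lagsarlm or errorsarlm.

```r
# Toy in-sample decomposition; all numbers are hypothetical, not fitted values.
# Row-standardised weights for four units on a line (neighbours are adjacent units).
W <- matrix(c(0,   1,   0,   0,
              0.5, 0,   0.5, 0,
              0,   0.5, 0,   0.5,
              0,   0,   1,   0), nrow = 4, byrow = TRUE)
X <- cbind(1, c(2, 4, 6, 8))   # intercept plus one covariate
beta <- c(1, 0.5)              # assumed coefficients
y <- c(3.1, 4.2, 4.9, 6.3)     # assumed observed response

# Lag model: trend = X beta, signal = rho W y
rho <- 0.3
trend <- drop(X %*% beta)
signal_lag <- drop(rho * (W %*% y))
noise_lag <- y - trend - signal_lag

# Error model: trend = X beta, signal = lambda W y - lambda W X beta
lambda <- 0.4
signal_err <- drop(lambda * (W %*% y) - lambda * (W %*% trend))
noise_err <- y - trend - signal_err

# In both cases the three components add back up to the observed response
all.equal(trend + signal_lag + noise_lag, y)
all.equal(trend + signal_err + noise_err, y)
```

The noise terms here are simply whatever is left after subtracting trend and signal, which mirrors the role of the residuals when no newdata is supplied.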
This approach differs from the design choices made in other software, for example GeoDa, which does not use observations of the response variable; GeoDa's approach corresponds to the newdata situation described below.
When, however, newdata is used for prediction, no observations of the response variable being predicted are available. Consequently, while the trend components are the same, the signal cannot take full account of the spatial smooth. In the error model, the signal is set to zero, since the spatial smooth is expressed in terms of the error: $(I - \lambda W)^{-1} \varepsilon$.
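In the lag model, the signal can be expressed in the following way:
$$y = \rho W y + X \beta + \varepsilon$$
$$(I - \rho W) y = X \beta + \varepsilon$$
$$y = (I - \rho W)^{-1} X \beta + (I - \rho W)^{-1} \varepsilon$$
giving a feasible signal component of:
$$\rho W y = \rho W (I - \rho W)^{-1} X \beta$$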
setting the error term to zero. This also means that predictions of the signal component for lag and mixed models require the inversion of an n-by-n matrix.
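A minimal base R sketch of that computation follows, with the error term set to zero so that the feasible signal is rho W (I - rho W)^{-1} X beta. The weights matrix, covariates, and coefficient values are hypothetical, and solve() is applied to a linear system rather than forming the inverse explicitly, which is a choice made for this sketch and not necessarily what predict.sarlm() does internally.

```r
# Hypothetical newdata prediction for a lag model; values are illustrative only.
W <- matrix(c(0,   1,   0,   0,
              0.5, 0,   0.5, 0,
              0,   0.5, 0,   0.5,
              0,   0,   1,   0), nrow = 4, byrow = TRUE)
X_new <- cbind(1, c(2, 4, 6, 8))   # assumed new covariate values
beta <- c(1, 0.5)                  # assumed coefficients
rho <- 0.3                         # assumed spatial lag parameter
n <- nrow(W)

trend <- drop(X_new %*% beta)
# Expected response with the error set to zero: solve (I - rho W) y_hat = X beta
y_hat <- drop(solve(diag(n) - rho * W, trend))
# Feasible signal: rho W (I - rho W)^{-1} X beta
signal <- drop(rho * (W %*% y_hat))
fit <- trend + signal

# The fit equals y_hat, since y_hat = X beta + rho W y_hat
all.equal(fit, y_hat)
```

For an error model under the same conditions the signal would be zero, so the fit would reduce to the trend alone.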
Because the outcomes of the spatial smooth on the error term are unobservable, the signal values for newdata are incomplete. In the mixed model, the spatially lagged RHS variables influence both the trend and the signal, so that the root mean square prediction error in the examples below for this case with newdata is smallest, although the model was not the best fit.
errorsarlm, lagsarlm
library(spdep)
data(oldcol)
COL.lag.eig <- lagsarlm(CRIME ~ INC + HOVAL, data=COL.OLD, nb2listw(COL.nb))
COL.mix.eig <- lagsarlm(CRIME ~ INC + HOVAL, data=COL.OLD, nb2listw(COL.nb),
type="mixed")
COL.err.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD, nb2listw(COL.nb))
print(p1 <- predict(COL.mix.eig))
print(p2 <- predict(COL.mix.eig, newdata=COL.OLD, listw=nb2listw(COL.nb)))
AIC(COL.mix.eig)
sqrt(deviance(COL.mix.eig)/length(COL.nb))
sqrt(sum((COL.OLD$CRIME - as.vector(p1))^2)/length(COL.nb))
sqrt(sum((COL.OLD$CRIME - as.vector(p2))^2)/length(COL.nb))
AIC(COL.err.eig)
sqrt(deviance(COL.err.eig)/length(COL.nb))
sqrt(sum((COL.OLD$CRIME - as.vector(predict(COL.err.eig)))^2)/length(COL.nb))
sqrt(sum((COL.OLD$CRIME - as.vector(predict(COL.err.eig, newdata=COL.OLD,
listw=nb2listw(COL.nb))))^2)/length(COL.nb))
AIC(COL.lag.eig)
sqrt(deviance(COL.lag.eig)/length(COL.nb))
sqrt(sum((COL.OLD$CRIME - as.vector(predict(COL.lag.eig)))^2)/length(COL.nb))
sqrt(sum((COL.OLD$CRIME - as.vector(predict(COL.lag.eig, newdata=COL.OLD,
listw=nb2listw(COL.nb))))^2)/length(COL.nb))
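The root mean square prediction error expressions above repeat the same pattern. As a convenience, they could be wrapped in a small helper; rmse() below is a hypothetical function introduced here for illustration and is not part of spdep.

```r
# Hypothetical helper, not part of spdep: root mean square prediction error.
# as.vector() drops the trend/signal attributes carried by sarlm.pred objects.
rmse <- function(observed, predicted) {
  sqrt(mean((observed - as.vector(predicted))^2))
}

# Stand-alone check with made-up numbers
rmse(c(1, 2, 3), c(1, 2, 5))
```

With the fitted objects from the examples, a call such as rmse(COL.OLD$CRIME, p2) should give the same value as the corresponding sqrt(sum(...)/length(COL.nb)) line, assuming length(COL.nb) equals the number of observations in COL.OLD.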