lrtestspsur: Likelihood Ratio tests for the specification of spatial SUR models.

Description

The function computes a set of Likelihood Ratio (LR) tests that help the user select the spatial structure of the SUR model. To do so, lrtestspsur estimates the SUR models "sim", "slm", "sem", "sdm" and "sarar" using the function spsurml.

The five models listed above are related by a nesting sequence, so they can be compared using the appropriate LR tests. The function shows the log-likelihood corresponding to the maximum-likelihood estimates and the sequence of LR tests.

Usage

lrtestspsur(
  Form = NULL,
  data = NULL,
  W = NULL,
  X = NULL,
  Y = NULL,
  time = NULL,
  G = NULL,
  N = NULL,
  Tm = NULL
)

Arguments

Form

An object created with the package Formula that describes the model to be estimated. This model may contain several responses (explained variables) and a varying number of regressors in each equation.

data

An object of class data.frame or a matrix.

W

A spatial weighting matrix of order (NxN), assumed to be the same for all equations and time periods.

X

A data matrix of order (NTmGxp) with the observations of the regressors. The number of covariates in the SUR model is $p = \sum_{g=1}^{G} p_{g}$, where $p_{g}$ is the number of regressors (including the intercept) in the g-th equation, g = 1,...,G. X needs to be specified only when a Formula and a data frame are not available. Default = NULL.

Y

A column vector of order (NTmGx1) with the observations of the explained variables. The data must be ordered (first) by equation, (second) by time period and (third) by cross-sectional/spatial unit. Y needs to be specified only when a Formula and a data frame are not available. Default = NULL.
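The stacking order required for Y can be sketched in base R. This is only an illustration under assumptions: the dimensions, the random values and the object names (idx, unit, eq) are hypothetical, and the ordering is read as "equation varies slowest, spatial unit varies fastest".

```r
# Hypothetical dimensions: N spatial units, Tm time periods, G equations
N <- 3; Tm <- 2; G <- 2

# expand.grid() varies its first factor fastest, so listing the spatial unit
# first and the equation last yields the order: equation, time, spatial unit
idx <- expand.grid(unit = seq_len(N), time = seq_len(Tm), eq = seq_len(G))

# Hypothetical response values, one per (unit, time, equation) combination
set.seed(1)
idx$y <- rnorm(N * Tm * G)

# Column vector of order (N*Tm*G x 1) in the assumed stacking order
Y <- matrix(idx$y, ncol = 1)
```

Printing idx shows all rows of equation 1 first and, within each equation, the time periods in blocks of N spatial units.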

time

Time variable.

G

Number of equations.

N

Number of cross-section or spatial units.

Tm

Number of time periods.

Value

lrtestspsur first prints the estimated log-likelihood of each of the main spatial specifications. It then shows the LR statistics for each pair of nested and nesting models compared, together with their associated p-values.

Details

A fundamental result in maximum-likelihood estimation shows that if model A is nested in model B by a set of n restrictions on the parameters of model B, then, under the null hypothesis and as the sample size increases, the test statistic $-2\log[l(H_{0}) / l(H_{A})]$ is distributed as a $\chi^{2}(n)$, where $l(H_{0})$ is the estimated likelihood under the null hypothesis (model A) and $l(H_{A})$ is the estimated likelihood under the alternative hypothesis (model B).
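The LR statistic and its p-value can be computed by hand from two maximised log-likelihoods, as in the base R sketch below. The log-likelihood values, the number of restrictions and the helper lr_test are hypothetical; they are not output from spsurml or lrtestspsur.

```r
# Hypothetical maximised log-likelihoods (not taken from any real fit)
ll_null <- -1250.4   # restricted model A, e.g. "sim"
ll_alt  <- -1243.1   # unrestricted model B, e.g. "slm"
n_restr <- 2         # number of restrictions imposed by the null

# Hypothetical helper: LR statistic and upper-tail chi-squared p-value
lr_test <- function(ll_null, ll_alt, df) {
  # -2 log[l(H0) / l(HA)] written in terms of log-likelihoods
  stat <- -2 * (ll_null - ll_alt)
  p_value <- pchisq(stat, df = df, lower.tail = FALSE)
  c(statistic = stat, df = df, p.value = p_value)
}

res <- lr_test(ll_null, ll_alt, n_restr)
print(res)
```

Working with log-likelihoods rather than likelihoods avoids numerical underflow, which is why the ratio is written as a difference.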

The list of (spatial) models that can be estimated with the function spsurml includes the following (in addition to the "slx" and "sdem"):

"sim": SUR model with no spatial effects $$ y_{tg} = X_{tg} \beta_{g} + \epsilon_{tg} $$
"slm": SUR model with spatial lags of the explained variables $$y_{tg} = \lambda_{g} Wy_{tg} + X_{tg} \beta_{g} + \epsilon_{tg} $$
"sem": SUR model with spatial errors $$ y_{tg} = X_{tg} \beta_{g} + u_{tg} $$ $$ u_{tg} = \rho_{g} Wu_{tg} + \epsilon_{tg} $$
"sdm": SUR model of the Spatial Durbin type $$ y_{tg} = \lambda_{g} Wy_{tg} + X_{tg} \beta_{g} + WX_{tg} \theta_{g} + \epsilon_{tg} $$
"sarar": SUR model with spatial lags of the explained variables and spatial errors $$ y_{tg} = \lambda_{g} Wy_{tg} + X_{tg} \beta_{g} + u_{tg} $$ $$ u_{tg} = \rho_{g} W u_{tg} + \epsilon_{tg} $$
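To make the "slm" equation concrete, the following base R sketch simulates a single equation and time period from its reduced form, $y = (I - \lambda W)^{-1}(X\beta + \epsilon)$, and checks that the structural form holds. The line-neighbour weights matrix, its row-standardisation and all parameter values are assumptions made for this example; none of them come from spsur.

```r
set.seed(123)
N <- 5  # hypothetical number of spatial units

# Hypothetical weights: units on a line, each linked to its neighbours,
# then row-standardised so that every row sums to one
W <- matrix(0, N, N)
for (i in seq_len(N - 1)) {
  W[i, i + 1] <- 1
  W[i + 1, i] <- 1
}
W <- W / rowSums(W)

lambda <- 0.4               # hypothetical spatial lag parameter
beta <- c(1, 2)             # hypothetical intercept and slope
X <- cbind(1, rnorm(N))     # intercept plus one regressor
eps <- rnorm(N)             # innovations

# Reduced form: y = (I - lambda W)^{-1} (X beta + eps)
y <- solve(diag(N) - lambda * W, X %*% beta + eps)

# Structural form check: y - lambda W y - X beta should recover eps
check_eps <- as.vector(y - lambda * W %*% y - X %*% beta)
```

Setting lambda to zero in this sketch reduces the data-generating process to the "sim" equation, which is the nesting relation the LR tests exploit.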

This collection of models can be compared, on objective bases, using the LR principle and the following nesting relations:

"sim" vs "sem", where the null hypotheses, in the "sem" equation, are:

$$ H_{0}: \rho_{g}=0 \ \forall g \quad \text{vs} \quad H_{A}: \rho_{g} \neq 0 \ \exists g $$
"sim" vs "slm", where the null hypotheses, in the "slm" equation, are:

$$ H_{0}: \lambda_{g}=0 \ \forall g \quad \text{vs} \quad H_{A}: \lambda_{g} \neq 0 \ \exists g $$
"sim" vs "sarar", where the null hypotheses, in the "sarar" equation, are:

$$ H_{0}: \rho_{g}=\lambda_{g}=0 \ \forall g \quad \text{vs} \quad H_{A}: \rho_{g} \neq 0 \ \text{or} \ \lambda_{g} \neq 0 \ \exists g $$
"sem" vs "sarar", where the null hypotheses, in the "sarar" equation, are:

$$ H_{0}: \lambda_{g}=0 \ \forall g \quad \text{vs} \quad H_{A}: \lambda_{g} \neq 0 \ \exists g $$
"slm" vs "sarar", where the null hypotheses, in the "sarar" equation, are:

$$ H_{0}: \rho_{g}=0 \ \forall g \quad \text{vs} \quad H_{A}: \rho_{g} \neq 0 \ \exists g $$
"sem" vs "sdm", also known as LR-COMFAC, where the null hypotheses, in the "sdm" equation, are:

$$ H_{0}: -\lambda_{g}\beta_{g}=\theta_{g} \ \forall g \quad \text{vs} \quad H_{A}: -\lambda_{g}\beta_{g} \neq \theta_{g} \ \exists g $$

The number of degrees of freedom of the corresponding $\chi^{2}$ distribution is G in the cases of "sim" vs "sem", "sim" vs "slm", "sem" vs "sarar", "slm" vs "sarar" and "sem" vs "sdm", and 2G in the case of "sim" vs "sarar". The function lrtestspsur also returns the p-value associated with each LR statistic.
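The degrees of freedom stated above can be tabulated for a given G, together with the 5% critical value of each test. This is a base R sketch; the value of G and the object name comparisons are hypothetical, and the table is not an object returned by lrtestspsur.

```r
G <- 3  # hypothetical number of equations

# One row per nested pair, with the degrees of freedom stated in the text
comparisons <- data.frame(
  null        = c("sim", "sim", "sim",   "sem",   "slm",   "sem"),
  alternative = c("sem", "slm", "sarar", "sarar", "sarar", "sdm"),
  df          = c(G,     G,     2 * G,   G,       G,       G)
)

# 5% critical value of the chi-squared distribution for each comparison
comparisons$crit_5pct <- qchisq(0.95, df = comparisons$df)
print(comparisons)
```

An LR statistic above the critical value in its row leads to rejecting the null (nested) model at the 5% level.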

References

Mur, J., López, F., and Herrera, M. (2010). Testing for spatial effects in seemingly unrelated regressions. Spatial Economic Analysis, 5(4), 399-440.
López, F.A., Mur, J., and Angulo, A. (2014). Spatial model selection strategies in a SUR framework. The case of regional productivity in EU. Annals of Regional Science, 53(1), 197-220.

Examples

# NOT RUN {
#################################################
######## CROSS SECTION DATA (nG=1; nT>1) ########
#################################################

#### Example 1: Spatial Phillips-Curve. Anselin (1988, p. 203)
rm(list = ls()) # Clean memory
data("spc")
Tformula <- WAGE83 | WAGE81 ~ UN83 + NMR83 + SMSA | UN80 + NMR80 + SMSA
## It usually requires 1-2 minutes maximum
## LRs <- lrtestspsur(Form = Tformula, data = spc, W = Wspc)

#################################################
######## CROSS SECTION DATA (nG>1; nT=1) ########
#################################################

#### Example 2: Homicides & Socio-Economics (1960-90)
# Homicides and selected socio-economic characteristics for
# continental U.S. counties.
# Data for four decennial census years: 1960, 1970, 1980 and 1990.
# https://geodacenter.github.io/data-and-lab/ncovr/
# }
# NOT RUN {
## It could require some minutes
rm(list = ls()) # Clean memory
data("NCOVR")
Tformula <- HR70 | HR80  | HR90 ~ PS70 + UE70 | PS80 + UE80 + RD80 |
            PS90 + UE90 + RD90 + PO90
LRs <- lrtestspsur(Form = Tformula, data = NCOVR, W = W)
# }
# NOT RUN {
################################################################
######## PANEL DATA: TEMPORAL CORRELATIONS (nG=1; nT>1) ########
################################################################

#### Example 3: Classic panel data
# }
# NOT RUN {
## It could require some minutes
rm(list = ls()) # Clean memory
data(NCOVR)
N <- nrow(NCOVR)
Tm <- 4
index_time <- rep(1:Tm, each = N)
index_indiv <- rep(1:N, Tm)
pHR <- c(NCOVR$HR60, NCOVR$HR70, NCOVR$HR80, NCOVR$HR90)
pPS <- c(NCOVR$PS60, NCOVR$PS70, NCOVR$PS80, NCOVR$PS90)
pUE <- c(NCOVR$UE60, NCOVR$UE70, NCOVR$UE80, NCOVR$UE90)
pNCOVR <- data.frame(indiv = index_indiv, time = index_time, HR = pHR, PS = pPS, UE = pUE)
rm(NCOVR,pHR,pPS,pUE,index_time,index_indiv)
form_pHR <- HR ~ PS + UE
LRs <- lrtestspsur(Form = form_pHR, data = pNCOVR, W = W, time = pNCOVR$time)
# }