Estimates spatially-clustered spatial regression (SCSR) models, such as the spatially-clustered linear regression model (SCLM), the spatially-clustered spatial autoregressive model (SCSAR), the spatially-clustered spatial error model (SCSEM), and the spatially-clustered linear regression model with spatially-lagged exogenous covariates (SCSLX). Estimation is performed via cluster-wise maximum likelihood as presented in <https://arxiv.org/abs/2407.15874>.
SCSR_Estim(
  Formula,
  Data_sf,
  listW,
  G = 2,
  Phi = 1,
  Type = c("SCLM", "SCSAR", "SCSEM", "SCSLX"),
  CenterVars = FALSE,
  ScaleVars = FALSE,
  Maxitr = 100,
  RelTol = 10^-6,
  AbsTol = 10^-5,
  Verbose = TRUE,
  Seed = 123456789
)
A list object containing the following outputs:
ClusterFitModels: G-dimensional list containing the estimated clustered regression models of class lm or Sarlm
Beta: (G x p) matrix of cluster-wise or pooled regression coefficients
Sig: G-dimensional vector of cluster-wise standard deviations
VCov: (p x p x G) array of cluster-wise variance-covariance matrices of coefficients
W_g: G-dimensional list whose g-th element is the (n_g x n_g) spatial weighting matrix of the g-th cluster, where n_g is the cluster cardinality
listW_g: G-dimensional list whose g-th element is the weights list of the g-th cluster
Group: (n x 1) vector of group assignment
sBeta: (n x p) matrix of location-wise regression coefficients
sSig: (n x 1) vector of location-wise standard deviations
MLE: Estimated maximum log-likelihood
Iter: Number of iterations needed to satisfy the convergence criterion and exit the iterative clustering loop
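The relation between the cluster-wise outputs (Beta, Sig) and the location-wise outputs (sBeta, sSig) can be pictured with a small base-R sketch. It assumes, which is not stated in this page, that each location simply inherits the parameters of the cluster it is assigned to. The toy objects below are made up for illustration and are not produced by SCSR_Estim(...).

```r
# Toy values (hypothetical): G = 2 clusters, p = 3 coefficients, n = 5 locations
Beta <- matrix(c(1.0, 0.5, -0.2,
                 2.0, 0.1,  0.3),
               nrow = 2, byrow = TRUE,
               dimnames = list(NULL, c("(Intercept)", "x1", "x2")))
Sig   <- c(0.8, 1.5)        # cluster-wise standard deviations
Group <- c(1, 2, 2, 1, 2)   # cluster label of each location

# Assumption: location i inherits the parameters of cluster Group[i]
sBeta <- Beta[Group, , drop = FALSE]   # (n x p) matrix
sSig  <- Sig[Group]                    # vector of length n

dim(sBeta)     # number of locations by number of coefficients
table(Group)   # cluster sizes n_g
```

Under this reading, mapping a column of sBeta over the geometry in Data_sf would show how a coefficient varies across clusters in space.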
Formula: A symbolic description of the regression model to be fitted. Details of model specification are given in the documentation of lm(...).
Data_sf: A data.frame object of class sf with n rows (each one corresponding to a location/polygon) and a user-defined number of columns.
The data frame must contain the response variable and all the covariates to be used in the model. It must also include the geometry feature for spatial modelling and representation.
Typically, sf data.frame objects are built using the st_as_sf(...) command from the sf package (see its documentation for details).
listW: A listw object containing the spatial weights for the spatial autoregressive component.
Typically, listW is built using the nb2listw(...) command from the spdep package (see its documentation for details).
We suggest adopting one of the matrix styles available in the spdep package, such as W (row-standardized) or B (binary).
We also suggest setting zero.policy = TRUE to allow the computation of groups/clusters with isolated units. In this regard, note that combining zero.policy = FALSE with Type = "SCSAR" causes SCSR_Estim(...) to terminate with an error.
See package spatialreg for details on the zero.policy input.
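To make the two suggested styles concrete, here is a minimal base-R sketch of what a binary (B) and a row-standardized (W) weighting matrix look like for a toy neighbourhood structure. It does not call spdep. The matrix and object names are hypothetical, and the handling of the isolated unit only mirrors the idea behind zero.policy = TRUE rather than reproducing the spdep implementation.

```r
# Hypothetical binary contiguity matrix for 4 units; unit 4 has no neighbours
B <- matrix(c(0, 1, 1, 0,
              1, 0, 1, 0,
              1, 1, 0, 0,
              0, 0, 0, 0),
            nrow = 4, byrow = TRUE)

n_neigh   <- rowSums(B)    # number of neighbours of each unit
has_neigh <- n_neigh > 0   # FALSE for isolated units

# Row-standardized weights: divide each row by its number of neighbours.
# Rows of isolated units are left as zeros instead of producing 0/0 = NaN.
W <- B
W[has_neigh, ] <- B[has_neigh, , drop = FALSE] / n_neigh[has_neigh]

rowSums(W)   # rows of non-isolated units sum to one
```

With real data, the equivalent objects would come from nb2listw(...) with style = "B" or style = "W".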
G: Integer value. Number of clusters to be considered. When G = 1, the pooled (non-clusterwise) regression is estimated. Default is G = 2.
Phi: Non-negative (>= 0) real value. Spatial penalty parameter. Default is Phi = 1.
Type: Character. Declares which model specification has to be estimated. Admitted strings are:
"SCLM" for the linear regression model without spatial effects (LM);
"SCSAR" for the spatial autoregressive (SAR) model;
"SCSEM" for the linear regression model with spatial autoregressive error term, i.e., the spatial error model (SEM);
"SCSLX" for the linear regression model with spatially-lagged covariates (SLX).
CenterVars: Logical value (TRUE or FALSE) stating whether the response variable and the covariates have to be centered around the mean in the iterative algorithm to update memberships and group-wise parameters.
Centering is only used in the iterative procedure, while the final estimates provided to the user are computed on the original (i.e., non-centered) variables.
ScaleVars: Logical value (TRUE or FALSE) stating whether the response variable and the covariates have to be scaled with respect to their standard deviation in the iterative algorithm to update memberships and group-wise parameters.
Scaling is only used in the iterative procedure, while the final estimates provided to the user are computed on the original (i.e., non-scaled) variables.
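As a rough illustration of what centering and scaling do to the variables, the base-R sketch below standardizes a simulated data set with scale(...). The data and object names are hypothetical, and the snippet only shows the transformation itself. It is not the internal code of SCSR_Estim(...).

```r
set.seed(1)

# Hypothetical data: a response y and two covariates on very different scales
dat <- data.frame(y  = rnorm(50, mean = 10,  sd = 3),
                  x1 = rnorm(50, mean = 100, sd = 20),
                  x2 = runif(50))

# Centering around the mean and scaling by the standard deviation
dat_std <- as.data.frame(scale(dat, center = TRUE, scale = TRUE))

colMeans(dat_std)        # numerically zero for every column
apply(dat_std, 2, sd)    # one for every column
```

Note that base R's scale(...) divides by the standard deviation only when the columns are also centered. With center = FALSE it divides by the root mean square instead.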
Maxitr: Integer value. Maximum number of iterations for the iterative algorithm. Default is Maxitr = 100. The loop stops earlier if the convergence criterion (see RelTol and AbsTol) is satisfied.
RelTol: Tolerance for the relative improvement in the log-likelihood (exit criterion) from iteration k to k+1. Default is \(\varepsilon_{Rel}\) = 10^-6.
AbsTol: Tolerance for the absolute improvement in the log-likelihood (exit criterion) from iteration k to k+1. Default is \(\varepsilon_{Abs}\) = 10^-5.
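The two tolerances can be read as a stopping rule on successive log-likelihood values. The sketch below is one plausible form of such a rule. The helper has_converged(...) is hypothetical and not part of the package, and it assumes the loop stops as soon as either the relative or the absolute improvement falls below its tolerance. The page does not say how the two criteria are combined.

```r
# Hypothetical helper (not part of SCDA): checks the exit criterion
# between the log-likelihood at iteration k (ll_old) and k+1 (ll_new)
has_converged <- function(ll_old, ll_new, rel_tol = 10^-6, abs_tol = 10^-5) {
  abs_impr <- abs(ll_new - ll_old)       # absolute improvement
  rel_impr <- abs_impr / abs(ll_old)     # relative improvement
  # Assumption: either criterion being met is enough to stop
  (rel_impr < rel_tol) || (abs_impr < abs_tol)
}

has_converged(-1520.40, -1498.75)             # large improvement: keep iterating
has_converged(-1498.750004, -1498.750001)     # negligible improvement: stop
```

Tightening RelTol and AbsTol under such a rule would typically require more iterations, up to the cap set by Maxitr.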
Verbose: Logical value (TRUE or FALSE). Toggles warnings and messages. If Verbose = TRUE (default), the function prints messages describing the progress of the tasks. If Verbose = FALSE, any message about the progress is suppressed.
Seed: Integer value. Sets the random number generator (RNG) state for random number generation in R. Default is Seed = 123456789.
The package SCDA computes the spatially-clustered spatial regression models based on the spatialreg package (see <https://cran.r-project.org/web/packages/spatialreg/index.html>).
The SCSAR model is estimated using the function lagsarlm; the SCSEM model is estimated using the function errorsarlm; the SCSLX model is estimated using the function lmSLX.
The SCLM model is estimated using the lm function from package stats.
Thus, estimated SCSAR and SCSEM models belong to class Sarlm, while estimated SCLM and SCSLX models belong to class lm (lmSLX returns an lm-based object).
We kindly refer to the package spatialreg for any detail regarding computational aspects (e.g., optimization).
Also, we refer to the package spdep for computational details on the spatial weighting matrix via listw2mat(...), nb2listw(...) and nb2mat(...).
For computational details on the spatially-clustered models, we kindly refer to Cerqueti, R., Maranzano, P. & Mattera, R. "Spatially-clustered spatial autoregressive models with application to agricultural market concentration in Europe". arXiv preprint (<doi:10.48550/arXiv.2407.15874>).
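# Load the example data (the objects Data2020 and listW are used below)
data(Data_RC_PM_RM_JABES2024, package = "SCDA")
# Spatially-clustered SAR model with G = 3 clusters
SCSAR <- SCSR_Estim(Formula = "Gini_SO ~ GDPPC_PPS2020 + Share_AgroEmp",
                    Data_sf = Data2020, G = 3, listW = listW, Type = "SCSAR", Phi = 0.50)
# Spatially-clustered linear model with G = 3 clusters
SCLM <- SCSR_Estim(Formula = "Gini_SO ~ GDPPC_PPS2020 + Share_AgroEmp",
                   Data_sf = Data2020, G = 3, listW = listW, Type = "SCLM", Phi = 0.50)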