Estimate a panel data model subject to a latent group structure using the pairwise adaptive group fused Lasso (PAGFL) by Mehrabani (2023). The PAGFL jointly identifies the group structure and group-specific slope parameters. The function supports both static and dynamic panels, with or without endogenous regressors.
pagfl(
  formula,
  data,
  index = NULL,
  n_periods = NULL,
  lambda,
  method = "PLS",
  Z = NULL,
  min_group_frac = 0.05,
  bias_correc = FALSE,
  kappa = 2,
  max_iter = 10000,
  tol_convergence = 1e-08,
  tol_group = 0.001,
  rho = 0.07 * log(N * n_periods)/sqrt(N * n_periods),
  varrho = max(sqrt(5 * N * n_periods * p)/log(N * n_periods * p) - 7, 1),
  verbose = TRUE,
  parallel = TRUE,
  ...
)
# S3 method for pagfl
print(x, ...)
# S3 method for pagfl
formula(x, ...)
# S3 method for pagfl
df.residual(object, ...)
# S3 method for pagfl
summary(object, ...)
# S3 method for pagfl
coef(object, ...)
# S3 method for pagfl
residuals(object, ...)
# S3 method for pagfl
fitted(object, ...)
An object of class pagfl holding
model: a data.frame containing the dependent and explanatory variables as well as cross-sectional and time indices,
coefficients: a \(\hat{K} \times p\) matrix of the post-Lasso group-specific parameter estimates,
groups: a list containing (i) the total number of groups \(\hat{K}\) and (ii) a vector of estimated group memberships \((\hat{g}_1, \dots, \hat{g}_N)\), where \(\hat{g}_i = k\) if \(i\) is assigned to group \(k\),
residuals: a vector of residuals of the demeaned model,
fitted: a vector of fitted values of the demeaned model,
args: a list of additional arguments,
IC: a list containing (i) the value of the IC, (ii) the employed tuning parameter \(\lambda\), and (iii) the MSE,
convergence: a list containing (i) a logical variable indicating if convergence was achieved and (ii) the number of executed ADMM algorithm iterations,
call: the function call.
A pagfl object has print, summary, fitted, residuals, formula, df.residual, and coef S3 methods.
formula: a formula object describing the model to be estimated.
data: a data.frame or matrix holding a panel data set. If no index variables are provided, the panel must be balanced and ordered in the long format \(\bold{Y}=(Y_1^\prime, \dots, Y_N^\prime)^\prime\), \(Y_i = (Y_{i1}, \dots, Y_{iT})^\prime\) with \(Y_{it} = (y_{it}, \bold{x}_{it}^\prime)^\prime\). Conversely, if data is not ordered or not balanced, data must include two index variables that declare the cross-sectional unit \(i\) and the time period \(t\) of each observation.
index: a character vector holding two strings. The first string denotes the name of the index variable identifying the cross-sectional unit \(i\) and the second string represents the name of the variable declaring the time period \(t\). The data is automatically sorted according to the variables in index, which may produce errors when the time index is a character variable. In case of a balanced panel data set that is ordered in the long format, index can be left empty if the number of time periods n_periods is supplied.
n_periods: the number of observed time periods \(T\). If an index character vector is passed, this argument can be left empty. Default is NULL.
lambda: the tuning parameter determining the strength of the penalty term. Either a single \(\lambda\) or a vector of candidate values can be passed. If a vector is supplied, a BIC-type IC automatically selects the best fitting \(\lambda\) value.
method: the estimation method. Options are
"PLS": for using the penalized least squares (PLS) algorithm. We recommend PLS in case of (weakly) exogenous regressors (Mehrabani, 2023, sec. 2.2).
"PGMM": for using the penalized Generalized Method of Moments (PGMM). PGMM is required when instrumenting endogenous regressors, in which case a matrix \(\bold{Z}\) containing the necessary exogenous instruments must be supplied (Mehrabani, 2023, sec. 2.3).
Default is "PLS".
Z: a \(NT \times q\) matrix or data.frame of exogenous instruments, where \(q \geq p\), \(\bold{Z}=(z_1^\prime, \dots, z_N^\prime)^\prime\), \(z_i = (z_{i1}, \dots, z_{iT})^\prime\) and \(z_{it}\) is a \(q \times 1\) vector. Z is only required when method = "PGMM" is selected. With "PLS", the argument can be left empty; any supplied value is disregarded. Default is NULL.
min_group_frac: the minimum group cardinality as a fraction of the total number of individuals \(N\). In case a group falls short of this threshold, each of its members is allocated to one of the remaining groups according to the MSE. Default is 0.05.
bias_correc: logical. If TRUE, a Split-panel Jackknife bias correction following Dhaene and Jochmans (2015) is applied to the slope parameters. We recommend using the correction when working with dynamic panels. Default is FALSE.
kappa: a non-negative weight used to obtain the adaptive penalty weights. Default is 2.
max_iter: the maximum number of iterations for the ADMM estimation algorithm. Default is \(1 \times 10^4\).
tol_convergence: the tolerance limit for the stopping criterion of the iterative ADMM estimation algorithm. Default is \(1 \times 10^{-8}\).
tol_group: the tolerance limit for within-group differences. Two individuals \(i\), \(j\) are assigned to the same group if the Frobenius norm of their coefficient vector difference is below this threshold. Default is \(1 \times 10^{-3}\).
rho: the tuning parameter balancing the fitness and penalty terms in the IC that determines the penalty parameter \(\lambda\). If left unspecified, the heuristic \(\rho = 0.07 \frac{\log(NT)}{\sqrt{NT}}\) of Mehrabani (2023, sec. 6) is used. We recommend the default.
varrho: the non-negative Lagrangian ADMM penalty parameter. For PLS, the value of \(\varrho\) is of little consequence. However, for PGMM, small values lead to slow convergence. If left unspecified, the default heuristic \(\varrho = \max(\frac{\sqrt{5NTp}}{\log(NTp)}-7, 1)\) is used.
verbose: logical. If TRUE, helpful warning messages are shown. Default is TRUE.
parallel: logical. If TRUE, certain operations are parallelized across multiple cores. Default is TRUE.
...: ellipsis
x: an object of class pagfl.
object: an object of class pagfl.
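The default heuristics for rho and varrho can be evaluated by hand to get a feel for their magnitude. The following is a minimal base R sketch that simply re-computes the two default expressions from the function signature; the panel dimensions N, n_periods, and p are illustrative assumptions and not taken from any data set.

```r
# Hypothetical panel dimensions (assumptions for illustration only)
N <- 20          # number of cross-sectional units
n_periods <- 80  # number of time periods T
p <- 2           # number of regressors

# IC tuning parameter: rho = 0.07 * log(NT) / sqrt(NT)
rho <- 0.07 * log(N * n_periods) / sqrt(N * n_periods)

# ADMM penalty parameter: varrho = max(sqrt(5NTp) / log(NTp) - 7, 1)
varrho <- max(sqrt(5 * N * n_periods * p) / log(N * n_periods * p) - 7, 1)

print(c(rho = rho, varrho = varrho))
```

For small panels the first term inside max() falls below 1, so the lower bound of 1 becomes the effective value of varrho.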
Paul Haimerl
Consider the panel data model $$y_{it} = \gamma_i^0 + \bold{\beta}^{0 \prime}_{i} \bold{x}_{it} + \epsilon_{it}, \quad i = 1, \dots, N, \; t = 1, \dots, T,$$ where \(y_{it}\) is the scalar dependent variable, \(\gamma_i^0\) is an individual fixed effect, \(\bold{x}_{it}\) is a \(p \times 1\) vector of weakly exogenous explanatory variables, and \(\epsilon_{it}\) is a zero mean error. The coefficient vector \(\bold{\beta}_i^0\) follows the latent group pattern $$\bold{\beta}_i^0 = \sum_{k = 1}^K \bold{\alpha}_k^0 \bold{1} \{i \in G_k^0 \},$$ with \(\cup_{k = 1}^K G_k^0 = \{1, \dots, N\}\), \(G_k^0 \cap G_j^0 = \emptyset\) and \(\| \bold{\alpha}_k^0 - \bold{\alpha}_j^0 \| \neq 0\) for any \(k \neq j\), \(k,j = 1, \dots, K\).
The PLS method jointly estimates the latent group structure and group-specific coefficients by minimizing the criterion $$Q_{NT} (\bold{\beta}, \lambda) = \frac{1}{T} \sum^N_{i=1} \sum^{T}_{t=1}(\tilde{y}_{it} - \bold{\beta}^\prime_i \tilde{\bold{x}}_{it})^2 + \frac{\lambda}{N} \sum_{i = 1}^{N - 1} \sum_{j=i+1}^N \dot{\omega}_{ij} \| \bold{\beta}_i - \bold{\beta}_j \|$$ with respect to \(\bold{\beta} = (\bold{\beta}_1^\prime, \dots, \bold{\beta}_N^\prime)^\prime\). Here, \(\tilde{a}_{it} = a_{it} - T^{-1} \sum_{t = 1}^T a_{it}\) for \(a \in \{y, \bold{x}\}\) denotes the within-transformation, which concentrates out the individual fixed effects \(\gamma_i^0\). \(\lambda\) is the penalty tuning parameter and \(\dot{\omega}_{ij}\) denotes the adaptive penalty weights (see Mehrabani, 2023, eq. 2.6), which are obtained from a preliminary individual least squares estimation. \(\| \cdot \|\) denotes the Frobenius norm. The criterion function is minimized via an iterative alternating direction method of multipliers (ADMM) algorithm (see Mehrabani, 2023, sec. 5.1).
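The two preparatory steps of the PLS criterion, the within-transformation and the adaptive weights, can be sketched in base R. This is an illustration and not the package's internal code: the simulated data, the single-regressor setup, and the weight form \(\dot{\omega}_{ij} = \| \dot{\bold{\beta}}_i - \dot{\bold{\beta}}_j \|^{-\kappa}\) are assumptions made for the example.

```r
set.seed(1)
N <- 4; n_periods <- 30; kappa <- 2
id <- rep(seq_len(N), each = n_periods)

# Hypothetical DGP: two groups with slopes 1 (units 1-2) and -1 (units 3-4)
beta_true <- c(1, 1, -1, -1)
x <- rnorm(N * n_periods)
gamma <- rnorm(N)  # individual fixed effects
y <- gamma[id] + beta_true[id] * x + rnorm(N * n_periods, sd = 0.1)

# Within-transformation: subtract the unit-specific time average
x_tilde <- x - ave(x, id)
y_tilde <- y - ave(y, id)

# Preliminary individual least squares slopes (one regressor, no intercept)
beta_prelim <- sapply(seq_len(N), function(i) {
  sel <- id == i
  sum(x_tilde[sel] * y_tilde[sel]) / sum(x_tilde[sel]^2)
})

# Assumed adaptive weights: large when two preliminary slopes are close
omega <- abs(outer(beta_prelim, beta_prelim, "-"))^(-kappa)
diag(omega) <- NA  # a unit is never paired with itself

print(round(omega, 2))
```

Pairs from the same group receive a large weight, so their coefficient difference is penalized heavily and shrunk towards zero, while pairs from different groups are barely penalized.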
PGMM employs a set of instruments Z to control for endogenous regressors. Using PGMM, \(\bold{\beta}\) is estimated by minimizing
$$
Q_{NT}(\bold{\beta}, \lambda) = \sum^N_{i = 1} \left[ \frac{1}{T} \sum_{t=1}^T \bold{z}_{it} (\Delta y_{it} - \bold{\beta}^\prime_i \Delta \bold{x}_{it}) \right]^\prime \bold{W}_i \left[\frac{1}{T} \sum_{t=1}^T \bold{z}_{it}(\Delta y_{it} - \bold{\beta}^\prime_i \Delta \bold{x}_{it}) \right]
$$
$$
\quad + \frac{\lambda}{N} \sum_{i = 1}^{N - 1} \sum_{j=i+1}^N \ddot{\omega}_{ij} \| \bold{\beta}_i - \bold{\beta}_j \|.
$$
The adaptive weights \(\ddot{\omega}_{ij}\) are obtained by an initial GMM estimation. \(\Delta\) denotes the first-difference operator, \(\Delta y_{it} = y_{it} - y_{i t-1}\). \(\bold{W}_i\) represents a data-driven \(q \times q\) weight matrix. See Mehrabani (2023, eq. 2.10) for more details.
Again, the criterion function is minimized using an efficient ADMM algorithm (Mehrabani, 2023, sec. 5.2).
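The first-difference operator used by PGMM acts within each cross-sectional unit, so one observation per unit is lost. A minimal base R sketch with made-up numbers (not the package's internal routine):

```r
# Hypothetical long-format panel: 2 units, 4 periods each
id <- rep(1:2, each = 4)
y <- c(1, 3, 6, 10, 2, 2, 5, 4)

# First differences within each unit; differences never cross unit borders
dy <- unname(unlist(tapply(y, id, diff)))

print(dy)
```

The result has N * (T - 1) elements, which is why differencing across the stacked vector as a whole would be incorrect.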
Two individuals are assigned to the same group if \(\| \hat{\bold{\beta}}_i - \hat{\bold{\beta}}_j \| \leq \epsilon_{\text{tol}}\) (and hence \(\hat{\bold{\alpha}}_k = \hat{\bold{\beta}}_i = \hat{\bold{\beta}}_j\) for some \(k = 1, \dots, \hat{K}\)), where \(\epsilon_{\text{tol}}\) is determined by tol_group. Subsequently, the estimated number of groups \(\hat{K}\) and the group structure follow by examining the number of distinct elements in \(\hat{\bold{\beta}}\). Given an estimated group structure, it is straightforward to obtain post-Lasso estimates using group-wise least squares or GMM (see grouped_plm).
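The grouping rule can be illustrated with a small base R sketch. The coefficient matrix below is invented, and the simple sequential assignment is an assumption for illustration; the package may resolve group memberships differently.

```r
# Hypothetical penalized estimates: one row of slopes per individual
beta_hat <- rbind(
  c(1.0000, -0.5000),
  c(1.0004, -0.5003),
  c(-2.0000, 0.7000),
  c(0.9998, -0.4999),
  c(-2.0002, 0.7001)
)
tol_group <- 0.001

N <- nrow(beta_hat)
groups <- integer(N)  # 0 marks "not yet assigned"
k <- 0L
for (i in seq_len(N)) {
  if (groups[i] != 0L) next
  k <- k + 1L
  # Euclidean norm of the difference between individual i and all others
  dist_i <- sqrt(rowSums(sweep(beta_hat, 2, beta_hat[i, ])^2))
  # Unassigned individuals within the tolerance join group k
  groups[groups == 0L & dist_i <= tol_group] <- k
}

print(groups)  # estimated memberships
print(k)       # estimated number of groups
```

Here individuals 1, 2, and 4 form one group and individuals 3 and 5 another, since their pairwise distances lie below tol_group.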
We recommend identifying a suitable \(\lambda\) parameter by passing a logarithmically spaced grid of candidate values with a lower limit close to 0 and an upper limit that leads to a fully homogeneous panel. A BIC-type information criterion then selects the best fitting \(\lambda\) value.
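A logarithmically spaced grid of candidate values can be built in base R as sketched below. The limits 0.001 and 10 and the grid length are arbitrary assumptions; the upper limit that yields a fully homogeneous panel depends on the data at hand.

```r
# Hypothetical grid limits (assumptions; adjust to the data at hand)
lambda_min <- 1e-3
lambda_max <- 10
n_lambda <- 25

# Equally spaced on the log scale, i.e. a constant ratio between neighbours
lambda_grid <- exp(seq(log(lambda_min), log(lambda_max), length.out = n_lambda))

print(range(lambda_grid))

# The grid is then passed as the lambda argument, e.g. (requires the package):
# estim <- pagfl(y ~ ., data = df, n_periods = 80, lambda = lambda_grid)
```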
Dhaene, G., & Jochmans, K. (2015). Split-panel jackknife estimation of fixed-effect models. The Review of Economic Studies, 82(3), 991-1030. doi:10.1093/restud/rdv007.
Mehrabani, A. (2023). Estimation and identification of latent group structures in panel data. Journal of Econometrics, 235(2), 1464-1482. doi:10.1016/j.jeconom.2022.12.002.
# Simulate a panel with a group structure
set.seed(1)
sim <- sim_DGP(N = 20, n_periods = 80, p = 2, n_groups = 3)
y <- sim$y
X <- sim$X
df <- cbind(y = c(y), X)
# Run the PAGFL procedure
estim <- pagfl(y ~ ., data = df, n_periods = 80, lambda = 0.5, method = "PLS")
summary(estim)
# Pass a panel data set with explicit cross-sectional and time indicators
i_index <- rep(1:20, each = 80)
t_index <- rep(1:80, 20)
df <- data.frame(y = c(y), X, i_index = i_index, t_index = t_index)
estim <- pagfl(
  y ~ .,
  data = df, index = c("i_index", "t_index"), lambda = 0.5, method = "PLS"
)
summary(estim)