panelAR (version 0.1)

panelAR: Estimation of Linear AR(1) Panel Data Models with Cross-Sectional Heteroskedasticity and/or Correlation

Description

The function estimates linear models on panel data structures in the presence of AR(1)-type autocorrelation as well as panel heteroskedasticity and/or contemporaneous correlation. First, AR(1)-type autocorrelation is addressed via a two-step Prais-Winsten feasible generalized least squares (FGLS) procedure, where the autocorrelation coefficients may be panel-specific. Subsequently, one can choose to implement ‘sandwich’-type robust standard errors with OLS, panel weighted least squares (WLS), panel-corrected standard errors (PCSEs), or the Parks-Kmenta FGLS estimator.

Usage

panelAR(formula, data, panelVar, timeVar, autoCorr = c("ar1", "none", "psar1"), panelCorrMethod = c("none", "phet", "pcse", "pwls", "parks"), rhotype = "breg", bound.rho = FALSE, rho.na.rm = FALSE, panel.weight = c("t-1", "t"), dof.correction = FALSE, complete.case = FALSE, seq.times = FALSE, singular.ok = TRUE)

Arguments

formula
an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted.
data
a data frame containing the variables in the model, as well as the variables defining the units and time.
panelVar
the column name of data that contains the panel ID. It cannot contain any NAs. May be set to NULL, in which case all observations are assumed to belong to the same unit.
timeVar
the column of data that contains the time ID. It must be a vector of integers and cannot contain any NAs. Duplicate time observations per panel are not allowed. At least two time periods are required.
autoCorr
character string denoting structure of autocorrelation in the data: ar1 denotes AR(1)-type autocorrelation with a common correlation coefficient across all panels, psar1 denotes AR(1)-type autocorrelation with a unique correlation coefficient for each panel, and none denotes no autocorrelation. Default: ar1.
panelCorrMethod
character string denoting method used for dealing with panel heteroskedasticity and/or correlation. none denotes homoskedasticity and no correlation across panels, phet denotes a Huber-White style sandwich estimator for panel heteroskedasticity, pcse denotes panel-corrected standard errors that are robust to both heteroskedasticity and contemporaneous correlation across panels, pwls denotes that a panel weighted least squares procedure is used to deal with panel heteroskedasticity, and parks means that Parks-Kmenta FGLS is used to estimate both panel heteroskedasticity and correlation. Default: none.
rhotype
character string denoting method used for estimating autocorrelation coefficient, $\rho$. Possible options are breg, scorr, freg, theil, dw, and theil-nagar. See ‘Details’. Default: breg.
bound.rho
logical. If TRUE, the panel-specific autocorrelation coefficient $\rho_i$ is bounded to $[-1,1]$ in the calculation of $\rho$; used only for autoCorr="ar1". Default: FALSE.
rho.na.rm
logical. If FALSE and $\rho_i$ cannot be calculated for a panel, the function returns an error. If TRUE, $\rho_i$s that are NA are ignored when calculating a common AR(1) coefficient or set to 0 when calculating panel-specific AR(1) coefficients. Default: FALSE.
panel.weight
the weight to be used for each panel when combining panel-specific autocorrelations $\rho_i$ to a common $\rho$. Weight is either the number of time periods in the corresponding panel (t) or the number of time periods minus 1 (t-1). Default: t-1.
dof.correction
logical. If TRUE, standard errors are adjusted by a factor of $N/(N-k)$, where $N$ is the total number of observations and $k$ is the rank of the linear model. Default: FALSE.
complete.case
logical. If TRUE, use only the time periods where every panel has a valid observation in the estimation of PCSEs or the Parks-Kmenta estimator. Otherwise, use pairwise procedure. Default: FALSE.
seq.times
logical. If TRUE, observations are temporally ordered by panel and assigned a sequential time variable that ignores any gaps in the runs. Default: FALSE.
singular.ok
logical. If FALSE, a singular fit results in an error. Default: TRUE.
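The interplay of bound.rho, rho.na.rm, and panel.weight can be illustrated with a small base-R sketch. It assumes that the common coefficient is a weighted mean of the panel-specific estimates, which the argument descriptions suggest but do not spell out. The helper name combine_rho and the toy inputs are hypothetical and not part of the package.

```r
# Hypothetical helper (not part of panelAR): combine panel-specific rho_i
# into a common rho. Assumption: the common rho is a weighted mean of rho_i.
combine_rho <- function(rho_i, T_i, panel.weight = c("t-1", "t"),
                        bound.rho = FALSE, rho.na.rm = FALSE) {
  panel.weight <- match.arg(panel.weight)
  if (anyNA(rho_i)) {
    if (!rho.na.rm) stop("rho_i could not be calculated for some panels")
    keep <- !is.na(rho_i)          # ignore panels without a usable rho_i
    rho_i <- rho_i[keep]
    T_i <- T_i[keep]
  }
  if (bound.rho) rho_i <- pmin(pmax(rho_i, -1), 1)  # clamp to [-1, 1]
  w <- if (panel.weight == "t") T_i else T_i - 1
  sum(w * rho_i) / sum(w)
}

# Three panels: one estimate outside [-1, 1], one that could not be computed.
rho_i <- c(0.5, 1.2, NA)
T_i <- c(11, 6, 1)
rho_common <- combine_rho(rho_i, T_i, panel.weight = "t-1",
                          bound.rho = TRUE, rho.na.rm = TRUE)
```

With these inputs the third panel is dropped, the second estimate is clamped to 1, and the remaining two are weighted by 10 and 5.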

Value

panelAR returns an object of class "panelAR". The function summary can be used to obtain and print a summary of the results. Note that the default methods coefficients, fitted.values, and residuals return vectors of regression coefficients, fitted values, and residuals, respectively. vcov returns the estimated variance-covariance matrix of the coefficients. An object of class "panelAR" contains the following components, very similar to the outputs of the standard lm function:
coefficients
the named vector of coefficients.
residuals
the residuals.
fitted.values
the fitted mean values.
rank
the numeric rank of the fitted linear model.
df.residual
the residual degrees of freedom.
call
the matched call.
terms
the terms object used.
model
the model frame used.
aliased
named logical vector designating if original coefficients are aliased.
na.action
information returned by model.frame in the handling of NAs.
vcov
estimated variance-covariance matrix of coefficients.
r2
$R^2$ based on quasi-differenced data from the Prais-Winsten regression. Set to NULL if PWLS or Parks-Kmenta procedures are used.
panelStructure
a list of several objects which contain information on the panel structure of the data. See details below.
Details of panelStructure:
obs.mat
logical matrix of dimension $N_p \times T$, where $N_p$ is the number of panels. If cell value is TRUE, panel $i$ at time $t$ has a valid observation. Panel structure is balanced if entire matrix is TRUE.
rho
autocorrelation parameters. Scalar if "ar1" option was used, vector of length $N_p$ (number of panels) if "psar1" option was used, and NULL if "none" option was used.
Sigma
$N_p \times N_p$ matrix of estimated panel covariances.
N.cov
number of panel covariances estimated.
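As a rough illustration of what obs.mat encodes, the following base-R sketch builds an equivalent logical matrix from a toy unbalanced panel. It is not the package's internal code, and the column names unit and year are hypothetical.

```r
# Toy unbalanced panel; `unit` and `year` are hypothetical column names.
toy <- data.frame(
  unit = c("A", "A", "A", "B", "B", "C"),
  year = c(2000L, 2001L, 2002L, 2000L, 2002L, 2001L)
)

# Rows are panels, columns are time periods; TRUE marks a valid observation.
obs.mat <- unclass(table(toy$unit, toy$year)) > 0

# The panel structure is balanced only if every cell is TRUE.
is_balanced <- all(obs.mat)

# Number of observed time periods per panel (T_i).
T_i <- rowSums(obs.mat)
```

Here panel B has a gap in 2001 and panel C has a single observation, so the structure is unbalanced.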

Details

Function for running two-step Prais-Winsten models on panel data that exhibit AR(1)-type autocorrelation. Following the two-step estimation, one can choose to use a ‘sandwich’-type robust standard error estimator with OLS or a panel weighted least squares estimator to address panel heteroskedasticity. Alternatively, if panels are both heteroskedastic and contemporaneously correlated, the package supports panel-corrected standard errors (PCSEs) as well as the Parks-Kmenta FGLS estimator. Note that the Parks-Kmenta estimator should ideally be reserved for use only when the number of time periods is significantly greater than the number of panels (see Beck and Katz 1995). The function is robust to unbalanced panel structures, panels with just one observation, multiple runs per panel, and the presence of panels without any overlapping observations.

While generally designed to estimate Prais-Winsten models on panel data, setting panelVar to NULL will estimate an AR(1) time-series model treating the entire dataset as one unit. In this case, the panelCorrMethod is ignored since equal variances are assumed across all observations.
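The two-step procedure can be sketched for a single series (the panelVar = NULL case) in base R. This is a minimal illustration under stated assumptions, not panelAR's internal code: the data are simulated, $\rho$ is estimated with the breg form, and all variable names are made up for the example.

```r
# Two-step Prais-Winsten for one series (a sketch, not panelAR's internals).
set.seed(1)
n <- 200
x <- rnorm(n)
e <- as.numeric(arima.sim(list(ar = 0.6), n = n))  # AR(1) errors, rho = 0.6
y <- 1 + 2 * x + e

# Step 1: OLS, then estimate rho from the residuals (the 'breg' form).
ols <- lm(y ~ x)
res <- unname(residuals(ols))
rho <- sum(res[-1] * res[-n]) / sum(res[-n]^2)

# Step 2: Prais-Winsten transformation. The first observation is kept and
# scaled by sqrt(1 - rho^2); the remaining ones are quasi-differenced.
pw <- function(v) c(sqrt(1 - rho^2) * v[1], v[-1] - rho * v[-n])
X <- cbind(intercept = 1, x = x)
X_star <- apply(X, 2, pw)
y_star <- pw(y)

# OLS on the transformed data; the intercept column is already in X_star.
fit <- lm(y_star ~ X_star - 1)
coef(fit)
```

Keeping the rescaled first observation is what distinguishes Prais-Winsten from a plain quasi-differencing approach that drops it.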

A number of common estimators for the autocorrelation coefficient are supported. Specifically:

breg
Linear regression estimator: $\hat{\rho}_{breg} = \frac{\sum_{t=2}^{T_i} \hat{\epsilon}_{i,t}\hat{\epsilon}_{i,t-1}}{\sum_{t=1}^{T_i-1} \hat{\epsilon}_{i,t}^2}$

scorr
Sample correlation coefficient estimator: $\hat{\rho}_{scorr} = \frac{\sum_{t=2}^{T_i} \hat{\epsilon}_{i,t}\hat{\epsilon}_{i,t-1}}{\sum_{t=1}^{T_i} \hat{\epsilon}_{i,t}^2}$

freg
Forward linear regression estimator: $\hat{\rho}_{freg} = \frac{\sum_{t=1}^{T_i-1} \hat{\epsilon}_{i,t}\hat{\epsilon}_{i,t+1}}{\sum_{t=1}^{T_i-1} \hat{\epsilon}_{i,t+1}^2}$

theil
Theil estimator: $\hat{\rho}_{theil} = \hat{\rho}_{scorr} \frac{T_i-k}{T_i-1}$

dw
Durbin-Watson estimator: $\hat{\rho}_{dw} = 1-\frac{1}{2} \frac{\sum_{t=2}^{T_i} (\hat{\epsilon}_{i,t}-\hat{\epsilon}_{i,t-1})^2}{\sum_{t=1}^{T_i} \hat{\epsilon}_{i,t}^2}$

theil-nagar
Theil-Nagar estimator: $\hat{\rho}_{theil-nagar} = \frac{T_i^2 \hat{\rho}_{dw} + k^2}{T_i^2-k^2}$

In the expressions above, $\hat{\epsilon}$ denotes observed residuals from the first stage OLS regression, $T_i$ is the number of observations in panel $i$, and $k$ is the rank of the model matrix. Some of these estimators cannot be calculated for panels with one observation or multiple runs of one observation. In these cases, rho.na.rm controls the treatment of these autocorrelation coefficients. If TRUE, panels where $\rho_i$ is NA are ignored when calculating a common AR(1) coefficient, and their coefficients are set to 0 when calculating panel-specific AR(1) coefficients.
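The six formulas can be written out directly in base R for a single panel's residual vector. The helper name rho_estimators is hypothetical and the residuals are made-up numbers; the sketch only transcribes the expressions above and assumes one uninterrupted run of observations.

```r
# Hypothetical helper: the six estimators for one panel's residual vector.
# `e` holds first-stage OLS residuals in time order; `k` is the model rank.
rho_estimators <- function(e, k) {
  T_i <- length(e)
  e_lead <- e[-1]     # e_t,     t = 2, ..., T_i
  e_lag  <- e[-T_i]   # e_{t-1}, t = 2, ..., T_i
  cross <- sum(e_lead * e_lag)
  scorr <- cross / sum(e^2)
  dw <- 1 - 0.5 * sum((e_lead - e_lag)^2) / sum(e^2)
  c(breg = cross / sum(e_lag^2),
    scorr = scorr,
    freg = cross / sum(e_lead^2),
    theil = scorr * (T_i - k) / (T_i - 1),
    dw = dw,
    "theil-nagar" = (T_i^2 * dw + k^2) / (T_i^2 - k^2))
}

e <- c(1, -1, 2, 0)
rho_hat <- rho_estimators(e, k = 1)
```

For a panel with a single observation the cross-product and the lagged sums are empty, so breg and freg evaluate to NaN in this sketch, which mirrors the cases rho.na.rm is meant to handle.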

If PCSEs or the Parks-Kmenta estimator are selected, the default is to use all pairwise observations to estimate the time-constant covariances across units. In the case of no overlapping observations between panels, the panel covariance is assumed to be 0. If complete.case is set to TRUE, then only the time periods where every panel has a valid observation are used for the calculation of the contemporaneous correlation matrix.
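The difference between the pairwise and complete-case procedures can be sketched in base R. The helper panel_cov is hypothetical, and the sketch assumes each covariance is the mean cross-product of residuals over the periods two panels share; the package's exact scaling may differ.

```r
# Residuals as a panels x time matrix; NA marks a missing observation.
E <- rbind(A = c(1, -1, 2, NA),
           B = c(0.5, NA, 1, -1),
           C = c(NA, NA, NA, 2))

# Hypothetical helper: Sigma[i, j] is the mean of e_it * e_jt over the
# periods in which both panels i and j are observed.
panel_cov <- function(E, complete.case = FALSE) {
  if (complete.case) {
    # Keep only periods where every panel has a valid observation.
    E <- E[, colSums(is.na(E)) == 0, drop = FALSE]
  }
  Np <- nrow(E)
  Sigma <- matrix(0, Np, Np, dimnames = list(rownames(E), rownames(E)))
  for (i in seq_len(Np)) {
    for (j in seq_len(Np)) {
      shared <- !is.na(E[i, ]) & !is.na(E[j, ])
      # No overlapping periods: the covariance stays at 0.
      if (any(shared)) Sigma[i, j] <- mean(E[i, shared] * E[j, shared])
    }
  }
  Sigma
}

# Pairwise: A and C never overlap, so their covariance is 0.
Sigma_pairwise <- panel_cov(E)

# Complete case on panels A and B only: periods 1 and 3 are used throughout.
Sigma_complete <- panel_cov(E[c("A", "B"), ], complete.case = TRUE)
```

In this toy example the variance of panel A is 2 under the pairwise procedure (three periods) and 2.5 under the complete-case procedure (two periods), which shows how the two options can give different estimates from the same data.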

References

Beck, Nathaniel and Jonathan N. Katz. 1995. “What to do (and not to do) with time-series cross-section data.” Am. Polit. Sci. Rev. 89:634-47.

Greene, William H. 2012. Econometric Analysis. 7th ed. Prentice Hall.

Judge, George G., William E. Griffiths, R. Carter Hill, Helmut Lütkepohl, and Tsoung-Chao Lee. 1985. The Theory and Practice of Econometrics. 2nd ed. John Wiley & Sons.

Prais, S., and C. Winsten. 1954. “Trend Estimation and Serial Correlation.” Cowles Commission Discussion Paper No. 383, Chicago.

See Also

summary.panelAR for summary.

predict.panelAR for prediction.

plot.panelAR to plot image of panel structure.

run.analysis for analysis of runs.

Examples

# Common AR(1) with PCSE
data(Rehm)
out <- panelAR(NURR ~ gini + mean_ur + selfemp + cum_right + tradeunion + deficit + 
tradeopen + gdp_growth, data=Rehm, panelVar='ccode', timeVar='year', autoCorr='ar1', 
panelCorrMethod='pcse', rho.na.rm=TRUE, panel.weight='t-1', bound.rho=TRUE)
summary(out)

# Panel-specific AR(1) with PCSE
data(WhittenWilliams)
# expect warning urging to use 'complete.case=FALSE' 
out2 <- panelAR(milex_gdp~lag_milex_gdp+GOV_rl+gthreat+GOV_min+GOV_npty+election_yr+
lag_real_GDP_gr+cinclag+lag_alliance+lag_cinc_ratio+lag_us_change_milex_gdp, 
data=WhittenWilliams, panelVar="ccode", timeVar="year", autoCorr="psar1", 
panelCorrMethod="pcse", complete.case=TRUE) 
summary(out2)
summary(out2)$rho

# Panel-specific AR(1) correlation with PWLS
data(BrooksKurtz)
out3 <- panelAR(kaopen ~ ldiffpeer + ldiffisi + ldiffgrowth + ldiffinflation + 
ldiffneg + ldiffembi + limf + isi_objective + partisan + checks +  lusffr + 
linflation + lbankra + lcab + lgrowth +  ltradebalance + lngdpcap + lngdp + 
brk + timetrend + y1995, data=BrooksKurtz, panelVar='country', timeVar='year', 
autoCorr='psar1', panelCorrMethod='pwls',rho.na.rm=TRUE, panel.weight='t', 
seq.times=TRUE)
summary(out3)
