feglm can be used to fit generalized linear models with many high-dimensional fixed effects. The estimation procedure is based on unconditional maximum likelihood and can be interpreted as a “weighted demeaning” approach.
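One way to read the "weighted demeaning" interpretation is sketched below. It assumes the likelihood is maximized by iteratively reweighted least squares; the symbols \(X\) (regressors), \(D\) (dummy matrix for the fixed effects), \(z\) (working response), and \(W\) (diagonal matrix of working weights) are introduced here for illustration and do not come from the package documentation.

```latex
\begin{align*}
% Linear predictor with structural parameters and fixed effects
\eta &= X\beta + D\alpha \\
% Weighted least squares problem solved in each iteration
(\hat\beta, \hat\alpha) &= \arg\min_{\beta,\alpha}\;
  (z - X\beta - D\alpha)^\top W \,(z - X\beta - D\alpha) \\
% Weighted projection that removes the fixed effects
M &= I - D\,(D^\top W D)^{-1} D^\top W, \qquad
  \tilde{X} = M X, \qquad \tilde{z} = M z \\
% Update for the structural parameters using only demeaned quantities
\hat\beta &= (\tilde{X}^\top W \tilde{X})^{-1}\, \tilde{X}^\top W \tilde{z}
\end{align*}
```

With a single fixed effects category, applying \(M\) amounts to subtracting the \(W\)-weighted group mean from each column, so the dummy matrix never has to be formed explicitly.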
Remark: The term fixed effect is used in the econometrician's sense of having an intercept for each level of each category.
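For the Gaussian family with the identity link, the demeaning idea reduces to ordinary within-group demeaning. The following is a minimal base R sketch of that special case, using lm() and ave() rather than feglm itself; the object names are illustrative only.

```r
# Fixed effects in the econometric sense: one intercept per level of cyl
fit_dummies <- lm(mpg ~ wt + factor(cyl), data = mtcars)

# Equivalent demeaning: subtract the cyl-group means from response and regressor
mpg_dm <- mtcars$mpg - ave(mtcars$mpg, mtcars$cyl)
wt_dm  <- mtcars$wt  - ave(mtcars$wt,  mtcars$cyl)
fit_demeaned <- lm(mpg_dm ~ wt_dm - 1)

# The slope on wt should agree between the two fits
slope_dummies  <- unname(coef(fit_dummies)["wt"])
slope_demeaned <- unname(coef(fit_demeaned)["wt_dm"])
all.equal(slope_dummies, slope_demeaned)
```

The dummy-variable fit estimates the intercepts explicitly, while the demeaned fit concentrates them out. The latter is what keeps the problem tractable when a category has many levels.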
feglm(
formula = NULL,
data = NULL,
family = gaussian(),
weights = NULL,
beta_start = NULL,
eta_start = NULL,
control = NULL
)
A named list of class "feglm". The list contains the following fifteen elements:
a named vector of the estimated coefficients
a vector of the linear predictor
a vector of the weights used in the estimation
a matrix with the numerical second derivatives
the deviance of the model
the null deviance of the model
a logical indicating whether the model converged
the number of iterations needed to converge
a named vector with the number of observations used in the estimation, also reporting the number of dropped and perfectly predicted observations
a named vector with the number of levels in each fixed effects category
a list with the names of the fixed effects variables
the formula used in the model
the data used in the model after dropping non-contributing observations
the family used in the model
the control list used in the model
formula: an object of class "formula" (a symbolic description of the model to be fitted). formula must be of type y ~ X | k, where the second part of the formula refers to factors to be concentrated out. It is also possible to pass clustering variables to feglm as y ~ X | k | c.
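To make the three-part layout concrete, here is a small base R sketch that pulls a formula of the form y ~ X | k | c apart by hand. It only illustrates how R parses such an object; it is not how feglm processes its formula argument internally, and the variable names are illustrative.

```r
f <- mpg ~ wt | cyl | am

# All variables that must be present in the data
vars <- all.vars(f)

# `|` is left-associative, so the right-hand side parses as (wt | cyl) | am
rhs <- f[[3]]
cluster_part <- rhs[[3]]       # third part: clustering variable
fe_part      <- rhs[[2]][[3]]  # second part: factor to be concentrated out
x_part       <- rhs[[2]][[2]]  # first part: regressors
response     <- f[[2]]         # left-hand side

print(vars)
print(list(response = response, X = x_part, k = fe_part, c = cluster_part))
```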
data: an object of class "data.frame" containing the variables in the model. The expected input is a dataset with the variables specified in formula and a number of rows at least equal to the number of variables in the model.
family: the description of the error distribution and link function to be used in the model. Similar to glm.fit, this has to be the result of a call to a family function. Default is gaussian(). See family for details of family functions.
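As a brief illustration of what "the result of a call to a family function" looks like, the base R sketch below builds two family objects and inspects the pieces a fitting routine typically relies on. It uses only the stats package; the object names are illustrative.

```r
# A family object bundles the error distribution with its link function
fam <- poisson(link = "log")
class(fam)
fam$family
fam$link

# The link and its inverse are stored as functions
link_at_2    <- fam$linkfun(2)   # log link applied to a mean of 2
inverse_at_0 <- fam$linkinv(0)   # inverse link applied to a linear predictor of 0

# The default family of feglm, according to the usage above
default_fam <- gaussian()
default_fam$link
```

Passing the function call, as in family = poisson(link = "log"), rather than a string, is what the argument description asks for.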
weights: an optional string with the name of the 'prior weights' variable in data.
beta_start: an optional vector of starting values for the structural parameters in the linear predictor. Default is \(\boldsymbol{\beta} = \mathbf{0}\).
eta_start: an optional vector of starting values for the linear predictor.
control: a named list of parameters for controlling the fitting process. See fit_control for details.
If feglm does not converge, this is often a sign of linear dependence between one or more regressors and a fixed effects category. In this case, you should carefully inspect your model specification.
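One way to inspect a specification for this kind of linear dependence is sketched below in base R. It is a diagnostic suggestion, not a feature of feglm: the regressor cyl_num is constructed for illustration so that it varies only across levels of the fixed effects category cyl.

```r
d <- mtcars
d$cyl_num <- as.numeric(d$cyl)  # constant within each level of cyl

# Design matrix with explicit dummies for the fixed effects category
X <- model.matrix(~ wt + cyl_num + factor(cyl), data = d)

# A rank below the number of columns signals linear dependence
rank_deficient <- qr(X)$rank < ncol(X)

# Equivalent check: the regressor vanishes after within-group demeaning
max_within_dev <- max(abs(d$cyl_num - ave(d$cyl_num, d$cyl)))

print(rank_deficient)
print(max_within_dev)
```

A regressor whose within-group deviations are all (numerically) zero carries no information once the fixed effects are concentrated out, so its coefficient cannot be identified.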
Gaure, S. (2013). "OLS with Multiple High Dimensional Category Variables". Computational Statistics and Data Analysis, 66.
Marschner, I. (2011). "glm2: Fitting generalized linear models with convergence problems". The R Journal, 3(2).
Stammann, A., F. Heiss, and D. McFadden (2016). "Estimating Fixed Effects Logit Models with Large Panel Data". Working paper.
Stammann, A. (2018). "Fast and Feasible Estimation of Generalized Linear Models with High-Dimensional k-Way Fixed Effects". ArXiv e-prints.
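# Poisson model with cyl as the fixed effects category
mod <- feglm(mpg ~ wt | cyl, mtcars, family = poisson(link = "log"))
summary(mod)

# Same model with am passed as a clustering variable
mod <- feglm(mpg ~ wt | cyl | am, mtcars, family = poisson(link = "log"))
summary(mod, type = "clustered")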