felm(formula, data, exactDOF = FALSE, subset, na.action, contrasts = NULL, weights = NULL, ...)
felm to attempt to compute it, but this may fail if there are too many levels in the factors.
exactDOF='rM' will use the exact method in Matrix::rankMatrix(), but this is slower. If neither of these methods works, it is possible to specify exactDOF='mc', which uses a Monte-Carlo method to estimate the expectation E(x' P x) = tr(P), the trace of a certain projection, a method which may be more accurate than the default guess.
If the degrees of freedom for some reason are known, they can be specified
NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The 'factory-fresh' default is na.omit. Another possible value is NULL, no action. na.exclude is currently not supported.
weights (that is, minimizing sum(w*e^2)); otherwise ordinary least squares is used.
keepX: logical. Include a copy of the expanded data matrix in the return value, as needed by fevcov for proper limited mobility bias correction.
keepCX: logical. Keep a copy of the centred expanded data matrix in the return value, as list elements cX for the explanatory variables and cY for the outcome.
nostats: logical. Don't include covariance matrices in the output, just the estimated coefficients and various descriptive information. For IV, nostats can be a logical vector of length 2, with the last value being used for the 1st stages.
psdef: logical. In case of multiway clustering, the method of Cameron, Gelbach and Miller may yield a non-definite variance matrix. Ordinarily this is forced to be semidefinite by setting negative eigenvalues to zero. Setting psdef=FALSE will switch off this adjustment. Since the variance estimator is asymptotically correct, this should only have an effect when the clustering factors have very few levels.
kclass: character. For use with instrumental variables. Use a k-class estimator rather than 2SLS/IV. Currently, the values 'nagar', 'b2sls', 'mb2sls', 'liml' are accepted, where the names are from Kolesar et al (2014), as well as a numeric value for the 'k' in k-class. With kclass='liml', felm also accepts the argument fuller=, for using a Fuller adjustment of the liml-estimator.
Nboot, bootexpr, bootcluster: Since felm has quite a bit of overhead in the creation of the model matrix, if one wants confidence intervals for some function of the estimated parameters, it is possible to bootstrap internally in felm. That is, the model matrix is resampled Nboot times and estimated, and the bootexpr is evaluated inside an sapply. The estimated coefficients and the left hand side(s) are available by name. Any right hand side variable x is available by the name
"felm"-object for each estimation is available as est. If a bootcluster is specified as a factor, entire levels are resampled. bootcluster can also be a function with no arguments; it should return a vector of integers, the rows to use in the sample. It can also be the string 'model', in which case the cluster is taken from the model. bootexpr should be an expression, e.g. like quote(x/x2 * abs(x3)/mean(y)). It could be wise to specify nostats=TRUE when bootstrapping, unless the covariance matrices are needed in the bootstrap. If you need the covariance matrices in the full estimate, but not in the bootstrap, you can specify it in an attribute
iv, clustervar: deprecated. These arguments will be removed at a later time, but are still supported in this field. Users are STRONGLY encouraged to use multipart formulas instead. In particular, not all functionality is supported with the deprecated syntax; iv-estimations actually run a lot faster if multipart formulas are used, due to new algorithms which I didn't bother to shoehorn in place for the deprecated syntax.
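The cluster resampling described for bootcluster above (entire levels are resampled) can be sketched in plain R. This is a minimal illustration using lm() on toy data, not felm's internal bootstrap; the helper resample_clusters and all variable names are hypothetical.

```r
set.seed(1)

# Toy data: 200 observations in up to 10 clusters (names are illustrative)
n   <- 200
clu <- factor(sample(10, n, replace = TRUE))
x   <- rnorm(n)
y   <- 2 * x + rnorm(nlevels(clu))[clu] + rnorm(n)

# Draw one bootstrap sample of row indices by resampling whole clusters
resample_clusters <- function(cluster) {
  rows_by_level <- split(seq_along(cluster), cluster)
  drawn <- sample(names(rows_by_level), length(rows_by_level), replace = TRUE)
  unlist(rows_by_level[drawn], use.names = FALSE)
}

# Evaluate a statistic on each resampled data set, in the spirit of bootexpr
Nboot <- 200
boot_slopes <- sapply(seq_len(Nboot), function(i) {
  rows <- resample_clusters(clu)
  coef(lm(y[rows] ~ x[rows]))[[2]]
})

# Percentile confidence interval for the slope
quantile(boot_slopes, c(0.025, 0.975))
```

A closure such as function() resample_clusters(clu) has the shape the text describes for a bootcluster function: no arguments, returning an integer vector of rows.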
felm returns an object of class "felm". It is quite similar to an "lm" object, but not entirely compatible.
The generic summary-method will yield a summary which may be
'lm' object, and some postprocessing methods designed for lm may happen to work. It may however be necessary to coerce the object to succeed with this.
The "felm" object is a list containing the following fields:
'felm' objects for the IV 1st stage, if used. The 1st stage has multiple left hand sides if there is more than one instrumented variable.
felm(keepX=TRUE) is specified. Must be included if fevcov is to be used for correcting limited mobility bias.
replicate applied to the
The formula specification is a response variable followed by a four part
formula. The first part consists of ordinary covariates, the second part
consists of factors to be projected out. The third part is an
IV-specification. The fourth part is a cluster specification for the
standard errors. I.e. something like
y ~ x1 + x2 | f1 + f2 | (Q|W ~ x3+x4) | clu1 + clu2
where y is the response, x1,x2 are ordinary covariates, f1,f2 are factors
to be projected out, Q and W are covariates which are instrumented by x3
and x4, and clu1,clu2 are factors to be used for computing cluster
robust standard errors. Parts that are not used should be specified as
0, except if they are at the end of the formula, where they can be
omitted. The parentheses are needed in the third part since | has
higher precedence than ~. Multiple left hand sides like
y|w|x ~ x1 + x2 |f1+f2|... are allowed.
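What "projecting out" a factor means can be sketched in base R for the single-factor case: subtracting group means from the response and the covariate gives the same coefficient as a regression with explicit dummies. This is only an illustration of the idea on toy data; the helper demean is hypothetical and is not how felm handles several factors.

```r
set.seed(42)

# Toy data with one factor to project out (names are illustrative)
n  <- 300
f1 <- factor(sample(8, n, replace = TRUE))
x1 <- rnorm(n) + as.integer(f1) / 4   # covariate correlated with the factor
y  <- 1.5 * x1 + rnorm(nlevels(f1))[f1] + rnorm(n)

# Projecting out a single factor amounts to subtracting group means
demean <- function(v, f) v - ave(v, f)

# Coefficient from the demeaned regression (no intercept needed)
b_projected <- coef(lm(demean(y, f1) ~ demean(x1, f1) - 1))[[1]]

# Coefficient from the regression with explicit dummies
b_dummies <- coef(lm(y ~ x1 + f1))[["x1"]]

all.equal(b_projected, b_dummies)
```

With many levels the dummy regression becomes expensive, which is the motivation for putting factors in the second part of the formula instead.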
Interactions between a covariate
x and a factor
f can be
projected out with the syntax
x:f. The terms in the second and
fourth parts are not treated as ordinary formulas. In particular, it is not
possible to write things like
y ~ x1 | x*f. Instead one would specify
y ~ x1 + x | x:f + f. Note that
f:x also works, since R's
parser does not keep the order. This means that in interactions, the factor
must be a factor, whereas a non-interacted factor will be coerced to
a factor. I.e. in
y ~ x1 | x:f1 + f2, the
f1 must be a factor,
whereas it will work as expected if
f2 is an integer vector.
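Why the interacted variable needs to be a genuine factor can be seen from how base R expands x:f. The sketch below uses model.matrix() on made-up vectors; it shows R's general formula behaviour and is only assumed to mirror the distinction felm relies on.

```r
# Illustrative data: a covariate and a grouping variable stored as integers
x     <- c(0.5, 1.0, 1.5, 2.0, 2.5, 3.0)
f_int <- c(1L, 1L, 2L, 2L, 3L, 3L)
f_fac <- factor(f_int)

# With an integer vector, x:f is the single numeric product x * f
m_int <- model.matrix(~ x:f_int - 1)

# With a factor, x:f expands to one slope column per level
m_fac <- model.matrix(~ x:f_fac - 1)

dim(m_int)
dim(m_fac)
colnames(m_fac)
```

The integer version yields one column, so it cannot represent a separate slope for each group.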
In older versions of lfe the syntax was
felm(y ~ x1 + x2 + G(f1)
+ G(f2), iv=list(Q ~ x3+x4, W ~ x3+x4), clustervar=c('clu1','clu2')). This
syntax still works, but yields a warning. Users are strongly
encouraged to change to the new multipart formula syntax. The old syntax
will be removed at a later time.
The standard errors are adjusted for the reduced degrees of freedom coming
from the dummies which are implicitly present. In the case of two factors,
the exact number of implicit dummies is easy to compute. If there are more
factors, the number of dummies is estimated by assuming there's one
reference-level for each factor; this may be a slight over-estimation,
leading to slightly too large standard errors. Setting exactDOF='rM'
computes the exact degrees of freedom with
rankMatrix() in package Matrix.
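The counting of implicit dummies can be made concrete on a small example. The sketch below builds the dummy matrix explicitly (which felm avoids), takes its rank with base R's qr() as a stand-in for Matrix::rankMatrix(), and also estimates the same number as the trace of a projection by Monte Carlo. The factors and the number of draws are made up, and the Monte-Carlo part is a generic trace estimator, not lfe's implementation.

```r
set.seed(7)

# Two fully crossed illustrative factors: 4 and 6 levels, 48 observations
f1 <- factor(rep(1:4, each = 12))
f2 <- factor(rep(1:6, times = 8))

# Explicit dummy matrix for both factors
D   <- cbind(model.matrix(~ f1 - 1), model.matrix(~ f2 - 1))
qrD <- qr(D)

# Exact number of independent dummies: the rank of D
exact_rank <- qrD$rank

# Two connected factors: all levels minus one redundant reference level
simple_count <- nlevels(f1) + nlevels(f2) - 1

# Monte-Carlo estimate of tr(P), where P projects onto the columns of D.
# For standard normal x, E(x' P x) = tr(P) = rank(D).
draws <- replicate(2000, {
  x <- rnorm(nrow(D))
  sum(x * qr.fitted(qrD, x))
})
mc_estimate <- mean(draws)

c(exact = exact_rank, count = simple_count, mc = mc_estimate)
```

The Monte-Carlo value is noisy but needs only the ability to apply the projection, which is why it remains feasible when the dummy matrix is too large for a rank computation.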
For the iv-part of the formula, it is only necessary to include the
instruments on the right hand side. The other explanatory covariates, from
the first and second part of
formula, are added automatically in the
first stage regression. See the examples.
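The role of the other covariates in the first stage can be sketched with a hand-rolled two-stage regression in base R. The data and names below are invented for illustration, there are no projected-out factors, and lm() stands in for felm; the point is only that the exogenous covariate x appears in the first stage alongside the instrument x3.

```r
set.seed(3)

# Illustrative data: Q is endogenous, x3 is an instrument, x is exogenous
n  <- 500
x  <- rnorm(n)
x3 <- rnorm(n)
u  <- rnorm(n)
Q  <- 0.8 * x3 + 0.5 * x + 0.6 * u + rnorm(n)   # correlated with the error u
y  <- 1 + 2 * Q + x + u

# First stage: the instrument AND the exogenous covariate on the right hand side
stage1 <- lm(Q ~ x3 + x)
Q_hat  <- fitted(stage1)

# Second stage: replace Q by its first-stage fitted values
b_2sls <- coef(lm(y ~ Q_hat + x))[["Q_hat"]]

# Same estimate from the just-identified IV formula (Z'X)^{-1} Z'y
X <- cbind(1, Q, x)
Z <- cbind(1, x3, x)
b_iv <- solve(crossprod(Z, X), crossprod(Z, y))[2]

all.equal(b_2sls, b_iv)
```

The standard errors printed by the second-stage lm() are not valid IV standard errors; only the point estimate carries over.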
The contrasts argument is similar to the one in
lm(); it is
used for factors in the first part of the formula. The factors in the second
part are analyzed as part of a possible subsequent
The old syntax with a single part formula with the
G() syntax for the
factors to transform away is still supported, as well as the
clustervar and iv arguments, but users are encouraged to move
to the new multi part formulas as described here. The
clustervar and iv arguments have been moved to the
... argument list. They
will be removed in some future update.
Kolesar, M., R. Chetty, J. Friedman, E. Glaeser, and G.W. Imbens (2014) Identification and Inference with Many Invalid Instruments, Journal of Business & Economic Statistics (to appear). http://dx.doi.org/10.1080/07350015.2014.978175
library(lfe)
oldopts <- options(lfe.threads=1)

## create covariates
x <- rnorm(1000)
x2 <- rnorm(length(x))

## individual and firm
id <- factor(sample(20,length(x),replace=TRUE))
firm <- factor(sample(13,length(x),replace=TRUE))

## effects for them
id.eff <- rnorm(nlevels(id))
firm.eff <- rnorm(nlevels(firm))

## left hand side
u <- rnorm(length(x))
y <- x + 0.5*x2 + id.eff[id] + firm.eff[firm] + u

## estimate and print result
est <- felm(y ~ x+x2| id + firm)
summary(est)

## Not run:
# ## compare with lm
# summary(lm(y ~ x + x2 + id + firm-1))
# ## End(Not run)

# make an example with 'reverse causation'
# Q and W are instrumented by x3 and the factor x4. Report robust s.e.
x3 <- rnorm(length(x))
x4 <- sample(12,length(x),replace=TRUE)

Q <- 0.3*x3 + x + 0.2*x2 + id.eff[id] + 0.3*log(x4) - 0.3*y + rnorm(length(x),sd=0.3)
W <- 0.7*x3 - 2*x + 0.1*x2 - 0.7*id.eff[id] + 0.8*cos(x4) - 0.2*y + rnorm(length(x),sd=0.6)

# add them to the outcome
y <- y + Q + W

ivest <- felm(y ~ x + x2 | id+firm | (Q|W ~ x3+factor(x4)))
summary(ivest,robust=TRUE)
condfstat(ivest)

## Not run:
# # compare with the not instrumented fit:
# summary(felm(y ~ x + x2 +Q + W |id+firm))
# ## End(Not run)

options(oldopts)