Use integration of the joint distribution of the coefficients to compute the expectation of some function of the coefficients. Can be used for non-linear inference tests.
nlexpect(
est,
fun,
coefs,
...,
tol = getOption("lfe.etol"),
lhs = NULL,
cv,
istats = FALSE,
flags = list(verbose = 0),
max.eval = 200000L,
method = c("hcubature", "pcubature", "cuhre", "suave", "vegas", "divonne"),
vectorize = FALSE
)
object of class "felm"
or "lm"
, a result of a call to
felm
or lm
.
function of coefficients to be integrated. Can also be a
quote
d expression.
character. Names of coefficients to test. Only needed if
fun
is a function.
other arguments passed to fun or the integration routine.
numeric. Tolerance for the computed integral.
character. Name of the left hand side, if est
has more
than one.
Covariance matrix to use in place of vcov(est)
logical. Should convergence information from the integration routine be included as attributes?
list. Additional flags for the underlying integration routine. Not used after the package R2Cuba disappeared.
integer. Maximum number of integral evaluations.
character. A method specification usable by cubature::cubintegrate
.
The documentation there says that "pcubature"
is good for smooth integrands of low dimensions.
logical or numeric. Use vectorized function evaluation from package
cubature. This can speed up integration significantly. If method is from the Cuba library
(i.e. not pcubature or hcubature), vectorize
should be specified as a numeric, a vectorization
factor. The default is 128.
The function nlexpect
computes and returns the expectation of
the function fun(beta)
, with beta
a vector of coefficients.
I.e., if the coefficients beta
are bootstrapped a large number of
times, nlexpect(est, fun)
should be equal to mean(fun(beta))
.
The function nlexpect
integrates the function fun(x)
over the
multivariate normal distribution specified by the point estimates and the
covariance matrix vcov(est)
. This is the expectation of
fun(beta)
if we were to bootstrap the data (e.g. by drawing the
residuals anew) and do repeated estimations.
The list of coefficients used by fun
must be specified in
coefs
.
If the function is simple, it can be specified as a quoted expression like
quote(a*b+log(abs(d)))
. In this case, if coefs
is not
specified, it will be set to the list of all the variables occuring in the
expression which are also names of coefficients.
fun
may return a vector of values, in which case a vector of
expectations is computed, like quote(c(a*b, a^3-b))
. However, if the
expressions contain different variables, like quote(c(a*b, d*e))
, a
quite compute intensive 4-dimensional integral will be computed, compared to
two cheap 2-dimensional integrals if you do them separately. There is nothing to gain
from using vector-valued functions compared to multiple calls to nlexpect()
.
You may of course also integrate inequalites like quote(abs(x1-0.2) >
0.2)
to simulate the probability from t-tests or Wald-tests. See the
examples.
The function you provide will get an argument ...
if it does not have
one already. It will also be passed an argument .z
which contains
the actual coefficients in normalized coordinates, i.e. if ch
is the
Cholesky decomposition of the covariance matrix, and pt
are the point
estimates, the coefficients will be pt + ch %*% .z
. The first argument
is a vector with names corresponding to the coefficients.
If you specify vectorized=TRUE
, your function will be passed a list with vectors
in its first argument. The function must
be able to handle a list, and must return a vector of the same length as the vectors
in the list. If you pass an expression like x < y
, each variable will be a vector.
If your function is vector valued, it must return a matrix where each
column is the values.
The tol
argument specifies both the relative tolerance and the
absolute tolerance. If these should not be the same, specify tol
as a
vector of length 2. The first value is the relative tolerance, the second is
the absolute tolerance. Termination occurs when at least one of the
tolerances is met.
The ...
can be used for passing other arguments to the integration
routine cubature::cubintegrate
and the function to be integrated.
# NOT RUN {
N <- 100
x1 <- rnorm(N)
# make some correlation
x2 <- 0.1*rnorm(N) + 0.1*x1
y <- 0.1*x1 + x2 + rnorm(N)
summary(est <- felm(y ~ x1 + x2))
pt1 <- coef(est)['x1']
pt2 <- coef(est)['x2']
# expected values of coefficients, should match the summary
# and variance, i.e. square of standard errors in the summary
nlexpect(est, quote(c(x1=x1,x2=x2,var=c((x1-pt1)^2,(x2-pt2)^2))))
# }
# NOT RUN {
# the covariance matrix:
nlexpect(est, tcrossprod(as.matrix(c(x1-pt1,x2-pt2))))
# }
# NOT RUN {
#Wald test of single variable
waldtest(est, ~x1)['p.F']
# the same with nlexpect, i.e. probability for observing abs(x1)>abs(pt1) conditional
# on E(x1) = 0.
nlexpect(est, (x1-pt1)^2 > pt1^2, tol=1e-7, vectorize=TRUE)
# which is the same as
2*nlexpect(est, x1*sign(pt1) < 0)
# Here's a multivalued, vectorized example
nlexpect(est, rbind(a=x1*x2 < pt1, b=x1*x2 > 0), vectorize=TRUE, method='divonne')
# }
# NOT RUN {
# Non-linear test:
# A simple one, what's the probability that product x1*x2 is between 0 and |E(x1)|?
nlexpect(est, x1*x2 > 0 & x1*x2 < abs(pt1), vectorize=TRUE, method='divonne')
# Then a more complicated one with the expected value of a polynomal in the coefficients
f <- function(x) c(poly=x[['x1']]*(6*x[['x1']]-x[['x2']]^2))
# This is the linearized test:
waldtest(est, f)['p.F']
# In general, for a function f, the non-linear Wald test is something like
# the following:
# expected value of function
Ef <- nlexpect(est, f, coefs=c('x1','x2'))
# point value of function
Pf <- f(c(pt1,pt2))
# similar to a Wald test, but non-linear:
nlexpect(est, function(x) (f(x)-Ef)^2 > Pf^2, c('x1','x2'), vectorize=TRUE)
# one-sided
nlexpect(est, function(x) f(x)-Ef > abs(Pf), c('x1','x2'), vectorize=TRUE)
# other sided
nlexpect(est, function(x) f(x)-Ef < -abs(Pf), c('x1','x2'), vectorize=TRUE)
# }
# NOT RUN {
# }
Run the code above in your browser using DataLab