This function provides a general framework for using the marginal treatment effect (MTE) to extrapolate. The model is the same binary treatment instrumental variable (IV) model considered by Imbens and Angrist (1994) and Heckman and Vytlacil (2005). The framework on which this function is based was developed by Mogstad, Santos and Torgovitsky (2018). See also the recent survey paper on extrapolation in IV models by Mogstad and Torgovitsky (2018). A detailed description of the package and its features can be found in Shea and Torgovitsky (2019).
ivmte(data, target, late.from, late.to, late.X, genlate.lb, genlate.ub,
      target.weight0 = NULL, target.weight1 = NULL, target.knots0 = NULL,
      target.knots1 = NULL, m0, m1, uname = u, m1.ub, m0.ub, m1.lb, m0.lb,
      mte.ub, mte.lb, m0.dec, m0.inc, m1.dec, m1.inc, mte.dec, mte.inc, ivlike,
      components, subset, propensity, link = "logit", treat,
      lpsolver = NULL, criterion.tol = 0, initgrid.nx = 20,
      initgrid.nu = 20, audit.nx = 2500, audit.nu = 25,
      audit.add = 100, audit.max = 25, point = FALSE,
      point.eyeweight = FALSE, bootstraps = 0, bootstraps.m,
      bootstraps.replace = TRUE, levels = c(0.99, 0.95, 0.9),
      ci.type = "backward", specification.test = TRUE, noisy = TRUE,
      smallreturnlist = FALSE, seed = 12345, debug = FALSE)
data.frame or data.table used to estimate the treatment effects.
character, target parameter to be estimated. The function currently allows for ATE ('ate'), ATT ('att'), ATU ('atu'), LATE ('late'), and generalized LATE ('genlate').
a named vector, or a list, declaring the baseline set of values of Z used to define the LATE. The name associated with each value should be the name of the corresponding variable.
a named vector, or a list, declaring the comparison set of values of Z used to define the LATE. The name associated with each value should be the name of the corresponding variable.
a named vector, or a list, declaring the values at which to condition on. The name associated with each value should be the name of the corresponding variable.
lower bound value of unobservable u for estimating the generalized LATE.
upper bound value of unobservable u for estimating the generalized LATE.
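For intuition, the generalized LATE over the interval from genlate.lb to genlate.ub can be read as the average of the MTE over that range of u. The sketch below assumes a made-up MTE curve, mte_example, which is not produced by the package, and it ignores covariates.

```r
# Hypothetical MTE curve, for illustration only (not an ivmte object).
mte_example <- function(u) 2 - 3 * u

# Interval of the unobservable u that defines the generalized LATE.
genlate_lb <- 0.2
genlate_ub <- 0.6

# Average the MTE over [genlate_lb, genlate_ub] by numerical integration.
genlate <- integrate(mte_example, genlate_lb, genlate_ub)$value /
  (genlate_ub - genlate_lb)
genlate
```

Because the example curve is linear, the result equals the MTE evaluated at the midpoint of the interval.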
user-defined weight function for the control group defining the target parameter. A list of functions can be submitted if the weighting function is in fact a spline. The arguments of the function should be variable names in data. If the weight is constant across all observations, then the user can submit the value of the weight instead of a function.
user-defined weight function for the treated group defining the target parameter. See target.weight0 for details.
user-defined set of functions defining the knots associated with spline weights for the control group. The arguments of the function should consist only of variable names in data. If the knots are constant across all observations, then the user can submit the vector of knots instead of a function.
user-defined set of functions defining the knots associated with spline weights for the treated group. See target.knots0 for details.
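One way to picture spline weights with knots is as a step function of u. The sketch below assumes the weights are piecewise constant between knots, which this page does not state explicitly; the names knots, weight_values and weight_at are hypothetical and are not arguments of ivmte.

```r
# Hypothetical knots and piecewise-constant weight values (assumptions).
knots <- c(0.2, 0.6)
weight_values <- c(0, 2.5, 0)  # below 0.2, on [0.2, 0.6), at or above 0.6

# Evaluate the step-function weight at a vector of u values.
weight_at <- function(u) weight_values[findInterval(u, knots) + 1]

weight_at(c(0.1, 0.4, 0.7))
```

With these values the weight integrates to one over [0, 1], since 2.5 * (0.6 - 0.2) = 1.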
one-sided formula for the marginal treatment response function for the control group. Splines may also be incorporated using the expression uSpline, e.g. uSpline(degree = 2, knots = c(0.4, 0.8), intercept = TRUE). The intercept argument may be omitted, and is set to TRUE by default.
one-sided formula for the marginal treatment response function for the treated group. Splines can also be incorporated using the expression "uSplines(degree, knots, intercept)". The intercept argument may be omitted, and is set to TRUE by default.
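To see which terms a one-sided MTR formula implies, it can help to expand it with base R. This is only an illustration using model.matrix; how ivmte itself parses m0 and m1 is not described on this page, and u_grid is a hypothetical set of evaluation points.

```r
# Hypothetical evaluation points for the unobservable u.
u_grid <- data.frame(u = c(0, 0.25, 0.5, 0.75, 1))

# A quadratic MTR specification, written as a one-sided formula.
mtr_formula <- ~ u + I(u^2)

# Expand the formula into its basis terms (intercept, u, u squared).
basis <- model.matrix(mtr_formula, data = u_grid)
colnames(basis)
```

Each column corresponds to one coefficient of the MTR.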
variable name for the unobservable used in declaring the MTRs. The name can be provided with or without quotation marks.
numeric value for upper bound on MTR for the treated group. By default, this will be set to the largest value of the observed outcome in the estimation sample.
numeric value for upper bound on MTR for the control group. By default, this will be set to the largest value of the observed outcome in the estimation sample.
numeric value for lower bound on MTR for the treated group. By default, this will be set to the smallest value of the observed outcome in the estimation sample.
numeric value for lower bound on MTR for the control group. By default, this will be set to the smallest value of the observed outcome in the estimation sample.
numeric value for upper bound on treatment effect parameter of interest.
numeric value for lower bound on treatment effect parameter of interest.
logical, set to FALSE by default. Set equal to TRUE if the MTR for the control group should be weakly monotone decreasing.
logical, set to FALSE by default. Set equal to TRUE if the MTR for the control group should be weakly monotone increasing.
logical, set to FALSE by default. Set equal to TRUE if the MTR for the treated group should be weakly monotone decreasing.
logical, set to FALSE by default. Set equal to TRUE if the MTR for the treated group should be weakly monotone increasing.
logical, set to FALSE by default. Set equal to TRUE if the MTE should be weakly monotone decreasing.
logical, set to FALSE by default. Set equal to TRUE if the MTE should be weakly monotone increasing.
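The monotonicity restrictions above can be pictured as a check on a grid of u values. The sketch below uses a made-up control-group MTR, m0_example, with arbitrary coefficients; it illustrates what "weakly monotone decreasing" means and is not the package's internal check.

```r
# Hypothetical MTR for the control group (coefficients are made up).
m0_example <- function(u) 1 - 0.5 * u - 0.2 * u^2

# Evaluate the MTR on an evenly spaced grid of u in [0, 1].
u_grid <- seq(0, 1, length.out = 21)
m0_values <- m0_example(u_grid)

# Weakly decreasing: no successive difference is positive.
is_weakly_decreasing <- all(diff(m0_values) <= 0)
# Weakly increasing: no successive difference is negative.
is_weakly_increasing <- all(diff(m0_values) >= 0)
```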
formula or vector of formulas specifying the regressions for the IV-like estimands. The coefficients used to define the constraints determining the treatment effect bounds (alternatively, the moments determining the treatment effect point estimate) can be selected in the argument components.
a list of vectors of the terms in the regression specifications to include in the set of IV-like estimands. No terms should be in quotes. To select the intercept term, include the name intercept. If the factorized counterpart of a variable is included in the IV-like specifications, e.g. factor(x) where x = 1, 2, 3, the user can select the coefficients for specific factors by declaring the components factor(x)-1, factor(x)-2, factor(x)-3. See l for how to input the argument. If no components for an IV specification are given, then all coefficients from that IV specification will be used to define constraints in the partially identified case, or to define moments in the point identified case.
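As a rough illustration of what an IV-like estimand and a selected component are, the sketch below simulates data and computes the coefficient on d from an OLS specification and from a simple IV specification. The data, the variable names and the Wald-type ratio are assumptions made for this example; they are not the package's internal code.

```r
# Simulated data with outcome ey, treatment d and instrument z (all made up).
set.seed(1)
n <- 500
z <- sample(1:4, n, replace = TRUE)
d <- rbinom(n, 1, plogis(-1 + 0.5 * z))
ey <- 1 + 2 * d + rnorm(n)
dat <- data.frame(ey, d, z)

# OLS specification ey ~ d: keep only the component d.
ols_fit <- lm(ey ~ d, data = dat)
ols_d <- coef(ols_fit)[["d"]]

# Simple IV specification ey ~ d | z with one instrument: the coefficient
# on d is the ratio of the two covariances with z.
iv_d <- cov(dat$ey, dat$z) / cov(dat$d, dat$z)

c(ols = ols_d, iv = iv_d)
```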
a single subset condition or list of subset conditions corresponding to each regression specified in ivlike. The input must be logical. See l for how to input the argument. If the user wishes to select specific rows, construct a binary variable in the data set, and set the condition to use only those observations for which the binary variable is 1, e.g. the binary variable is use, and the subset condition is use == 1.
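The row-selection approach described above can be sketched in base R as follows. The data frame dat and its values are hypothetical; only the variable name use and the condition use == 1 come from the text.

```r
# Hypothetical data set with an instrument z.
dat <- data.frame(z = c(1, 2, 3, 4, 2, 1))

# Binary indicator marking the rows to keep.
dat$use <- as.integer(dat$z %in% c(2, 4))

# The logical condition use == 1 selects exactly those rows.
dat_used <- dat[dat$use == 1, ]
nrow(dat_used)
```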
formula or variable name corresponding to propensity to take up treatment. If a formula is declared, then the function estimates the propensity score according to the formula and link specified in link. If a variable name is declared, then the corresponding column in the data is taken as the vector of propensity scores. A variable name can be passed either as a string (e.g. propensity = 'p'), a variable (e.g. propensity = p), or a one-sided formula (e.g. propensity = ~p).
character, name of link function to estimate propensity score. Can be chosen from 'linear', 'probit', or 'logit'. Default is set to 'logit'.
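For readers who want to precompute propensity scores and pass them as a column, a logit propensity model can be fit with base R as sketched below. Whether ivmte uses glm internally is an assumption not confirmed here; the simulated data and the column name p are hypothetical.

```r
# Simulated treatment d and instrument z (made up for illustration).
set.seed(2)
n <- 400
z <- sample(1:4, n, replace = TRUE)
d <- rbinom(n, 1, plogis(-1 + 0.5 * z))
dat <- data.frame(d, z)

# Logit propensity score model, analogous to propensity = d ~ z with
# link = "logit".
ps_fit <- glm(d ~ z, family = binomial(link = "logit"), data = dat)

# Store the fitted probabilities as a column of propensity scores.
dat$p <- fitted(ps_fit)
range(dat$p)
```

The column p could then be supplied by name, as described for the propensity argument.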
variable name for treatment indicator. The name can be provided with or without quotation marks.
character, name of the linear programming package in R used to obtain the bounds on the treatment effect. The function supports 'gurobi', 'cplexapi', and 'lpsolveapi'.
tolerance for violation of observational equivalence, set to 0 by default. Statistical noise may prohibit the theoretical LP problem from being feasible. That is, there may not exist a set of coefficients on the MTR that are observationally equivalent with regard to the IV-like regression coefficients. The function therefore first estimates the minimum violation of observational equivalence. This is reported in the output under the name 'minimum criterion'. The constraints in the LP problem pertaining to observational equivalence are then relaxed by the amount minimum criterion * (1 + criterion.tol). Set criterion.tol to a value greater than 0 to allow for more conservative bounds.
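The relaxation amount described above is simple arithmetic. In the sketch below the minimum criterion value is made up; only the formula comes from the text.

```r
# Hypothetical minimum violation of observational equivalence.
minimum_criterion <- 0.02

# Tolerance chosen by the user (0 by default).
criterion_tol <- 0.5

# Amount by which the observational equivalence constraints are relaxed.
relaxation <- minimum_criterion * (1 + criterion_tol)
relaxation
```

With criterion_tol equal to 0, the relaxation equals the minimum criterion itself.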
integer determining the number of points of the covariates used to form the initial constraint grid for imposing shape restrictions on the MTRs.
integer determining the number of evenly spread points in the interval [0, 1] of the unobservable u used to form the initial constraint grid for imposing shape restrictions on the MTRs.
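A constraint grid of this kind can be pictured as the cross product of covariate points and evenly spaced u points. The sketch below assumes the endpoints 0 and 1 are included and uses a hypothetical covariate x with three values; the package's exact grid construction is not described on this page.

```r
# Number of u points, matching the default initgrid.nu = 20.
initgrid_nu <- 20

# Evenly spread points in [0, 1] (endpoints included by assumption).
u_points <- seq(0, 1, length.out = initgrid_nu)

# Hypothetical covariate values standing in for the sampled x points.
x_points <- c(1, 2, 3)

# Every combination of covariate value and u value.
constraint_grid <- expand.grid(x = x_points, u = u_points)
nrow(constraint_grid)
```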
integer determining the number of points on the covariates space to audit in each iteration of the audit procedure.
integer determining the number of points in the interval [0, 1], corresponding to space of unobservable u, to audit in each iteration of the audit procedure.
maximum number of points to add to the initial constraint grid for imposing each kind of shape constraint. For example, if there are 5 different kinds of shape constraints, there can be at most audit.add * 5 additional points added to the constraint grid.
maximum number of iterations in the audit procedure.
boolean, default set to FALSE. Set to TRUE if it is believed that the treatment effects are point identified. If set to TRUE, then a two-step GMM procedure is implemented to estimate the treatment effects. Shape constraints on the MTRs will be ignored under point identification.
boolean, default set to FALSE. Set to TRUE if the GMM point estimate should use the identity weighting matrix (i.e. one-step GMM).
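The difference between one-step and two-step GMM can be sketched with a generic linear moment system. Everything below (the matrix A, the vector b, and the covariance omega) is made up for illustration and does not reproduce the moments ivmte constructs.

```r
# Hypothetical linear moment conditions b = A %*% theta (made-up numbers).
A <- matrix(c(1, 0,
              1, 1,
              1, 2), ncol = 2, byrow = TRUE)
theta_true <- c(0.5, -1)
b <- as.vector(A %*% theta_true)

# One-step GMM: identity weighting matrix, i.e. least squares on the moments.
theta_onestep <- solve(crossprod(A), crossprod(A, b))

# Two-step GMM: reweight by the inverse of a moment covariance matrix.
# In practice omega is estimated from the first step; here it is assumed.
omega <- diag(c(1, 2, 4))
W <- solve(omega)
theta_twostep <- solve(t(A) %*% W %*% A, t(A) %*% W %*% b)

cbind(onestep = as.vector(theta_onestep), twostep = as.vector(theta_twostep))
```

Because the example moments hold exactly, both weightings recover the same coefficients; with noisy moments the two estimates generally differ.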
integer, default set to 0.
integer, default set to size of data set. Determines the size of the subsample drawn from the original data set when performing inference via the bootstrap. This option applies only to the case of constructing confidence intervals for treatment effect bounds, i.e. it does not apply when point = TRUE.
boolean, default set to TRUE. This determines whether the resampling procedure used for inference will sample with replacement.
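The roles of the bootstrap subsample size and the replacement flag can be sketched with base R resampling. The sample size and the variable names below are hypothetical; the package's own resampling code is not shown on this page.

```r
set.seed(3)

# Hypothetical number of rows in the original data set.
n <- 100

# Size of each bootstrap subsample (defaults to n in the text above).
bootstraps_m <- 80

# Row indices for one bootstrap draw, sampled with replacement.
idx <- sample.int(n, size = bootstraps_m, replace = TRUE)
length(idx)
```

Setting replace = FALSE instead yields a subsample of distinct rows, which requires bootstraps_m to be no larger than n.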
vector of real numbers between 0 and 1. Values correspond to the level of the confidence intervals constructed via bootstrap.
character, default set to 'backward' (as shown in the usage above). Set to 'forward' to construct the forward confidence interval for the treatment effect bound. Set to 'backward' to construct the backward confidence interval for the treatment effect bound. Set to 'both' to construct both types of confidence intervals.
boolean, default set to TRUE. Function performs a specification test for the partially identified case when bootstraps > 0.
boolean, default set to TRUE. If TRUE, then messages are provided throughout the estimation procedure. Set to FALSE to suppress all messages, e.g. when performing the bootstrap.
boolean, default set to FALSE. Set to TRUE to exclude large intermediary components (i.e. propensity score model, LP model, bootstrap iterations) from being included in the return list.
integer, the seed that determines the random grid in the audit procedure.
boolean, indicates whether or not the function should provide output when obtaining bounds. The option is only applied when lpsolver = 'gurobi'. The output provided is the same as what the Gurobi API would send to the console.
Returns a list of results from throughout the estimation procedure. This includes all IV-like estimands; the propensity score model; bounds on the treatment effect; the estimated expectations of each term in the MTRs; the components and results of the LP problem.
The return list includes the following objects.
a list of all the coefficient estimates and weights corresponding to each element in the S-set.
a list containing the estimate of the weighted means for each component in the MTRs. The weights are determined by the target parameter declared in target, or the weights defined by target.weight1, target.knots1, target.weight0, target.knots0.
a list containing the target weights used to estimate gstar.
a list containing the coefficients on the treated and control group MTRs.
the propensity score model. If a variable is passed to the propensity argument when calling ivmte, then the returned object is a list containing the name of the variable given by the user, and the values of that variable used in estimation.
a vector with the estimated lower and upper bounds of the target treatment effect.
a list containing the LP model, and the full output from solving the LP problem.
the audit grid on which all shape constraints were satisfied.
the number of audits required until there were no more violations.
the minimum criterion.
a list including the specifications of each spline declared in each MTR.
# NOT RUN {
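library(ivmte)

# Example data from the package's internal generator.
dtm <- ivmte:::gendistMosquito()

# IV-like specifications: three IV regressions and one OLS regression.
ivlikespecs <- c(ey ~ d | z,
                 ey ~ d | factor(z),
                 ey ~ d,
                 ey ~ d | factor(z))

# Components: use the coefficient on d from each specification.
jvec <- l(d, d, d, d)

# Subset conditions, one per specification; only the last one is restricted.
svec <- l(, , , z %in% c(2, 4))

# Bounds on the ATT with quadratic, weakly decreasing MTRs.
ivmte(ivlike = ivlikespecs,
      data = dtm,
      components = jvec,
      propensity = d ~ z,
      subset = svec,
      m0 = ~ u + I(u ^ 2),
      m1 = ~ u + I(u ^ 2),
      uname = u,
      target = "att",
      m0.dec = TRUE,
      m1.dec = TRUE,
      bootstraps = 0,
      lpsolver = "lpSolveAPI")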
# }