This function estimates the treatment effect parameters, following the procedure described in Mogstad, Santos and Torgovitsky (2018) (doi: 10.3982/ECTA15463). A detailed description of the package and its features can be found in Shea and Torgovitsky (2021). However, this is not the main function of the package. See ivmte for the main function. For examples of how to use the package, see the vignette, which is available on the package's GitHub page.
ivmteEstimate(
data,
target,
late.Z,
late.from,
late.to,
late.X,
eval.X,
genlate.lb,
genlate.ub,
target.weight0,
target.weight1,
target.knots0 = NULL,
target.knots1 = NULL,
m0,
m1,
uname = u,
m1.ub,
m0.ub,
m1.lb,
m0.lb,
mte.ub,
mte.lb,
m0.dec,
m0.inc,
m1.dec,
m1.inc,
mte.dec,
mte.inc,
equal.coef,
ivlike,
components,
subset,
propensity,
link = "logit",
treat,
solver,
solver.options,
solver.presolve,
solver.options.criterion,
solver.options.bounds,
criterion.tol = 0.01,
initgrid.nx = 20,
initgrid.nu = 20,
audit.nx = 2500,
audit.nu = 25,
audit.add = 100,
audit.max = 25,
audit.tol,
audit.grid = NULL,
rescale = TRUE,
point = FALSE,
point.eyeweight = FALSE,
point.center = NULL,
point.redundant = NULL,
bootstrap = FALSE,
count.moments = TRUE,
orig.sset = NULL,
orig.criterion = NULL,
vars_y,
vars_mtr,
terms_mtr0,
terms_mtr1,
vars_data,
splinesobj,
splinesobj.equal,
noisy = TRUE,
smallreturnlist = FALSE,
debug = FALSE,
environments
)
data.frame or data.table used to estimate the treatment effects.
character, target parameter to be estimated. The function allows for ATE ('ate'), ATT ('att'), ATU ('atu'), LATE ('late'), and generalized LATE ('genlate').
vector of variable names used to define the LATE.
baseline set of values of Z used to define the LATE.
comparison set of values of Z used to define the LATE.
vector of variable names of covariates to condition on when defining the LATE.
numeric vector of the values at which to condition the variables in late.X when estimating the LATE.
lower bound value of unobservable u for estimating the generalized LATE.
upper bound value of unobservable u for estimating the generalized LATE.
user-defined weight function for the control group defining the target parameter. A list of functions can be submitted if the weighting function is in fact a spline. The arguments of the function should be variable names in data. If the weight is constant across all observations, then the user can submit the value of the weight instead of a function.
user-defined weight function for the treated group defining the target parameter. See target.weight0 for details.
user-defined set of functions defining the knots associated with spline weights for the control group. The arguments of the function should consist only of variable names in data. If the knots are constant across all observations, then the user can submit the vector of knots instead of a function.
user-defined set of functions defining the knots associated with spline weights for the treated group. See target.knots0 for details.
one-sided formula for the marginal treatment response function for the control group. Splines may also be incorporated using the expression uSpline, e.g. uSpline(degree = 2, knots = c(0.4, 0.8), intercept = TRUE). The intercept argument may be omitted, and is set to TRUE by default.
one-sided formula for the marginal treatment response function for the treated group. See m0 for details.
variable name for the unobservable used in declaring the MTRs. The name can be provided with or without quotation marks.
numeric value for upper bound on MTR for the treated group. By default, this will be set to the largest value of the observed outcome in the estimation sample.
numeric value for upper bound on MTR for the control group. By default, this will be set to the largest value of the observed outcome in the estimation sample.
numeric value for lower bound on MTR for the treated group. By default, this will be set to the smallest value of the observed outcome in the estimation sample.
numeric value for lower bound on MTR for the control group. By default, this will be set to the smallest value of the observed outcome in the estimation sample.
numeric value for upper bound on treatment effect parameter of interest.
numeric value for lower bound on treatment effect parameter of interest.
logical, set to FALSE by default. Set equal to TRUE if the MTR for the control group should be weakly monotone decreasing.
logical, set to FALSE by default. Set equal to TRUE if the MTR for the control group should be weakly monotone increasing.
logical, set to FALSE by default. Set equal to TRUE if the MTR for the treated group should be weakly monotone decreasing.
logical, set to FALSE by default. Set equal to TRUE if the MTR for the treated group should be weakly monotone increasing.
logical, set to FALSE by default. Set equal to TRUE if the MTE should be weakly monotone decreasing.
logical, set to FALSE by default. Set equal to TRUE if the MTE should be weakly monotone increasing.
one-sided formula to indicate which terms in m0 and m1 should be constrained to have the same coefficients. These terms therefore have no effect on the MTE.
formula or vector of formulas specifying the regressions for the IV-like estimands. The coefficients used to define the constraints determining the treatment effect bounds (alternatively, the moments determining the treatment effect point estimate) can be selected in the argument components. If no argument is passed, then a linear regression will be performed to estimate the MTR coefficients.
a list of vectors of the terms in the regression specifications to include in the set of IV-like estimands. No terms should be in quotes. To select the intercept term, include the name intercept. If the factorized counterpart of a variable is included in the IV-like specifications, e.g. factor(x) where x = 1, 2, 3, the user can select the coefficients for specific factors by declaring the components factor(x)-1, factor(x)-2, factor(x)-3. See l for how to input the argument. If no components for an IV specification are given, then all coefficients from that IV specification will be used to define constraints in the partially identified case, or to define moments in the point identified case.
a single subset condition or list of subset conditions corresponding to each regression specified in ivlike. The input must be logical. See l for how to input the argument. To select specific rows, construct a binary variable in the data set and set the condition to use only those observations for which the binary variable is 1, e.g. if the binary variable is use, the subset condition is use == 1.
formula or variable name corresponding to propensity to take up treatment. If a formula is declared, then the function estimates the propensity score according to the formula and link specified in link. If a variable name is declared, then the corresponding column in the data is taken as the vector of propensity scores. A variable name can be passed either as a string (e.g. propensity = 'p'), a variable (e.g. propensity = p), or a one-sided formula (e.g. propensity = ~p).
character, name of link function to estimate propensity score. Can be chosen from 'linear', 'probit', or 'logit'. Default is set to 'logit'. The link should be provided with quotation marks.
variable name for treatment indicator. The name can be provided with or without quotation marks.
character, name of the programming package in R used to obtain the bounds on the treatment effect. The function supports 'gurobi', 'cplexapi', 'rmosek', and 'lpsolveapi'. The name of the solver should be provided with quotation marks.
list, each item of the list should correspond to an option specific to the solver selected.
boolean, default set to TRUE. Set this parameter to FALSE if presolve should be turned off for the LP/QCQP problems.
list, each item of the list should correspond to an option specific to the solver selected. These options are specific for finding the minimum criterion.
list, each item of the list should correspond to an option specific to the solver selected. These options are specific for finding the bounds.
tolerance for the criterion function, set to 0.01 by default in this function (see the usage above). The criterion measures how well the IV-like moments/conditional means are matched using the l1-norm. Statistical noise may prevent the theoretical LP/QCQP problem from being feasible. That is, there may not exist a set of MTR coefficients that are able to match all the specified moments. The function thus first estimates the minimum criterion, which is reported in the output under the name 'minimum criterion', with a criterion of 0 meaning that all moments were able to be matched. The function then relaxes the constraints by tolerating a criterion up to minimum criterion * (1 + criterion.tol). Set criterion.tol to a value greater than 0 to allow for more conservative bounds.
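As a small illustration of the relaxation described above, the following sketch computes the largest criterion value that would be tolerated for a given minimum criterion. The numeric values are hypothetical and are not output from the package.

```r
# Hypothetical values for illustration only.
min.criterion <- 0.25   # minimum criterion found in the first step
criterion.tol <- 0.01   # tolerance passed to the function

# Largest criterion value tolerated when solving for the bounds.
criterion.max <- min.criterion * (1 + criterion.tol)
criterion.max
```

A larger criterion.tol enlarges the set of admissible MTR coefficients, which is why the resulting bounds are more conservative.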
integer determining the number of points of the covariates used to form the initial constraint grid for imposing shape restrictions on the MTRs.
integer determining the number of points in the open interval (0, 1) drawn from a Halton sequence. The end points 0 and 1 are additionally included. These points are always a subset of the points defining the audit grid (see audit.nu). These points are used to form the initial constraint grid for imposing shape restrictions on the u components of the MTRs.
integer determining the number of points on the covariates space to audit in each iteration of the audit procedure.
integer determining the number of points in the open interval (0, 1) drawn from a Halton sequence. The end points 0 and 1 are additionally included. These points are used to audit whether the shape restrictions on the u components of the MTRs are satisfied. The initial grid used to impose the shape constraints in the LP/QCQP problem is constructed from a subset of these points.
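To make the grid construction concrete, the sketch below builds a one-dimensional Halton sequence in base R and forms an audit grid and an initial grid from it. The choice of base 2 and the rule of taking the first initgrid.nu points as the initial grid are assumptions made for illustration. The package's own selection rule is not described here.

```r
# Radical inverse of the integers 1..n in the given base, i.e. a
# one-dimensional Halton sequence. The base of 2 is an assumption.
halton <- function(n, base = 2) {
  vapply(seq_len(n), function(i) {
    f <- 1
    r <- 0
    while (i > 0) {
      f <- f / base
      r <- r + f * (i %% base)
      i <- i %/% base
    }
    r
  }, numeric(1))
}

audit.nu <- 25
initgrid.nu <- 20

# Audit grid: interior Halton points plus the end points 0 and 1.
audit.u <- sort(c(0, halton(audit.nu), 1))

# Initial grid: here simply the first initgrid.nu Halton points plus the
# end points, so it is a subset of the audit grid by construction.
init.u <- sort(c(0, halton(initgrid.nu), 1))

all(init.u %in% audit.u)
```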
maximum number of points to add to the initial constraint grid for imposing each kind of shape constraint. For example, if there are 5 different kinds of shape constraints, there can be at most audit.add * 5 additional points added to the constraint grid.
maximum number of iterations in the audit procedure.
feasibility tolerance when performing the audit. By default, this is set to 1e-06, which is equal to the default feasibility tolerances of Gurobi (solver = "gurobi"), CPLEX (solver = "cplexapi"), and Rmosek (solver = "rmosek"). This parameter should only be changed if the feasibility tolerance of the solver is changed, or if numerical issues result in discrepancies between the solver's feasibility check and the audit.
list, contains the A matrix used in the audit for the original sample, as well as the RHS vector used in the audit from the original sample.
boolean, set to TRUE by default. This rescales the MTR components to improve stability in the LP/QCQP optimization.
boolean. Set to TRUE if it is believed that the treatment effects are point identified. If set to TRUE and IV-like formulas are passed, then a two-step GMM procedure is implemented to estimate the treatment effects. Shape constraints on the MTRs will be ignored under point identification. If set to TRUE and the regression-based criterion is used instead, then OLS will be used to estimate the MTR coefficients used to estimate the treatment effect. If not declared, then the function will determine whether or not the target parameter is point identified.
boolean, default set to FALSE. Set to TRUE if the GMM point estimate should use the identity weighting matrix (i.e. one-step GMM).
numeric, a vector of GMM moment conditions evaluated at a solution. When bootstrapping, the moment conditions from the original sample can be passed through this argument to recenter the bootstrap distribution of the J-statistic.
vector of integers indicating which components in the S-set are redundant.
boolean, indicates whether the estimate is for the bootstrap.
boolean, indicates whether the number of linearly independent moments should be counted.
list, only used for bootstraps. The list contains the gamma moments for each element in the S-set, as well as the IV-like coefficients.
numeric, only used for bootstraps. The scalar corresponds to the minimum observational equivalence criterion from the original sample.
character, variable name of observed outcome variable.
character, vector of variables entering into m0 and m1.
character, vector of terms entering into m0.
character, vector of terms entering into m1.
character, vector of variables that can be found in the data.
list of spline components in the MTRs for treated and control groups. Spline terms are extracted using removeSplines. This object is intended to be a dictionary of splines, containing the original calls of each spline in the MTRs, their specifications, and the index used for naming each basis spline.
list of spline components in the MTRs for treated and control groups. The structure of splinesobj.equal is the same as splinesobj, except the splines are restricted to those whose MTR coefficients should be constrained to be equal across treatment groups.
boolean, default set to TRUE. If TRUE, then messages are provided throughout the estimation procedure. Set to FALSE to suppress all messages, e.g. when performing the bootstrap.
boolean, default set to FALSE. Set to TRUE to exclude large intermediary components (i.e. propensity score model, LP/QCQP model, bootstrap iterations) from being included in the return list.
boolean, indicates whether or not the function should provide output when obtaining bounds. The option is only applied when solver = 'gurobi' or solver = 'rmosek'. The output provided is the same as what the Gurobi API would send to the console.
a list containing the environments of the MTR formulas, the IV-like formulas, and the propensity score formulas. If a formula is not provided, and thus no environment can be found, then the parent.frame() is assigned by default.
Returns a list of results from throughout the estimation procedure. This includes all IV-like estimands; the propensity score model; bounds on the treatment effect; the estimated expectations of each term in the MTRs; the components and results of the LP/QCQP problem.
The treatment effect parameters the user can choose from are the ATE, ATT, ATU, LATE, and generalized LATE. The user is required to provide a polynomial expression for the marginal treatment responses (MTR), as well as a set of regressions.
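The following sketch shows what such polynomial MTR expressions and a regression specification can look like as R formula objects. The variable names y, d, z and x are hypothetical, and the particular polynomial terms are chosen only for illustration. Nothing here calls the package, since the formulas are only constructed and inspected.

```r
# Hypothetical variable names: y (outcome), d (treatment), z (instrument),
# x (covariate); u is the unobservable named by uname.
m0 <- ~ u + x                # MTR for the control group
m1 <- ~ u + I(u^2) + x       # MTR for the treated group
ivlike <- y ~ d + z + d:z    # one IV-like regression specification

# One-sided formulas have length 2; two-sided formulas have length 3.
length(m0)
length(ivlike)

# Variables entering the treated-group MTR.
all.vars(m1)
```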
There are two approaches to estimating the treatment effect parameters. The first approach restricts the set of MTR coefficients on each term of the MTRs to be consistent with the regression estimates from the specifications passed through ivlike. The bounds on the treatment effect parameter correspond to finding coefficients on the MTRs that maximize their average difference. If the model is point identified, then GMM is used for estimation. Otherwise, the function solves an LP problem. The second approach restricts the set of MTR coefficients to fit the conditional mean of the outcome variable. If the model is point identified, then constrained least squares is used for estimation. Otherwise, the function solves a QCQP.
The estimation procedure relies on the propensity to take up treatment. The propensity scores can either be estimated as part of the estimation procedure, or the user can specify a variable in the data set already containing the propensity scores.
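As a rough sketch of the first option, the code below estimates propensity scores with a logit model in base R on simulated data. The data, the variable names d and z, and the use of glm are assumptions for illustration. How the package fits the model internally is not shown here.

```r
set.seed(1)
n <- 500
z <- rbinom(n, 1, 0.5)                     # hypothetical binary instrument
d <- rbinom(n, 1, plogis(-0.5 + 1.0 * z))  # hypothetical treatment indicator
dat <- data.frame(d = d, z = z)

# Logit model of treatment take-up on the instrument.
fit <- glm(d ~ z, family = binomial(link = "logit"), data = dat)

# Fitted probabilities serve as the propensity scores.
dat$p <- predict(fit, type = "response")
range(dat$p)
```

A column such as dat$p is also the kind of pre-computed propensity score that could be supplied by name instead of a formula.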
Constraints on the shape of the MTRs and marginal treatment effects (MTE) can be imposed by the user. Specifically, bounds and monotonicity restrictions are permitted. These constraints are first enforced over a subset of points in the data. An iterative audit procedure is then performed to ensure the constraints hold more generally.
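The check-and-expand step of such an audit can be sketched as follows. This is a toy example under stated assumptions: the candidate MTR, the bounds, and both grids are hypothetical, and the sketch does not re-solve any LP/QCQP problem. It only shows how a bound that holds on a coarse grid can fail on a finer one, and how the worst violating points would be added to the constraint grid.

```r
# Hypothetical candidate MTR in u and hypothetical upper bound.
mtr <- function(u) 0.2 + 3.4 * u - 3.4 * u^2
m.ub <- 1
audit.add <- 3       # max points added for this kind of constraint
audit.tol <- 1e-06   # feasibility tolerance for the audit

init.u  <- c(0, 0.25, 0.75, 1)     # coarse grid where the bound was imposed
audit.u <- seq(0, 1, by = 0.05)    # finer grid used for the audit

# Amount by which the upper bound is violated at each point.
violation <- function(u) pmax(mtr(u) - m.ub, 0)

# The bound holds on the coarse grid ...
any(violation(init.u) > audit.tol)

# ... but fails at some audit points; keep the worst audit.add of them.
viol <- violation(audit.u)
is.bad <- viol > audit.tol
bad <- audit.u[is.bad]
new.u <- head(bad[order(viol[is.bad], decreasing = TRUE)], audit.add)

# Expanded constraint grid for the next iteration.
init.u <- sort(unique(c(init.u, new.u)))
init.u
```

In a full procedure the optimization problem would be solved again on the expanded grid, and the audit repeated until no violations remain or the iteration limit is reached.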