
CausalFX (version 1.0.1)

iv: Bayesian Analysis of Binary Instrumental Variables

Description

Performs a Bayesian instrumental variable analysis for binary data. The analysis assumes the covariate set is small: a set of size greater than 20 will raise a flag and require explicit permission from the user by setting force_use to TRUE.

Usage

iv(problem, w, Z, prior_table, M, verbose = FALSE, reject_level = 0.9, force_use = FALSE)

Arguments

problem
a CausalFX problem instance.
w
index of instrumental variable within the data.
Z
array of indices for covariate adjustment.
prior_table
Dirichlet hyperparameter for contingency tables.
M
number of Monte Carlo samples to compute posterior distributions.
verbose
print relevant messages.
reject_level
if the first iteration of rejection sampling rejects a proportion of samples larger than this threshold, reject the model and return no bounds.
force_use
if TRUE, ignore any warnings about the size of Z.

Value

A list containing two fields:
bound
the point estimates of the lower and upper bounds.
bound_post
samples from the posterior distribution over bounds.
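For instance, assuming sol holds the result of a successful call (the matrix layout of bound_post, one column per bound, is an assumption for illustration, not documented behavior):

sol <- iv(problem, w, Z, prior_table = 10, M = 1000)
sol$bound                     # point estimates of the lower and upper bounds
## 95% posterior credible intervals for each bound (layout assumed)
apply(sol$bound_post, 2, quantile, probs = c(0.025, 0.975))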

Details

Given a joint distribution, this function finds a lower bound and an upper bound on the average causal effect (ACE) of treatment $X$ on outcome $Y$, adjusting for covariate set $Z$. The joint distribution is estimated by assigning a prior to the joint contingency table of w, Z and the treatment/outcome pair indexed in problem, and ACE bounds are inferred from the posterior over the contingency table implied by the instrumental variable assumption.

The prior is proportional to a Dirichlet distribution with an effective sample size of prior_table. It assigns probability zero to models that do not satisfy the constraints of the (conditional) instrumental variable, as described by Balke and Pearl (1997, Journal of the American Statistical Association, vol. 92, 1171--1176). Hence, the prior is not an exact Dirichlet distribution, but only proportional to one.

Posterior samples for the lower and upper bounds are generated by rejection sampling, using the unconstrained model as a proposal. If the model is a bad fit to the data, this can take considerable computing time. Setting reject_level to a value smaller than 1 may stop the rejection sampler early and reject the model, returning no bounds.
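As a rough illustration of the rejection-sampling scheme just described (a minimal sketch, not the package's internal code; satisfies_iv_constraints is a hypothetical placeholder for the Balke--Pearl constraint check):

## Draw contingency tables from the unconstrained Dirichlet posterior and
## keep those compatible with the IV constraints.
M <- 1000
accepted <- 0
for (i in seq_len(M)) {
  draw <- rgamma(16, shape = 1)        # unnormalized Dirichlet draw over a binary table
  draw <- draw / sum(draw)             # a sampled contingency table
  if (satisfies_iv_constraints(draw))  # hypothetical placeholder function
    accepted <- accepted + 1
}
## If the rejected proportion in the first batch exceeds reject_level
## (default 0.9), iv() rejects the model and returns no bounds.
if (1 - accepted / M > 0.9) stop("model rejected: no bounds returned")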

References

Balke, A. and Pearl, J. (1997). Bounds on treatment effects from studies with imperfect compliance. Journal of the American Statistical Association, 92, 1171--1176. http://ftp.cs.ucla.edu/pub/stat_ser/r199-jasa.pdf

Examples

## Generate synthetic problems until a (conditional) instrumental variable can be found

while (TRUE) {
  problem <- simulateWitnessModel(p = 4, q = 4, par_max = 3, M = 1000)
  s <- covsearch(problem, pop_solve = TRUE)
  if (length(s$witness) > 0) {
    w <- s$witness[1]
    Z <- s$Z[[1]]
    break
  }
}

## Calculate true effect for evaluation purposes
sol_pop <- covsearch(problem, pop_solve = TRUE)
effect_pop <- synthetizeCausalEffect(problem)
cat(sprintf("ACE (true) = %1.2f\n", effect_pop$effect_real))

## Binary IV bounds
sol_iv <- iv(problem, w, Z, prior_table = 10, M = 1000)
summary(sol_iv)
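A possible follow-up, comparing the estimated bounds against the true effect (this assumes sol_iv$bound stores the lower and upper point estimates in that order, per the Value section):

cat(sprintf("ACE (true) = %1.2f, bounds = [%1.2f, %1.2f]\n",
            effect_pop$effect_real, sol_iv$bound[1], sol_iv$bound[2]))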
