covsearch(problem, max_set = 12, min_only = TRUE, prior_ind = 0.5, prior_table = 10, cred_calc = FALSE, M = 1000, stop_at_first = FALSE, pop_solve = FALSE, verbose = FALSE)
problem: a cfx problem instance for the ACE of a given treatment $X$ on a given outcome $Y$.
cred_calc: if TRUE, compute conditional credible intervals for the ACE of the highest scoring model.
stop_at_first: if TRUE, stop as soon as some witness is found.
pop_solve: if TRUE, assume we know the population graph in problem instead of data.
verbose: if TRUE, print out more detailed information while running the procedure.
The function returns problem plus the following items:
witness
Z: Z[[i]] is the $i$-th array containing the indices of the variables in the admissible set corresponding to witness witness[i].
witness_score
hw
hZ
ACEs
ACEs_post: ... hW and hZ.
The method assumes that the variables in problem (other than problem$X_idx and problem$Y_idx) are covariates which causally precede treatment and outcome. It then applies the
faithfulness condition of Spirtes, Glymour and Scheines (2000, Causation, Prediction and Search, MIT Press)
to derive an admissible set: a set of covariates which removes all confounding between treatment and outcome
when adjusted for.
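As an illustration of what "adjusting for" an admissible set means, here is a minimal sketch of the standard adjustment formula for binary treatment and outcome with a single discrete covariate. It is not the package's internal code: the helper name ace_adjust and the toy data are hypothetical, and the covariate z is admissible only because the simulation is built that way.

```r
# Hypothetical helper (not part of the package): ACE of binary x on binary y,
# adjusting for one discrete covariate z.
# ACE = sum_z [ P(y = 1 | x = 1, z) - P(y = 1 | x = 0, z) ] * P(z)
ace_adjust <- function(x, y, z) {
  ace <- 0
  for (zv in unique(z)) {
    in_z <- z == zv
    p_z <- mean(in_z)                      # P(z = zv)
    p_y1_x1 <- mean(y[in_z & x == 1])      # P(y = 1 | x = 1, z = zv)
    p_y1_x0 <- mean(y[in_z & x == 0])      # P(y = 1 | x = 0, z = zv)
    ace <- ace + (p_y1_x1 - p_y1_x0) * p_z
  }
  ace
}

# Toy data: z confounds x and y; the effect of x on y is 0.3 by construction.
set.seed(1)
n <- 5000
z <- rbinom(n, 1, 0.5)
x <- rbinom(n, 1, ifelse(z == 1, 0.8, 0.2))
y <- rbinom(n, 1, 0.2 + 0.3 * x + 0.4 * z)

ace_hat <- ace_adjust(x, y, z)                    # adjusts for z
naive_hat <- mean(y[x == 1]) - mean(y[x == 0])    # adjusts for nothing
cat(sprintf("adjusted = %1.2f, unadjusted = %1.2f\n", ace_hat, naive_hat))
```

In this simulation the unadjusted contrast absorbs part of the effect of z on y, so it overstates the ACE, while the adjusted estimate should land near 0.3.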
The necessary and sufficient conditions for finding an admissible set using the faithfulness assumption were
discussed by Entner, Hoyer and Spirtes (2013, JMLR W&CP, vol. 31, 256--264). In order for a set to be proved
an admissible set, some auxiliary variable in the covariate set is necessary - we call this variable a "witness."
See Entner et al. for details. It is possible that no witness exists, in which case the function returns an
empty solution. Multiple witness/admissible set pairs might exist. The criterion for finding a witness/admissible set
pair requires the testing of conditional independence constraints. The test is done by performing Bayesian model selection
with a Dirichlet prior over the contingency table of the variables in problem using the effective sample size
hyperparameter prior_table, and a prior probability of the independence hypothesis using the hyperparameter
prior_ind.
For each witness/admissible set pair that passes this criterion, the function reports the posterior expected value of the implied ACE, computed by plugging in the posterior expected value of the contingency table as an estimate of the joint distribution. For one particular witness/admissible set pair, chosen according to the best fit to the conditional independencies required by the criterion of Entner et al. (see also Silva and Evans, 2014, NIPS 298-306), we calculate the posterior distribution of the ACE. This posterior does not take into account the uncertainty in the choice of witness/admissible set, but instead is the conditional posterior given this choice.
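The Bayesian independence test described in this section can be sketched with Dirichlet-multinomial marginal likelihoods. This is only an illustration under stated assumptions, not the package's implementation: the helper names log_dirmult, post_ind and make_rows are hypothetical, variables are assumed binary and coded 0/1, and splitting the effective sample size evenly across strata of the conditioning variable is a choice made here for simplicity.

```r
# Log marginal likelihood of a count vector under a symmetric Dirichlet prior
# whose pseudo-counts sum to `ess` (the effective sample size).
log_dirmult <- function(counts, ess) {
  alpha <- rep(ess / length(counts), length(counts))
  lgamma(sum(alpha)) - lgamma(sum(alpha) + sum(counts)) +
    sum(lgamma(alpha + counts) - lgamma(alpha))
}

# Hypothetical helper: posterior probability that a is independent of b given s.
# Arguments prior_ind and prior_table mirror the hyperparameters named above.
post_ind <- function(a, b, s, prior_ind = 0.5, prior_table = 10) {
  strata <- unique(s)
  ess <- prior_table / length(strata)  # assumption: ESS split evenly over strata
  log_ind <- 0
  log_dep <- 0
  for (sv in strata) {
    tab <- table(factor(a[s == sv], levels = 0:1),
                 factor(b[s == sv], levels = 0:1))
    # Independence model: separate Dirichlet priors on the two margins
    log_ind <- log_ind + log_dirmult(rowSums(tab), ess) +
      log_dirmult(colSums(tab), ess)
    # Dependence model: one Dirichlet prior over the four joint cells
    log_dep <- log_dep + log_dirmult(as.vector(tab), ess)
  }
  log_odds <- log(prior_ind) - log(1 - prior_ind) + log_ind - log_dep
  plogis(log_odds)  # posterior probability of the independence hypothesis
}

# Toy data built from exact cell counts, 1000 rows per stratum of s.
# Counts are given in the order (a, b) = (0,0), (1,0), (0,1), (1,1).
make_rows <- function(s, counts) {
  data.frame(s = s, a = rep(c(0, 1, 0, 1), counts), b = rep(c(0, 0, 1, 1), counts))
}
indep <- rbind(make_rows(0, c(240, 360, 160, 240)),  # cells = product of margins
               make_rows(1, c(360, 240, 240, 160)))
dep <- rbind(make_rows(0, c(450, 50, 50, 450)),      # a and b mostly agree
             make_rows(1, c(450, 50, 50, 450)))

p_indep <- post_ind(indep$a, indep$b, indep$s)
p_dep <- post_ind(dep$a, dep$b, dep$s)
```

With these toy tables the posterior probability of independence is high for the first data set and essentially zero for the second. Raising prior_ind shifts the posterior odds toward independence by the same factor in every test.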
The search for a witness/admissible set pair uses brute force: for each witness, all subsets of the remaining
covariates are evaluated as candidate admissible sets. If there are too many covariates (more than max_set), only a filtered set
of size max_set is considered for each witness. The set is chosen by first scoring each covariate by its empirical mutual
information with the witness given problem$X_idx and picking the top max_set elements, to which a brute-force
search is then applied.
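The filtering and enumeration steps above can be sketched as follows. This is a rough illustration rather than the package's code: the helpers cond_mi and candidate_sets are hypothetical, covariates are referred to by column name instead of by index, and the empty set is included among the candidates as an assumption.

```r
# Empirical conditional mutual information I(a; b | cnd) for discrete vectors, in nats
cond_mi <- function(a, b, cnd) {
  p_abc <- table(a, b, cnd) / length(a)
  p_c <- apply(p_abc, 3, sum)
  p_ac <- apply(p_abc, c(1, 3), sum)
  p_bc <- apply(p_abc, c(2, 3), sum)
  mi <- 0
  for (i in seq_len(dim(p_abc)[1])) {
    for (j in seq_len(dim(p_abc)[2])) {
      for (k in seq_len(dim(p_abc)[3])) {
        p <- p_abc[i, j, k]
        if (p > 0) mi <- mi + p * log(p * p_c[[k]] / (p_ac[i, k] * p_bc[j, k]))
      }
    }
  }
  mi
}

# Hypothetical helper: candidate admissible sets for witness w.
# data: data frame of discrete variables; w, x: column names of witness and
# treatment; covs: names of all covariates (the witness included).
candidate_sets <- function(data, w, x, covs, max_set = 12) {
  rest <- setdiff(covs, w)
  if (length(rest) > max_set) {
    # Keep the max_set covariates with the largest mutual information
    # with the witness given the treatment
    score <- sapply(rest, function(v) cond_mi(data[[v]], data[[w]], data[[x]]))
    rest <- rest[order(score, decreasing = TRUE)][seq_len(max_set)]
  }
  subsets <- list(character(0))  # assumption: the empty set is a candidate
  for (k in seq_along(rest)) {
    subsets <- c(subsets, combn(rest, k, simplify = FALSE))
  }
  subsets
}

# Toy data: z1 agrees with the witness w about 90% of the time, z2 and z3 are noise
set.seed(3)
n <- 1000
x <- rbinom(n, 1, 0.5)
w <- rbinom(n, 1, 0.5)
z1 <- ifelse(runif(n) < 0.9, w, 1 - w)
z2 <- rbinom(n, 1, 0.5)
z3 <- rbinom(n, 1, 0.5)
toy <- data.frame(x, w, z1, z2, z3)

sets <- candidate_sets(toy, w = "w", x = "x",
                       covs = c("w", "z1", "z2", "z3"), max_set = 2)
```

A filtered set of size max_set yields 2^max_set candidate subsets per witness, which is why the default cap is kept small.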
http://papers.nips.cc/paper/5602-causal-inference-through-a-witness-protection-program
## Generate a synthetic problem
problem <- simulateWitnessModel(p = 4, q = 4, par_max = 3, M = 1000)
## Idealized case: suppose we know the true distribution,
## get "exact" ACE estimands for different adjustment sets
sol_pop <- covsearch(problem, pop_solve = TRUE)
effect_pop <- synthetizeCausalEffect(problem)
cat(sprintf(
"ACE (true) = %1.2f\nACE (adjusting for all) = %1.2f\nACE (adjusting for nothing) = %1.2f\n",
effect_pop$effect_real, effect_pop$effect_naive, effect_pop$effect_naive2))
## Perform inference and report results
covariate_hat <- covsearch(problem, cred_calc = TRUE, M = 1000)
summary(covariate_hat)