interference: Estimate Causal Effects in presence of interference

Description

Estimate Causal Effects in presence of interference

Usage

interference(
  formula,
  propensity_integrand = "logit_integrand",
  loglihood_integrand = propensity_integrand,
  allocations,
  data,
  model_method = "glmer",
  model_options = list(family = stats::binomial(link = "logit")),
  causal_estimation_method = "ipw",
  causal_estimation_options = list(variance_estimation = "robust"),
  conf.level = 0.95,
  rescale.factor = 1,
  integrate_allocation = TRUE,
  runSilent = TRUE,
  ...
)

Arguments

formula

The formula used to define the causal model. Has a minimum of 4 parts, separated by | and ~ in a specific structure: outcome | exposure ~ propensity covariates | group. The order matters, and the pipes split the data frame into corresponding pieces. The part separated by ~ is passed to the chosen model_method used to estimate or fix propensity parameters.
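As a minimal sketch of the structure described above, the formulas below use hypothetical column names (y, A, X1, X2, group) that are not part of this documentation. They only illustrate how the | and ~ separators are arranged, using base R.

```r
# Hypothetical column names: y (outcome), A (exposure), X1 and X2
# (propensity covariates), group (grouping variable).
f_fixed <- y | A ~ X1 + X2 | group

# Same structure with a random intercept for the group; the grouping
# variable then appears twice.
f_random <- y | A ~ X1 + X2 + (1 | group) | group

# Both are ordinary R formula objects, so base tools can inspect them.
class(f_fixed)
all.vars(f_random)
```

Because ~ binds less tightly than | in R, everything to the left of ~ is parsed as one expression and everything to the right as another, which is what lets the pieces be split apart afterwards.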

propensity_integrand

A function, which may be created by the user, used to compute the IP weights. This defaults to logit_integrand, which calculates the product of inverse logits for individuals in a group: $\prod_{j = 1}^{n_i} \{r \times h_{ij}(b_i)\}^{A_{ij}}\{1 - r \times h_{ij}(b_i)\}^{1 - A_{ij}} f_b(b_i; \theta_s)$ where $$h_{ij}(b_i) = \operatorname{logit}^{-1} (\mathbf{X}_{ij}\theta_a + b_i)$$ and $b_i$ is a group-level random effect, $f_b$ is a $N(0, \theta_s)$ density, and $r$ is a known randomization probability, which may be useful if a participation vector is included in the formula. If no random effect was included in the formula, logit_integrand essentially ignores the random effect and $f_b(b_i; \theta_s)$ integrates to 1. See Details for arguments that can be passed to logit_integrand.
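The formula above can be sketched in base R as follows. This is an illustrative re-implementation, not the package's logit_integrand: the function name sketch_integrand and all input values are made up for the example. It reads the first factor as $(r \times h_{ij})^{A_{ij}}$, so that each term is a Bernoulli likelihood, and it assumes $\theta_s$ is the variance of the random effect.

```r
# Illustrative sketch only; NOT the package's logit_integrand().
# b       : vector of random-effect values (integrate() passes several at once)
# X       : design matrix for one group (rows = individuals)
# A       : 0/1 exposure vector for the group
# theta_a : fixed-effect coefficients
# theta_s : random-effect variance (assumption: variance, not sd)
# r       : known randomization probability
sketch_integrand <- function(b, X, A, theta_a, theta_s, r = 1) {
  linear_part <- drop(X %*% theta_a)
  vapply(b, function(b_i) {
    h <- stats::plogis(linear_part + b_i)
    prod((r * h)^A * (1 - r * h)^(1 - A)) *
      stats::dnorm(b_i, mean = 0, sd = sqrt(theta_s))
  }, numeric(1))
}

# Made-up inputs for a single group of three individuals.
X <- cbind(1, c(-0.5, 0.3, 1.2))
A <- c(1, 0, 1)
theta_a <- c(0.2, -0.4)
theta_s <- 0.5

# Integrate the random effect out over the real line.
group_likelihood <- stats::integrate(
  function(b) sketch_integrand(b, X, A, theta_a, theta_s, r = 1),
  lower = -Inf, upper = Inf
)$value
group_likelihood
```

A quick sanity check on a sketch like this is that the integrated values, summed over every possible exposure vector for the group, should be close to 1.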

loglihood_integrand

A function, which may be created by the user, that defines the log likelihood of the logit model used for robust variance estimation. Generally, this will be the same function as propensity_integrand. Indeed, this is the default.

allocations

a vector of values in (0, 1). Increasing the number of elements of the allocation vector greatly increases computation time; however, a larger number of allocations will make plots look nicer. A minimum of two allocations is required.

data

the analysis data frame. This must include all the variables defined in the formula.

model_method

the method used to estimate or set the propensity model parameters. Must be one of 'glm', 'glmer', or 'oracle'. Defaults to 'glmer'. For a fixed-effects-only model use 'glm'; to include random effects use 'glmer'. logit_integrand only supports a single random effect for the grouping variable, so if more random effects are included in the model, different propensity_integrand and loglihood_integrand functions should be defined. When the propensity parameters are known (as in simulations) or are estimated by other methods, use the 'oracle' option. See model_options for details on how to pass the oracle parameters.

model_options

a list of options passed to the function in model_method. Defaults to list(family = binomial(link = 'logit')). When model_method = 'oracle', the list must have two elements: (1) fixed.effects and (2) random.effects. If the model did not include random effects, set random.effects = NULL.

causal_estimation_method

currently only supports 'ipw'.

causal_estimation_options

A list. Current options are: (1) variance_estimation is either 'naive' or 'robust'. See details. Defaults to 'robust'.

conf.level

level for confidence intervals. Defaults to 0.95.

rescale.factor

a scalar multiplication factor by which to rescale outcomes and effects. Defaults to 1.

integrate_allocation

Indicator of whether the integrand function uses the allocation parameter. Defaults to TRUE.

runSilent

if FALSE, the status of computations is printed to the console. Defaults to TRUE.

...

Used to pass additional arguments to internal functions such as numDeriv::grad() or integrate(). Additionally, arguments can be passed to the propensity_integrand and loglihood_integrand functions.

Value

Returns a list of overall and group-level IPW point estimates, overall and group-level IPW point estimates (using the weight derivatives), derivatives of the loglihood, the computed weight matrix, the computed weight derivative array, and a summary.

Details

The following formula includes a random effect for the group: outcome | exposure ~ propensity covariates + (1|group) | group. In this instance, the group variable appears twice. If the study design includes a "participation" variable, this is easily added to the formula: outcome | exposure | participation ~ propensity covariates | group.

logit_integrand has two options that can be passed via the ... argument:

randomization: a scalar. This is the $r$ in the logit_integrand formula given under propensity_integrand. It defaults to 1 in the case that a participation vector is not included. The vaccine study example demonstrates use of this argument.
integrate_allocation: TRUE/FALSE. When group sizes grow large (over 1000), the product term of logit_integrand tends quickly to 0. When set to TRUE, the IP weights tend less quickly to 0. Defaults to FALSE.
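A hedged sketch of passing these options through ... follows. It assumes the inferference package is installed and that its bundled vaccinesim example data has columns y, A, B, X1, X2, and group. The allocation values and the randomization value of 2/3 are chosen for illustration only. The call is skipped when the package is unavailable.

```r
# Allocations at which effects are estimated (at least two, each in (0, 1)).
allocations <- c(0.3, 0.45, 0.6)

# Run only when the package is installed. The dataset name and its column
# names (y, A, B, X1, X2, group) are assumptions about the package's
# example data, not something stated in this help page.
if (requireNamespace("inferference", quietly = TRUE)) {
  library(inferference)
  data("vaccinesim")

  fit <- interference(
    formula = y | A | B ~ X1 + X2 + (1 | group) | group,
    allocations = allocations,
    data = vaccinesim,
    randomization = 2 / 3,  # r in logit_integrand, passed through ...
    method = "simple"       # forwarded to numDeriv::grad()
  )

  # Inspect which components the returned list contains.
  names(fit)
}
```

Here B plays the role of the participation variable, so randomization supplies the known probability $r$ rather than leaving it at its default of 1.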

If the true propensity model is known (e.g. in simulations), use variance_estimation = 'naive'; otherwise, use the default variance_estimation = 'robust'. Refer to the web appendix of Perez-Heydrich et al. (2014) (10.1111/biom.12184) for complete details.

References

Saul, B., & Hudgens, M. G. (2017). A Recipe for inferference: Start with Causal Inference. Add Interference. Mix Well with R. Journal of Statistical Software, 82(2), 1-21. 10.18637/jss.v082.i02

Perez-Heydrich, C., Hudgens, M. G., Halloran, M. E., Clemens, J. D., Ali, M., & Emch, M. E. (2014). Assessing effects of cholera vaccination in the presence of interference. Biometrics, 70(3), 731-741.

Tchetgen Tchetgen, E. J., & VanderWeele, T. J. (2012). On causal inference in the presence of interference. Statistical Methods in Medical Research, 21(1), 55-75.