ivbounds: Bounds for counterfactual outcome probabilities in instrumental variables scenarios

Description

ivbounds computes non-parametric bounds for counterfactual outcome probabilities in instrumental variables scenarios. Let $Y$, $X$, and $Z$ be the outcome, exposure, and instrument, respectively. $Y$ and $X$ must be binary, whereas $Z$ can be either binary or ternary. Ternary instruments are common in, for instance, Mendelian randomization. Let $p(Y_x=1)$ be the counterfactual probability of the outcome, had all subjects been exposed to level $x$. ivbounds computes bounds for the counterfactual probabilities $p(Y_1=1)$ and $p(Y_0=1)$. Below, we define $p_{yx.z}=p(Y=y,X=x|Z=z)$.
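As an illustration of the definition of $p_{yx.z}$, the following base R sketch tabulates the conditional probabilities from a data frame. The helper name cond_probs and its arguments are hypothetical and are not part of ivtools; it only assumes that the three variables are coded as described under Arguments.

```r
# Hypothetical helper (not part of ivtools): computes
# p_{yx.z} = p(Y = y, X = x | Z = z) from a data frame,
# optionally using one weight (e.g. a cell count) per row.
cond_probs <- function(data, Z, X, Y, weights = rep(1, nrow(data))) {
  d <- data.frame(y = data[[Y]], x = data[[X]], z = data[[Z]], w = weights)
  # Weighted counts for every (y, x, z) combination
  tab <- xtabs(w ~ y + x + z, data = d)
  # Divide by the total within each level of z (third margin)
  prob <- as.data.frame(prop.table(tab, margin = 3))
  # Name the entries p<y><x>.<z>, e.g. "p10.0"
  setNames(prob$Freq, paste0("p", prob$y, prob$x, ".", prob$z))
}

# Toy data: four observed cells with illustrative weights
d <- data.frame(Y = c(0, 1, 1, 0), X = c(0, 0, 1, 1), Z = c(0, 0, 1, 1))
cond_probs(d, Z = "Z", X = "X", Y = "Y", weights = c(3, 1, 2, 2))
```

Within each level of $Z$ the four probabilities sum to one, so a binary instrument gives a vector whose entries sum to two.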

Usage

ivbounds(data, Z, X, Y, monotonicity=FALSE, weights)

Arguments

data

either a data frame containing the variables in the model, or a named vector (p00.0,...,p11.1) when $Z$ is binary, or a named vector (p00.0,...,p11.2) when $Z$ is ternary.

Z

a string containing the name of the instrument $Z$ in data if data is a data frame. In this case $Z$ has to be coded as (0,1) when binary, and as (0,1,2) when ternary. Z is not specified if data is a vector of probabilities.

X

a string containing the name of the exposure $X$ in data if data is a data frame. In this case $X$ has to be coded as (0,1). X is not specified if data is a vector of probabilities.

Y

a string containing the name of the outcome $Y$ in data if data is a data frame. In this case $Y$ has to be coded as (0,1). Y is not specified if data is a vector of probabilities.

monotonicity

logical. It is sometimes realistic to make the monotonicity assumption $z \geq z' \Rightarrow X_z \geq X_{z'}$. Should the bounds be computed under this assumption?

weights

an optional vector of "prior weights" to be used in the fitting process. Should be NULL or a numeric vector. Only applicable if data is a data frame.

Value

An object of class "ivbounds" is a list containing

call

the matched call.

p0

a named vector with elements "min" and "max", containing the evaluated lower and upper bounds for $p(Y_0=1)$, respectively.

p1

a named vector with elements "min" and "max", containing the evaluated lower and upper bounds for $p(Y_1=1)$, respectively.

p0.symbolic

a named vector with elements "min" and "max", containing the lower and upper bounds for $p(Y_0=1)$, respectively, in symbolic form (i.e. as strings).

p1.symbolic

a named vector with elements "min" and "max", containing the lower and upper bounds for $p(Y_1=1)$, respectively, in symbolic form (i.e. as strings).

IVinequality

logical. Does the IV inequality hold?

conditions

a character vector containing the violated conditions, if IVinequality=FALSE.

Details

ivbounds uses linear programming techniques to bound the counterfactual probabilities $p(Y_1=1)$ and $p(Y_0=1)$. Bounds for a causal effect, defined as a contrast between these, are obtained by plugging the bounds for $p(Y_1=1)$ and $p(Y_0=1)$ into the contrast. For instance, bounds for the causal risk difference $p(Y_1=1)-p(Y_0=1)$ are obtained as $[\min\{p(Y_1=1)\}-\max\{p(Y_0=1)\},\max\{p(Y_1=1)\}-\min\{p(Y_0=1)\}]$. In addition to the bounds, ivbounds evaluates the IV inequality $$\max\limits_{x}\sum_{y}\max\limits_{z}p_{yx.z}\leq 1.$$
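The plug-in step for the risk difference and the IV inequality can be sketched in base R as follows. Both helpers, rd_bounds and iv_inequality_holds, are hypothetical and not part of ivtools; the numerical tolerance tol is an added assumption to guard against rounding when the left-hand side equals 1 exactly, and the input values are illustrative rather than taken from any data set.

```r
# Hypothetical helper (not part of ivtools): risk difference bounds
# from "min"/"max" vectors shaped like the p0 and p1 elements.
rd_bounds <- function(p0, p1) {
  c(min = unname(p1["min"] - p0["max"]),
    max = unname(p1["max"] - p0["min"]))
}

# Hypothetical helper (not part of ivtools): evaluates
# max_x sum_y max_z p_{yx.z} <= 1 for a vector named p<y><x>.<z>.
iv_inequality_holds <- function(p, tol = 1e-12) {
  nm <- names(p)
  y <- substr(nm, 2, 2)  # outcome level from the name
  x <- substr(nm, 3, 3)  # exposure level from the name
  # Maximum over z within each (y, x) cell: rows are y, columns are x
  max_z <- tapply(p, list(y = y, x = x), max)
  # Sum over y within each x, then take the maximum over x
  max(colSums(max_z)) <= 1 + tol
}

# Illustrative bounds for p(Y_0 = 1) and p(Y_1 = 1)
rd_bounds(p0 = c(min = 0.2, max = 0.5), p1 = c(min = 0.3, max = 0.9))

# Illustrative probabilities that violate the inequality:
# for x = 0 the left-hand side is 0.9 + 0.9 = 1.8
p_bad <- c(p00.0 = 0.90, p00.1 = 0.05, p01.0 = 0.05, p01.1 = 0.05,
           p10.0 = 0.05, p10.1 = 0.90, p11.0 = 0.00, p11.1 = 0.00)
iv_inequality_holds(p_bad)
```

Taking the lower bound of one probability against the upper bound of the other always gives a valid interval for the contrast. It is not guaranteed to be the narrowest possible one.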

References

Balke, A. and Pearl, J. (1997). Bounds on treatment effects from studies with imperfect compliance. Journal of the American Statistical Association 92(439), 1171-1176.

Sjölander, A. and Martinussen, T. (2019). Instrumental variable estimation with the R package ivtools. Epidemiologic Methods 8(1), 1-20.

Examples


# NOT RUN {
library(ivtools)

## Vitamin A example from Balke and Pearl (1997).
## Counts are named n<y><x><z>: outcome Y, exposure X, instrument Z.
n000 <- 74
n001 <- 34
n010 <- 0
n011 <- 12
n100 <- 11514
n101 <- 2385
n110 <- 0
n111 <- 9663
n0 <- n000+n010+n100+n110
n1 <- n001+n011+n101+n111

#with data frame...
data <- data.frame(Y=c(0,0,0,0,1,1,1,1), X=c(0,0,1,1,0,0,1,1), 
  Z=c(0,1,0,1,0,1,0,1))
n <- c(n000, n001, n010, n011, n100, n101, n110, n111)
b <- ivbounds(data=data, Z="Z", X="X", Y="Y", weights=n)
summary(b)

#...or with vector of probabilities
p <- n/rep(c(n0, n1), 4)
names(p) <- c("p00.0", "p00.1", "p01.0", "p01.1", 
  "p10.0", "p10.1", "p11.0", "p11.1") 
b <- ivbounds(data=p)
summary(b)



# }
