NoncompLI: Bayesian Analysis of Randomized Experiments with Noncompliance and Missing Outcomes Under the Assumption of Latent Ignorability

Description

This function estimates average causal effects for randomized experiments with noncompliance and missing outcomes under the assumption of latent ignorability (Frangakis and Rubin, 1999). The models are Bayesian generalized linear models fitted using Markov chain Monte Carlo algorithms. Various types of outcome variables can be analyzed to estimate the Intention-to-Treat effect and the Complier Average Causal Effect.

Usage

NoncompLI(
  formulae,
  Z,
  D,
  data = parent.frame(),
  n.draws = 5000,
  param = TRUE,
  in.sample = FALSE,
  model.c = "probit",
  model.o = "probit",
  model.r = "probit",
  tune.c = 0.01,
  tune.o = 0.01,
  tune.r = 0.01,
  tune.v = 0.01,
  p.mean.c = 0,
  p.mean.o = 0,
  p.mean.r = 0,
  p.prec.c = 0.001,
  p.prec.o = 0.001,
  p.prec.r = 0.001,
  p.df.o = 10,
  p.scale.o = 1,
  p.shape.o = 1,
  mda.probit = TRUE,
  coef.start.c = 0,
  coef.start.o = 0,
  tau.start.o = NULL,
  coef.start.r = 0,
  var.start.o = 1,
  burnin = 0,
  thin = 0,
  verbose = TRUE
)

Arguments

formulae

A list of three formulae. The first formula specifies the (pre-treatment) covariates in the outcome model (the latent compliance covariate will be added automatically), the second formula specifies the compliance model, and the third formula specifies the covariates for the model of the missing-data mechanism (the latent compliance covariate will be added automatically). For the outcome model, use the standard two-sided R formula, with the outcome variable on the left-hand side separated by ~ from the covariates on the right-hand side, e.g., y ~ x1 + x2. For the compliance and missing-data mechanism models, use a one-sided formula with the left-hand side left unspecified, e.g., ~ x1 + x2.

Z

A randomized encouragement variable, which should be a binary variable in the specified data frame.

D

A treatment variable, which should be a binary variable in the specified data frame.

data

A data frame which contains the variables that appear in the model formulae (formulae), the encouragement variable (Z), and the treatment variable (D).

n.draws

The number of MCMC draws. The default is 5000.

param

A logical variable indicating whether the Monte Carlo draws of the model parameters should be saved in the output object. The default is TRUE.

in.sample

A logical variable indicating whether or not the sample average causal effect should be calculated using the observed potential outcome for each unit. If it is set to FALSE, then the population average causal effect will be calculated. The default is FALSE.

model.c

The model for compliance. Either logit or probit model is allowed. The default is probit.

model.o

The model for the outcome. The following six models are allowed: logit, probit, oprobit (ordered probit regression), gaussian (gaussian regression), negbin (negative binomial regression), and twopart (a two-part model in which the first part is a probit regression for \(\Pr(Y > 0 \mid X)\) and the second part models \(p(\log(Y) \mid X, Y > 0)\) using gaussian regression). The default is probit.

model.r

The model for (non)response. Either logit or probit model is allowed. The default is probit.

tune.c

Tuning constants for fitting the compliance model. These positive constants are used to tune the (random-walk) Metropolis-Hastings algorithm to fit the logit model. Use either a scalar or a vector of constants whose length equals that of the coefficient vector. The default is 0.01.

tune.o

Tuning constants for fitting the outcome model. These positive constants are used to tune the (random-walk) Metropolis-Hastings algorithm to fit logit, ordered probit, and negative binomial models. Use either a scalar or a vector of constants whose length equals that of the coefficient vector for logit and negative binomial models. For the ordered probit model, use either a scalar or a vector of constants whose length equals that of cut-point parameters to be estimated. The default is 0.01.

tune.r

Tuning constants for fitting the (non)response model. These positive constants are used to tune the (random-walk) Metropolis-Hastings algorithm to fit the logit model. Use either a scalar or a vector of constants whose length equals that of the coefficient vector. The default is 0.01.

tune.v

A scalar tuning constant for fitting the variance component of the negative binomial (outcome) model. The default is 0.01.

p.mean.c

Prior mean for the compliance model. It should be either a scalar or a vector of appropriate length. The default is 0.

p.mean.o

Prior mean for the outcome model. It should be either a scalar or a vector of appropriate length. The default is 0.

p.mean.r

Prior mean for the (non)response model. It should be either a scalar or a vector of appropriate length. The default is 0.

p.prec.c

Prior precision for the compliance model. It should be either a positive scalar or a positive semi-definite matrix of appropriate size. The default is 0.001.

p.prec.o

Prior precision for the outcome model. It should be either a positive scalar or a positive semi-definite matrix of appropriate size. The default is 0.001.

p.prec.r

Prior precision for the (non)response model. It should be either a positive scalar or a positive semi-definite matrix of appropriate size. The default is 0.001.

p.df.o

A positive integer. Prior degrees of freedom parameter for the inverse chisquare distribution in the gaussian and twopart (outcome) models. The default is 10.

p.scale.o

A positive scalar. Prior scale parameter for the inverse chisquare distribution (for the variance) in the gaussian and twopart (outcome) models. For the negative binomial (outcome) model, this is used for the scale parameter of the inverse gamma distribution. The default is 1.

p.shape.o

A positive scalar. Prior shape parameter for the inverse gamma distribution in the negative binomial (outcome) model. The default is 1.

mda.probit

A logical variable indicating whether to use marginal data augmentation for probit models. The default is TRUE.

coef.start.c

Starting values for coefficients of the compliance model. It should be either a scalar or a vector of appropriate length. The default is 0.

coef.start.o

Starting values for coefficients of the outcome model. It should be either a scalar or a vector of appropriate length. The default is 0.

tau.start.o

Starting values for thresholds of the ordered probit (outcome) model. If it is set to NULL, then the starting values will be a sequence starting from 0 and then incrementing by 0.1. The default is NULL.

coef.start.r

Starting values for coefficients of the (non)response model. It should be either a scalar or a vector of appropriate length. The default is 0.

var.start.o

A positive scalar starting value for the variance of the gaussian, negative binomial, and twopart (outcome) models. The default is 1.

burnin

The number of initial burn-in draws for the Markov chain. The default is 0.

thin

The size of the thinning interval for the Markov chain. The default is 0.

verbose

A logical variable indicating whether additional progress reports should be printed while running the code. The default is TRUE.
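The arguments above can be put together as in the following minimal sketch. The simulated data, the column names (y, x1, x2, Z, D), and the chosen values of n.draws and burnin are illustrative assumptions, not part of the package. The sketch also assumes that Z and D can be supplied as unquoted column names of data, and it only calls NoncompLI if the experiment package is installed.

```r
set.seed(1)
n <- 500

# Hypothetical pre-treatment covariates
x1 <- rnorm(n)
x2 <- rbinom(n, 1, 0.5)

# Randomized binary encouragement
Z <- rbinom(n, 1, 0.5)

# Latent compliance type; one-sided noncompliance (no always-takers)
# is assumed here to keep the simulation simple
complier <- rbinom(n, 1, pnorm(0.3 + 0.5 * x1))
D <- Z * complier  # treated only if encouraged and a complier

# Binary outcome, matching the default model.o = "probit"
y <- rbinom(n, 1, pnorm(-0.2 + 0.4 * x1 + 0.3 * x2 + 0.5 * D))

# Outcomes go missing with a probability that depends on compliance type
observed <- rbinom(n, 1, pnorm(0.8 + 0.3 * complier))
y[observed == 0] <- NA

dat <- data.frame(y = y, x1 = x1, x2 = x2, Z = Z, D = D)

formulae <- list(
  y ~ x1 + x2,  # outcome model (two-sided)
  ~ x1 + x2,    # compliance model (one-sided)
  ~ x1 + x2     # missing-data mechanism model (one-sided)
)

# Fit only when the package is available; all other arguments keep
# their documented defaults
fit <- NULL
if (requireNamespace("experiment", quietly = TRUE)) {
  fit <- experiment::NoncompLI(formulae, Z = Z, D = D, data = dat,
                               n.draws = 1000, burnin = 200,
                               verbose = FALSE)
}
```

Missing outcomes are coded as NA in the outcome column, while the covariates, Z, and D are fully observed in this sketch.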

Value

An object of class NoncompLI which contains the following elements as a list:

call

The matched call.

Y

The outcome variable.

D

The treatment variable.

Z

The (randomized) encouragement variable.

R

The response indicator variable for Y.

A

The indicator variable for (known) always-takers, i.e., the control units who received the treatment.

C

The indicator variable for (known) compliers, i.e., the encouraged units who received the treatment when there are no always-takers.

Xo

The matrix of covariates used for the outcome model.

Xc

The matrix of covariates used for the compliance model.

Xr

The matrix of covariates used for the (non)response model.

n.draws

The number of MCMC draws.

QoI

The Monte Carlo draws of quantities of interest from their posterior distributions. Quantities of interest include ITT (the intention-to-treat effect), CACE (the complier average causal effect), Y1barC (the mean outcome value under the treatment for compliers), Y0barC (the mean outcome value under the control for compliers), YbarN (the mean outcome value for never-takers), YbarA (the mean outcome value for always-takers), pC (the proportion of compliers), pN (the proportion of never-takers), and pA (the proportion of always-takers).

If param is set to TRUE, the following elements are also included:

coefO

The Monte Carlo draws of coefficients of the outcome model from their posterior distribution.

coefO1

If model.o = "twopart", this element contains the Monte Carlo draws of coefficients of the outcome model for \(p(\log(Y) \mid X, Y > 0)\) from their posterior distribution.

coefC

The Monte Carlo draws of coefficients of the compliance model from their posterior distribution.

coefA

If always-takers exist, then this element contains the Monte Carlo draws of coefficients of the compliance model for always-takers from their posterior distribution.

coefR

The Monte Carlo draws of coefficients of the (non)response model from their posterior distribution.

sig2

The Monte Carlo draws of the variance parameter for the gaussian, negative binomial, and twopart (outcome) models.
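Posterior draws such as those stored in QoI are typically summarized by posterior means and credible intervals. The following is a minimal sketch of such a summary. It assumes the draws can be coerced to a matrix with one named column per quantity, which this page does not confirm; the helper summarize_draws is hypothetical, and the placeholder matrix holds simulated numbers used only to make the example runnable, not results from NoncompLI.

```r
# Hypothetical helper: posterior mean, standard deviation, and an
# equal-tailed credible interval for each column of a matrix of draws
summarize_draws <- function(draws, probs = c(0.025, 0.975)) {
  draws <- as.matrix(draws)
  t(apply(draws, 2, function(d) {
    c(mean = mean(d), sd = sd(d), quantile(d, probs = probs))
  }))
}

# Placeholder draws standing in for two quantities of interest
# (simulated numbers, not output of NoncompLI)
set.seed(2)
placeholder <- cbind(
  ITT  = rnorm(1000, mean = 0.10, sd = 0.03),
  CACE = rnorm(1000, mean = 0.25, sd = 0.08)
)

qoi_summary <- summarize_draws(placeholder)
print(round(qoi_summary, 3))
```

With a fitted object, the same helper could be applied to the relevant columns of the QoI element, provided that element has the assumed structure.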

Details

For the details of the model being fitted, see the references. Note that when always-takers exist, we fit either two logit or two probit models: the first models whether a unit is a complier or a noncomplier, and the second models whether a unit classified as a noncomplier is an always-taker or a never-taker.
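The two-stage compliance model described above implies probabilities for the three compliance types that sum to one. The sketch below illustrates that arithmetic for the probit case. The function strata_probs, the linear predictors eta_c and eta_a, and the convention that the second model codes always-takers as the "success" category are all illustrative assumptions, not the package's internal code.

```r
# Hypothetical illustration of the nested compliance model:
#   stage 1: Pr(complier)
#   stage 2: Pr(always-taker | noncomplier)
# link is pnorm for probit; plogis could be passed for logit
strata_probs <- function(eta_c, eta_a, link = pnorm) {
  p_complier <- link(eta_c)
  p_always_given_non <- link(eta_a)
  cbind(
    pC = p_complier,
    pA = (1 - p_complier) * p_always_given_non,
    pN = (1 - p_complier) * (1 - p_always_given_non)
  )
}

# Three hypothetical units with different linear predictors
probs <- strata_probs(eta_c = c(-0.5, 0, 1), eta_a = c(0.2, -1, 0))
print(round(probs, 3))

# Each row is a probability distribution over the three types
print(rowSums(probs))
```

When there are no always-takers, the second stage drops out and only the complier versus never-taker model remains.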

References

Frangakis, Constantine E. and Donald B. Rubin. (1999). “Addressing Complications of Intention-to-Treat Analysis in the Combined Presence of All-or-None Treatment Noncompliance and Subsequent Missing Outcomes.” Biometrika, Vol. 86, No. 2, pp. 365-379.

Hirano, Keisuke, Guido W. Imbens, Donald B. Rubin, and Xiao-Hua Zhou. (2000). “Assessing the Effect of an Influenza Vaccine in an Encouragement Design.” Biostatistics, Vol. 1, No. 1, pp. 69-88.

Barnard, John, Constantine E. Frangakis, Jennifer L. Hill, and Donald B. Rubin. (2003). “Principal Stratification Approach to Broken Randomized Experiments: A Case Study of School Choice Vouchers in New York (with Discussion).” Journal of the American Statistical Association, Vol. 98, No. 462, pp. 299-311.

Horiuchi, Yusaku, Kosuke Imai, and Naoko Taniguchi. (2007). “Designing and Analyzing Randomized Experiments: Application to a Japanese Election Survey Experiment.” American Journal of Political Science, Vol. 51, No. 3 (July), pp. 669-687.