mediate: Causal Mediation Analysis

Description

'mediate' is used to estimate various quantities for causal mediation analysis, including average causal mediation effects (indirect effect), average direct effects, proportions mediated, and total effect.

Usage

mediate(model.m, model.y, sims=1000, boot=FALSE, 
        treat="treat.name", mediator="med.name", 
        control=NULL, conf.level=.95, 
        control.value=0, treat.value=1, 
        long=TRUE, dropobs=FALSE, robustSE=FALSE, ...)

Arguments

model.m

a fitted model object for the mediator. Can be of class 'lm', 'polr', 'glm', 'gam', or 'rq'.

model.y

a fitted model object for the outcome. Can be of class 'lm', 'polr', 'glm', 'gam', 'vglm', or 'rq'.

sims

number of Monte Carlo draws for nonparametric bootstrap or quasi-Bayesian approximation.

boot

a logical value. If 'FALSE', a quasi-Bayesian approximation is used for confidence intervals; if 'TRUE', the nonparametric bootstrap is used. Default is 'FALSE'.

conf.level

level of the returned two-sided confidence intervals. Default is to return the 2.5 and 97.5 percentiles of the simulated quantities.

treat

a character string indicating the name of treatment variable used in the models. The treatment can be either binary (integer or a two-valued factor) or continuous (numeric).

mediator

a character string indicating the name of mediator variable used in the models.

control

a character string indicating the name of control group indicator. Only relevant if 'model.y' is of class 'gam'. If provided, 'd0', 'z0' and 'n0' are allowed to differ from 'd1', 'z1' and 'n1', respectively.

control.value

value of the treatment variable used as the control condition. Only relevant when 'treat' is continuous. Default is 0.

treat.value

value of the treatment variable used as the treatment condition. Only relevant when 'treat' is continuous. Default is 1.

long

a logical value. If 'TRUE', the output will contain the entire sets of simulation draws of the average causal mediation effects, direct effects, proportions mediated, and total effect. Default is 'TRUE'.

dropobs

a logical value indicating the behavior when the model frames of 'model.m' and 'model.y' are composed of different observations. If 'TRUE', models will be re-fitted using the intersection of the two data frames. If 'FALSE', an error is returned. Default is 'FALSE'.

robustSE

a logical value. If 'TRUE', heteroskedasticity-consistent standard errors will be used in quasi-Bayesian simulations. Ignored if 'boot' is 'TRUE' or neither 'model.m' nor 'model.y' has a method for vcovHC in the sandwich package.

...

other arguments passed to vcovHC in the sandwich package: typically the 'type' argument. Ignored if 'robustSE' is 'FALSE'.
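
To make the 'boot' and 'sims' arguments more concrete, the following is a minimal base-R sketch of a nonparametric bootstrap for the indirect effect with linear mediator and outcome models. It is not the code used inside 'mediate'; the simulated data, the variable names ('dat', 'med', 'y', 'acme.boot') and the product-of-coefficients shortcut are assumptions made for illustration only.

```r
# Illustrative sketch only: nonparametric bootstrap of the indirect effect
# for two linear models, using hypothetical simulated data.
set.seed(11)
n <- 300
dat <- data.frame(treat = rbinom(n, 1, 0.5))
dat$med <- 0.5 * dat$treat + rnorm(n)
dat$y   <- 0.2 * dat$treat + 0.7 * dat$med + rnorm(n)

sims <- 200                      # number of bootstrap resamples
acme.boot <- numeric(sims)
for (i in seq_len(sims)) {
  idx <- sample.int(n, n, replace = TRUE)    # resample rows with replacement
  d <- dat[idx, ]
  fit.m <- lm(med ~ treat, data = d)         # mediator model on the resample
  fit.y <- lm(y ~ treat + med, data = d)     # outcome model on the resample
  # In the linear, no-interaction case the indirect effect is a product
  # of two coefficients.
  acme.boot[i] <- coef(fit.m)[["treat"]] * coef(fit.y)[["med"]]
}

# Two-sided 95% percentile interval from the bootstrap draws
boot.ci <- quantile(acme.boot, c(0.025, 0.975))
print(boot.ci)
```

Both models are re-fitted on every resample, which is why the bootstrap can take considerably longer than the quasi-Bayesian approximation when 'sims' is large.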

Value

mediate returns an object of class "mediate" (or "mediate.order" if the outcome model used is 'polr'), a list that contains the components listed below. Some of these elements are not available if 'long' is set to 'FALSE' by the user. The function summary (i.e., summary.mediate or summary.mediate.order) can be used to obtain a table of the results. The function plot (i.e., plot.mediate or plot.mediate.order) can be used to produce a plot of the estimated average causal mediation, average direct, and total effects along with their confidence intervals.

d0, d1

point estimates for average causal mediation effects under the control and treatment conditions.

d0.ci, d1.ci

confidence intervals for average causal mediation effects. The confidence level is set at the value specified in 'conf.level'.

d0.sims, d1.sims

vectors of length 'sims' containing simulation draws of average causal mediation effects.

z0, z1

point estimates for average direct effect under the control and treatment conditions.

z0.ci, z1.ci

confidence intervals for average direct effects.

z0.sims, z1.sims

vectors of length 'sims' containing simulation draws of average direct effects.

n0, n1

the "proportions mediated", or the size of the average causal mediation effects relative to the total effect.

n0.ci, n1.ci

confidence intervals for the proportions mediated.

n0.sims, n1.sims

vectors of length 'sims' containing simulation draws of the proportions mediated.

tau.coef

point estimate for total effect.

tau.ci

confidence interval for total effect.

tau.sims

a vector of length 'sims' containing simulation draws of the total effect.

d.avg, z.avg, n.avg

simple averages of d0 and d1, z0 and z1, n0 and n1, respectively, which users may want to use as summary values when those quantities differ.

d.avg.ci, z.avg.ci, n.avg.ci

confidence intervals for the above.

d.avg.sims, z.avg.sims, n.avg.sims

vectors of length 'sims' containing simulation draws of d.avg, z.avg and n.avg, respectively.

boot

logical, the 'boot' argument used.

treat

a character string indicating the name of the 'treat' variable used.

mediator

a character string indicating the name of the 'mediator' variable used.

INT

a logical value indicating whether the model specification allows the effects to differ between the treatment and control conditions.

conf.level

the confidence level used.

model.y

the outcome model used.

model.m

the mediator model used.

control.value

value of the treatment variable used as the control condition.

treat.value

value of the treatment variable used as the treatment condition.

nobs

number of observations in the model frame for 'model.m' and 'model.y'. May differ from the numbers in the original models input to 'mediate' if 'dropobs' was 'TRUE'.
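
The relationship between the '*.sims' and '*.ci' components can be sketched in base R. The example assumes that the intervals are plain percentiles of the simulation draws, as the description of 'conf.level' suggests. The vector 'sims_draws' is a hypothetical stand-in for a component such as 'd0.sims' and is not produced by 'mediate' here.

```r
# Illustrative sketch only: a two-sided percentile interval computed from
# a vector of simulation draws.
set.seed(1)
sims_draws <- rnorm(1000, mean = -0.015, sd = 0.01)  # hypothetical draws

conf.level <- 0.95
alpha <- 1 - conf.level
probs <- c(alpha / 2, 1 - alpha / 2)   # lower and upper percentile levels

ci <- quantile(sims_draws, probs = probs)
print(ci)
```

With 'long=TRUE', the stored draws allow intervals at other confidence levels to be recomputed in this way without re-running the simulations.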

Details

This is the workhorse function for estimating causal mediation effects for a variety of data types. The average causal mediation effect (ACME) represents the expected difference in the potential outcome when the mediator takes the value that would be realized under the treatment condition as opposed to the control condition, while the treatment status itself is held constant. That is,

$$\delta(t) \ = \ E\{Y(t, M(t_1)) - Y(t, M(t_0))\},$$

where $t, t_1, t_0$ are particular values of the treatment $T$ such that $t_1 \neq t_0$, $M(t)$ is the potential mediator, and $Y(t,m)$ is the potential outcome variable. The average direct effect (ADE) is defined similarly as

$$\zeta(t) \ = \ E\{Y(t_1, M(t)) - Y(t_0, M(t))\},$$

which represents the expected difference in the potential outcome when the treatment is changed but the mediator is held constant at the value that would be realized if the treatment equals $t$. The two quantities on average add up to the total effect of the treatment on the outcome, $\tau$. See the references for more details.

When both the mediator model ('model.m') and outcome model ('model.y') are linear, the results will be identical to the usual LSEM method by Baron and Kenny (1986). The function can, however, accommodate other data types including binary, ordered and count outcomes and mediators as well as censored outcomes. Variables can also be modeled nonparametrically, semiparametrically, or using quantile regressions.

The prior weights in the mediator and outcome models are taken as sampling weights, and the estimated effects will be weighted averages when non-NULL weights are used in fitting 'model.m' and 'model.y'. This is useful, for example, when the data do not come from a simple random sample.
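
The linear special case can be illustrated with base R alone. The sketch below computes the LSEM (product-of-coefficients) quantities that the ACME and ADE correspond to when both models are linear with no treatment-mediator interaction. It does not call 'mediate'; the simulated data and all variable names are hypothetical.

```r
# Illustrative sketch only: LSEM quantities for linear mediator and outcome
# models with a binary treatment and no treatment-mediator interaction.
set.seed(42)
n <- 500
treat <- rbinom(n, 1, 0.5)
x     <- rnorm(n)                                   # a pre-treatment covariate
med   <- 0.5 * treat + 0.3 * x + rnorm(n)
y     <- 0.2 * treat + 0.7 * med + 0.1 * x + rnorm(n)
dat   <- data.frame(treat, x, med, y)

fit.m     <- lm(med ~ treat + x, data = dat)        # mediator model
fit.y     <- lm(y ~ treat + med + x, data = dat)    # outcome model
fit.total <- lm(y ~ treat + x, data = dat)          # total-effect regression

acme  <- coef(fit.m)[["treat"]] * coef(fit.y)[["med"]]  # indirect effect
ade   <- coef(fit.y)[["treat"]]                         # direct effect
total <- coef(fit.total)[["treat"]]                     # total effect

print(c(ACME = acme, ADE = ade, total = total))
print(all.equal(acme + ade, total))  # OLS decomposition of the total effect
```

With ordinary least squares and the same covariates in all three regressions, the indirect and direct effects sum to the total-effect coefficient up to floating-point error, which mirrors the statement that the ACME and ADE add up to $\tau$.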

As of version 3.0, the mediator model can be of class 'lm', 'glm', 'polr', 'gam', or 'rq', corresponding respectively to linear regression models, generalized linear models, ordered response models, generalized additive models, and quantile regression models. For binary response models, the 'mediator' must be a numeric variable with values 0 or 1 as opposed to a factor. Quasi-likelihood-based inferences are not allowed for the mediator model because the functional form must be exactly specified for the estimation algorithm to work. The 'binomial' family can only be used for binary response mediators and cannot be used for multiple-trial responses. This is due to conflicts between how the latter type of model is implemented in glm and how 'mediate' is currently written.

For the outcome model, the censored regression model fitted via package VGAM (of class 'vglm' with 'family@vfamily' equal to "tobit") can be used in addition to the models listed above for the mediator. The 'mediate' function is not compatible with censored regression models fitted via other packages. When quantile regression is used for the outcome model ('rq'), the estimated quantities are quantile causal mediation effects, quantile direct effects, etc., instead of the average effects.

The quasi-Bayesian approximation (King et al. 2000) cannot be used if 'model.m' is of class 'rq' or 'gam', or if 'model.y' is of class 'gam' or 'polr'. In these cases, either an error is returned or the nonparametric bootstrap is forced. Users should note that the nonparametric bootstrap often requires significant computing time, especially when 'sims' is set to a large value.

The 'control' argument must be provided when 'gam' is used for the outcome model and the user wants to allow the ACME and ADE to vary as functions of $t$ (i.e., to relax the "no interaction" assumption). Note that the outcome model must be fitted via package mgcv with an appropriate formula using 's' constructs (see Imai et al. 2009 in the references). For other model types, the interaction can be allowed by including an interaction term between $T$ and $M$ in the linear predictor of the outcome model. As of version 3.0, the 'INT' argument is deprecated and the existence of the interaction term is automatically detected (except for 'gam' outcome models).

When the treatment variable is continuous, the user must specify the values of $t_1$ and $t_0$ using the 'treat.value' and 'control.value' arguments, respectively. The value of $t$ in the above expressions is set to $t_0$ for 'd0', 'z0', etc. and to $t_1$ for 'd1', 'z1', etc.
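
The idea behind the quasi-Bayesian approximation can be sketched in base R for the linear, no-interaction case with a binary treatment. The sketch assumes that model parameters are drawn from a multivariate normal distribution centred at the estimated coefficients with the estimated covariance matrix; it is not the algorithm as implemented in 'mediate'. The helper 'draw_coefs' and the simulated data are hypothetical.

```r
# Illustrative sketch only: simulation-based uncertainty for the indirect
# effect, drawing coefficients from N(coef, vcov) for each fitted model.
set.seed(7)
n <- 400
treat <- rbinom(n, 1, 0.5)
med   <- 0.5 * treat + rnorm(n)
y     <- 0.2 * treat + 0.7 * med + rnorm(n)
fit.m <- lm(med ~ treat)
fit.y <- lm(y ~ treat + med)

# Hypothetical helper: multivariate normal draws via a Cholesky factor.
draw_coefs <- function(fit, sims) {
  beta <- coef(fit)
  R <- chol(vcov(fit))                      # t(R) %*% R equals vcov(fit)
  z <- matrix(rnorm(sims * length(beta)), nrow = sims)
  draws <- sweep(z %*% R, 2, beta, "+")     # one row per simulated draw
  colnames(draws) <- names(beta)
  draws
}

sims <- 1000
m.draws <- draw_coefs(fit.m, sims)
y.draws <- draw_coefs(fit.y, sims)

# With linear models and no interaction, each draw of the effects is a
# simple function of the drawn coefficients.
acme.sims <- m.draws[, "treat"] * y.draws[, "med"]
ade.sims  <- y.draws[, "treat"]
tau.sims  <- acme.sims + ade.sims

acme.ci <- quantile(acme.sims, c(0.025, 0.975))
print(acme.ci)
```

No model is re-fitted inside the simulation, which is why this approach is typically much faster than the nonparametric bootstrap. It also relies on an estimated covariance matrix for the model parameters.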

References

Imai, K., Keele, L. and Tingley, D. (2010) A General Approach to Causal Mediation Analysis, Psychological Methods, Vol. 15, No. 4 (December), pp. 309-334.

Imai, K., Keele, L. and Yamamoto, T. (2010) Identification, Inference, and Sensitivity Analysis for Causal Mediation Effects, Statistical Science, Vol. 25, No. 1 (February), pp. 51-71.

Imai, K., Keele, L., Tingley, D. and Yamamoto, T. (2009) "Causal Mediation Analysis Using R" in Advances in Social Science Research Using R, ed. H. D. Vinod, New York: Springer.

Examples

# Examples with JOBS II Field Experiment

# **For illustration purposes a small number of simulations are used**

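library(mediation)  # provides mediate() and the jobs data
library(MASS)       # provides polr(), used in Example 2

data(jobs)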

####################################################
# Example 1: Linear Outcome and Mediator Models
####################################################
b <- lm(job_seek ~ treat + econ_hard + sex + age, data=jobs)
c <- lm(depress2 ~ treat + job_seek + econ_hard + sex + age, data=jobs)

# Estimation via quasi-Bayesian approximation
contcont <- mediate(b, c, sims=50, treat="treat", mediator="job_seek")
summary(contcont)
plot(contcont)

# Estimation via nonparametric bootstrap
contcont.boot <- mediate(b, c, boot=TRUE, sims=50, treat="treat", mediator="job_seek")
summary(contcont.boot)

# Allowing treatment-mediator interaction
d <- lm(depress2 ~ treat + job_seek + treat:job_seek + econ_hard + sex + age, data=jobs)
contcont.int <- mediate(b, d, sims=50, treat="treat", mediator="job_seek")
summary(contcont.int)

# Continuous treatment
jobs$treat_cont <- jobs$treat + rnorm(nrow(jobs))  # (hypothetical) continuous treatment
b.contT <- lm(job_seek ~ treat_cont + econ_hard + sex + age, data=jobs)
c.contT <- lm(depress2 ~ treat_cont + job_seek + econ_hard + sex + age, data=jobs)
contcont.cont <- mediate(b.contT, c.contT, sims=50, 
                    treat="treat_cont", mediator="job_seek",
                    treat.value = 4, control.value = -2)
summary(contcont.cont)

######################################################
# Example 2: Binary Outcome and Ordered Mediator
######################################################
b.ord <- polr(job_disc ~ treat + econ_hard + sex + age, data=jobs,
            method="probit", Hess=TRUE)
d.bin <- glm(work1 ~ treat * job_disc + econ_hard + sex + age, data=jobs,
            family=binomial(link="probit"))
ordbin <- mediate(b.ord, d.bin, sims=50, treat="treat", mediator="job_disc")
summary(ordbin)

# Using heteroskedasticity-consistent standard errors
require(sandwich)
ordbin.rb <- mediate(b.ord, d.bin, sims=50, treat="treat", mediator="job_disc",
            robustSE=TRUE)
summary(ordbin.rb)

######################################################
# Example 3: Quantile Causal Mediation Effect
######################################################
require(quantreg)
c.quan <- rq(depress2 ~ treat + job_seek + econ_hard + sex + age, data=jobs,
            tau = 0.5)  # median
contquan <- mediate(b, c.quan, sims=50, treat="treat", mediator="job_seek")
summary(contquan)

######################################################
# Example 4: GAM Outcome
######################################################
require(mgcv)
c.gam <- gam(depress2 ~ treat + s(job_seek, bs="cr") + 
            econ_hard + sex + age, data=jobs)
contgam <- mediate(b, c.gam, sims=10, treat="treat", 
                mediator="job_seek", boot=TRUE)
summary(contgam)

# With interaction
d.gam <- gam(depress2 ~ treat + s(job_seek, by = treat) + 
    s(job_seek, by = control) + econ_hard + sex + age, data=jobs)
contgam.int <- mediate(b, d.gam, sims=10, treat="treat", mediator="job_seek",
    control = "control", boot=TRUE)
summary(contgam.int)
