mmi: Multiple-Mediator-Imputation Estimation Method

Description

'mmi' is used to estimate the initial disparity, disparity reduction, and disparity remaining for causal decomposition analysis, using the multiple-mediator-imputation estimation method proposed by Park et al. (2020). This estimator was originally developed to handle multiple mediators simultaneously; however, it can also be applied to a single mediator.

Usage

mmi(fit.r = NULL, fit.x, fit.y, group, covariates, sims = 100, conf.level = .95,
    conditional = TRUE, cluster = NULL, long = TRUE, mc.cores = 1L, seed = NULL)

Value

result: a matrix containing the point estimates of the initial disparity, disparity remaining, and disparity reduction, and the percentile bootstrap confidence intervals for each estimate.
all.result: a matrix containing the point estimates of the initial disparity, disparity remaining, and disparity reduction for all bootstrap samples. Returned if 'long' is 'TRUE'.

Arguments

fit.r: a fitted model object for social group indicator (treatment). Can be of class 'CBPS' or 'SumStat'. Default is 'NULL'. Only necessary if 'conditional' is 'FALSE'.
fit.x: a fitted model object for intermediate confounder(s). Each intermediate model can be of class 'lm', 'glm', 'multinom', or 'polr'. When multiple confounders are considered, can be of class 'list' containing multiple models.
fit.y: a fitted model object for outcome. Can be of class 'lm' or 'glm'.
group: a character string indicating the name of the social group indicator such as race or gender (treatment) used in the models. The social group indicator can be categorical with two or more categories (two- or multi-valued factor).
covariates: a vector containing the name of the covariate variable(s) used in the models. Each covariate can be categorical with two or more categories (two- or multi-valued factor) or continuous (numeric).
sims: number of Monte Carlo draws for nonparametric bootstrap.
conf.level: level of the returned two-sided confidence intervals, which are estimated by the nonparametric percentile bootstrap method. Default is .95, which returns the 2.5 and 97.5 percentiles of the simulated quantities.
conditional: a logical value. If 'TRUE', the function returns estimates conditional on the baseline covariate values, so all covariates in the mediator and outcome models must be centered before fitting. Default is 'TRUE'. If 'FALSE', 'fit.r' needs to be specified.
cluster: a vector of cluster indicators for the bootstrap. If provided, the cluster bootstrap is used. Default is 'NULL'.
long: a logical value. If 'TRUE', the output will contain the entire sets of estimates for all bootstrap samples. Default is 'TRUE'.
mc.cores: number of cores to use. Must be exactly 1 on Windows.
seed: seed number for the reproducibility of results. Default is 'NULL'.

Author

Suyeon Kang, University of Central Florida, suyeon.kang@ucf.edu; Soojin Park, University of California, Riverside, soojinp@ucr.edu.

Details

This function returns the point estimates of the initial disparity, disparity reduction, and disparity remaining for a categorical social group indicator and a variety of types of outcome and mediator(s) in causal decomposition analysis. It also returns nonparametric percentile bootstrap confidence intervals for each estimate.
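To illustrate what a nonparametric percentile bootstrap interval is, here is a minimal base-R sketch. It is not the internal code of 'mmi': the toy data frame 'toy', the helper 'disparity', and the simple difference in group means are all hypothetical stand-ins chosen for illustration. The variable names 'sims' and 'conf.level' only mirror the arguments described above.

```r
# Percentile bootstrap for a difference in group means (illustrative only;
# this is an assumption-laden sketch, not the implementation used by mmi()).
set.seed(111)

# Hypothetical toy data: outcome y for a reference group (r = 0) and a
# comparison group (r = 1).
n <- 200
toy <- data.frame(r = rep(c(0, 1), each = n / 2))
toy$y <- 1 + 0.5 * toy$r + rnorm(n)

# Statistic of interest: unadjusted disparity in mean outcomes.
disparity <- function(d) mean(d$y[d$r == 1]) - mean(d$y[d$r == 0])

sims <- 500
conf.level <- 0.95

# Resample rows with replacement and recompute the statistic each time.
boot.est <- replicate(sims, {
  idx <- sample(nrow(toy), replace = TRUE)
  disparity(toy[idx, ])
})

# Percentile interval: the alpha/2 and 1 - alpha/2 quantiles of the draws.
alpha <- 1 - conf.level
ci <- quantile(boot.est, probs = c(alpha / 2, 1 - alpha / 2))

c(estimate = disparity(toy), lower = ci[[1]], upper = ci[[2]])
```

With 'conf.level = .95' the interval endpoints are the 2.5 and 97.5 percentiles of the bootstrap draws, which matches the description of the 'conf.level' argument.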

The initial disparity between two groups is defined as \(\tau(r,0) \equiv E[Y|R=r,c]-E[Y|R=0,c]\), for \(c \in \mathcal{C}\) and \(r \in \mathcal{R}\), where \(R=0\) denotes the reference group and \(R=r\) is the comparison group.

The disparity reduction, conditional on baseline covariates, measures how much the initial disparity would shrink if we simultaneously equalized the distributions of the mediators \(W\) and \(M\) across groups within each level of \(C\). Formally, \(\delta(1)\equiv E[Y|R=1,c]-E[Y(G_{w|c}(0)G_{m|c}(0))|R=1,c]\), where \(G_{m|c}(0)\) and \(G_{w|c}(0)\) are random draws from the reference group’s distributions of \(M\) and \(W\) given \(C\), respectively.

The remaining disparity, also conditional on \(C\), is the difference between the comparison group’s counterfactual outcome after equalizing the distributions of \(W\) and \(M\) and the reference group’s observed outcome. Formally, \(\zeta(0) \equiv E[Y(G_{w|c}(0)G_{m|c}(0))|R=1, c]-E[Y|R=0, c]\).
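
As a check on these definitions, taking the comparison group to be \(r=1\), the disparity reduction and the disparity remaining sum to the initial disparity, because the counterfactual term cancels. This is a worked restatement of the definitions above, not an additional result from the references:

\[
\begin{aligned}
\delta(1)+\zeta(0)
&= \big(E[Y|R=1,c]-E[Y(G_{w|c}(0)G_{m|c}(0))|R=1,c]\big) \\
&\quad + \big(E[Y(G_{w|c}(0)G_{m|c}(0))|R=1,c]-E[Y|R=0,c]\big) \\
&= E[Y|R=1,c]-E[Y|R=0,c] \\
&= \tau(1,0).
\end{aligned}
\]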

The disparity reduction and remaining can be estimated using the multiple-mediator-imputation method suggested by Park et al. (2020). See the references for more details.

To make inference conditional on baseline covariates, set 'conditional = TRUE' and center the covariates before fitting the models.

As of version 0.1.0, the intermediate confounder model ('fit.x') can be of class 'lm', 'glm', 'multinom', or 'polr', corresponding respectively to linear regression models, generalized linear models, multinomial log-linear models, and ordered response models. The outcome model ('fit.y') can be of class 'lm' or 'glm'. The social group model ('fit.r') can be of class 'CBPS' or 'SumStat', both of which use propensity score weighting; it is only necessary when 'conditional = FALSE'.

References

Park, S., Qin, X., & Lee, C. (2022). "Estimation and sensitivity analysis for causal decomposition in health disparity research". Sociological Methods & Research, 53(2), 571-602.

Park, S., Kang, S., & Lee, C. (2023). "Choosing an Optimal Method for Causal Decomposition Analysis with Continuous Outcomes: A Review and Simulation Study". Sociological Methodology, 54(1), 92-117.

Examples

data(sdata)

#------------------------------------------------------------------------------#
# Example 1-a: Continuous Outcome
#------------------------------------------------------------------------------#
fit.m1 <- lm(M.num ~ R + C.num + C.bin, data = sdata)
fit.m2 <- glm(M.bin ~ R + C.num + C.bin, data = sdata,
          family = binomial(link = "logit"))
require(MASS)
fit.m3 <- polr(M.cat ~ R + C.num + C.bin, data = sdata)
fit.x1 <- lm(X ~ R + C.num + C.bin, data = sdata)
require(nnet)
fit.m4 <- multinom(M.cat ~ R + C.num + C.bin, data = sdata)
fit.y1 <- lm(Y.num ~ R + M.num + M.bin + M.cat + X + C.num + C.bin,
          data = sdata)

require(PSweight)
fit.r1 <- SumStat(R ~ C.num + C.bin, data = sdata, weight = "IPW")
require(CBPS)
fit.r2 <- CBPS(R ~ C.num + C.bin, data = sdata, method = "exact",
          standardize = TRUE)

res.1a <- mmi(fit.r = fit.r1, fit.x = fit.x1,
          fit.y = fit.y1, sims = 40, conditional = FALSE,
          covariates = c("C.num", "C.bin"), group = "R", seed = 111)
res.1a

#------------------------------------------------------------------------------#
# Example 1-b: Binary Outcome
#------------------------------------------------------------------------------#
fit.y2 <- glm(Y.bin ~ R + M.num + M.bin + M.cat + X + C.num + C.bin,
          data = sdata, family = binomial(link = "logit"))

res.1b <- mmi(fit.r = fit.r1, fit.x = fit.x1,
          fit.y = fit.y2, sims = 40, conditional = FALSE,
          covariates = c("C.num", "C.bin"), group = "R", seed = 111)
res.1b

#------------------------------------------------------------------------------#
# Example 2-a: Continuous Outcome, Conditional on Covariates
#------------------------------------------------------------------------------#
# For conditional = TRUE, need to create data with centered covariates
# copy data
sdata.c <- sdata
# center quantitative covariate(s)
sdata.c$C.num <- scale(sdata.c$C.num, center = TRUE, scale = FALSE)
# center binary (or categorical) covariate(s)
# only necessary if the desired baseline level is NOT the default baseline level.
sdata.c$C.bin <- relevel(sdata.c$C.bin, ref = "1")

# fit mediator and outcome models
fit.m1 <- lm(M.num ~ R + C.num + C.bin, data = sdata.c)
fit.m2 <- glm(M.bin ~ R + C.num + C.bin, data = sdata.c,
          family = binomial(link = "logit"))
fit.m3 <- polr(M.cat ~ R + C.num + C.bin, data = sdata.c)
fit.x2 <- lm(X ~ R + C.num + C.bin, data = sdata.c)
fit.y1 <- lm(Y.num ~ R + M.num + M.bin + M.cat + X + C.num + C.bin,
          data = sdata.c)

res.2a <- mmi(fit.x = fit.x2,
          fit.y = fit.y1, sims = 40, conditional = TRUE,
          covariates = c("C.num", "C.bin"), group = "R", seed = 111)
res.2a

#------------------------------------------------------------------------------#
# Example 2-b: Binary Outcome, Conditional on Covariates
#------------------------------------------------------------------------------#
fit.y2 <- glm(Y.bin ~ R + M.num + M.bin + M.cat + X + C.num + C.bin,
          data = sdata.c, family = binomial(link = "logit"))

res.2b <- mmi(fit.x = fit.x2,
          fit.y = fit.y2, sims = 40, conditional = TRUE,
          covariates = c("C.num", "C.bin"), group = "R", seed = 111)
res.2b

#------------------------------------------------------------------------------#
# Example 3: Case with Multiple Intermediate Confounders
#------------------------------------------------------------------------------#
fit.r <- SumStat(R ~ C, data = idata, weight = "IPW")
fit.x1 <- lm(X1 ~ R + C, data = idata)
fit.x2 <- lm(X2 ~ R + C, data = idata)
fit.x3 <- lm(X3 ~ R + C, data = idata)
fit.y <- lm(Y ~ R + M + X1 + X2 + X3 + C, data = idata)

res.3 <- mmi(fit.r = fit.r, fit.x = list(fit.x1, fit.x2, fit.x3),
         fit.y = fit.y, sims = 40, conditional = FALSE,
         covariates = "C", group = "R", seed = 111)
res.3