mediate_hima
fits a high-dimensional mediation model with
the minimax concave penalty as proposed by Zhang et al. (2016),
estimating the mediation contributions of potential mediators.
mediate_hima(
A,
M,
Y,
C1 = NULL,
C2 = NULL,
binary_y = FALSE,
n_include = NULL,
...
)
A list containing:
contributions: a data frame containing the estimates and p-values of the mediation contributions
effects: a data frame containing the estimated direct, global mediation, and total effects
A: length n numeric vector containing the exposure variable.
M: n x p numeric matrix of high-dimensional mediators.
Y: length n numeric vector containing the continuous or binary outcome variable.
C1: optional numeric matrix of covariates to include in the outcome model.
C2: optional numeric matrix of covariates to include in the mediator model.
binary_y: logical flag for whether Y should be interpreted as a binary variable with 1/0 coding rather than as continuous. Default is FALSE.
n_include: integer specifying the number of top markers from sure independence screening to be included. Default is NULL, in which case n_include will be either ceiling(n/log(n)) if binary_y = FALSE, or ceiling(n/(2*log(n))) if binary_y = TRUE. If n_include >= p, all mediators are included with no screening. Note that if binary_y = FALSE, screening is performed based on the single-mediator outcome model p-values, and if binary_y = TRUE, screening is based on the mediator model p-values.
...: other arguments passed to hdi.
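As a quick check of the default screening size described for n_include, the two formulas can be evaluated directly in base R. The sample size n = 100 is an arbitrary value chosen for illustration, and default_n_include is a hypothetical helper written for this sketch, not a function provided by the package.

```r
# Hypothetical helper mirroring the documented default for n_include;
# it is not part of the package.
default_n_include <- function(n, binary_y = FALSE) {
  if (binary_y) ceiling(n / (2 * log(n))) else ceiling(n / log(n))
}

n <- 100  # illustrative sample size (assumption)
default_n_include(n)                   # continuous outcome: 22
default_n_include(n, binary_y = TRUE)  # binary outcome: 11
```

With a binary outcome the default keeps roughly half as many mediators after screening as with a continuous outcome.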
The first step in HIMA is to perform sure independence
screening (SIS) to choose the n_include
mediators that are most
associated with the outcome (when Y is continuous) or the exposure
(when Y is binary), based on p-values from linear regression. The second step
is to fit the outcome model for the remaining mediators with the minimax
concave penalty. HIMA then fits the mediator models using linear regression
among those mediators that have both survived SIS (in step 1) and been
selected by the MCP (in step 2), which enables estimation of the mediation
contributions. The global indirect effect is estimated by summing these
contributions, and the direct effect is estimated by subtracting the global
indirect effect from an estimate of the total effect. We compute p-values for
the mediation contributions by taking the maximum of the \(\alpha_a\) and \(\beta_m\)
p-values, where the beta p-values are obtained via a second, unpenalized generalized
linear model containing only the mediators selected by the MCP. We include this
p-value computation so that our function replicates the behavior of the
HIMA
function from the HIMA
package, the function on which ours is based, but we caution that the beta
p-values may be over-optimistic due to double-dipping, since the mediators tested in
the unpenalized model are only those chosen by the penalized model. Note also
that the HIMA authors apply Bonferroni correction to the final, maxed p-values
to account for multiple testing, which we choose to leave up to the user. For
more information, see the "HIMA" R package along with the provided reference.
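The arithmetic in the final steps can be sketched in base R. The coefficients, p-values, and total effect below are made-up numbers for three hypothetical selected mediators, not output from mediate_hima, and the contributions are assumed to take the usual product-of-coefficients form. The sketch only illustrates how the global indirect and direct effects follow from the contributions, how the maxed p-values arise, and how a user might apply a Bonferroni correction afterwards.

```r
# Illustrative values only (assumptions), for three selected mediators
alpha   <- c(0.50, -0.20, 0.10)    # exposure -> mediator coefficients
beta    <- c(0.30,  0.40, -0.25)   # mediator -> outcome coefficients
p_alpha <- c(0.001, 0.020, 0.300)  # p-values for the alpha coefficients
p_beta  <- c(0.004, 0.010, 0.050)  # p-values for the beta coefficients
total   <- 0.40                    # estimate of the total effect

# Mediation contributions, assuming the product-of-coefficients form
contributions <- alpha * beta

# Global indirect effect is the sum of the contributions;
# the direct effect is the total effect minus the global indirect effect
indirect <- sum(contributions)
direct   <- total - indirect

# P-value for each contribution: the larger of the alpha and beta p-values
p_max <- pmax(p_alpha, p_beta)

# Optional Bonferroni correction, applied here over the three mediators shown
p_bonf <- p.adjust(p_max, method = "bonferroni")

data.frame(contributions, p_max, p_bonf)
c(indirect = indirect, direct = direct, total = total)
```

The number of tests to correct for is a choice for the user; here it is simply the number of mediators in the toy vectors.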
Zhang, H. et al. Estimating and testing high-dimensional mediation effects in epigenetic studies. Bioinformatics 32, 3150-3154 (2016).
A <- med_dat$A
M <- med_dat$M
Y <- med_dat$Y
# Fit hima with continuous outcome
out <- mediate_hima(A, M, Y)
head(out$contributions)
out$effects
# Fit hima with binary outcome
Y1 <- as.numeric(Y > mean(Y))
out1 <- mediate_hima(A, M, Y1, binary_y = TRUE)
head(out1$contributions)
out1$effects