mediate_hdma fits a high-dimensional mediation model with
the de-biased LASSO approach proposed by Gao et al. (2019),
estimating the mediation contributions of potential mediators.
mediate_hdma(
A,
M,
Y,
C1 = NULL,
C2 = NULL,
binary_y = FALSE,
n_include = NULL,
...
)
A list containing:
contributions: a data frame containing the estimates and p-values of the mediation contributions
effects: a data frame containing the estimated direct, global mediation, and total effects
A: length n numeric vector containing the exposure variable.
M: n x p numeric matrix of high-dimensional mediators.
Y: length n numeric vector containing the continuous or binary outcome variable.
C1: optional numeric matrix of covariates to include in the outcome model.
C2: optional numeric matrix of covariates to include in the mediator model.
binary_y: logical flag for whether Y should be interpreted as a binary variable with 1/0 coding rather than as continuous. Default is FALSE.
n_include: integer specifying the number of top markers from sure independence screening to be included. Default is NULL, in which case n_include will be either ceiling(n/log(n)) if binary_y = FALSE, or ceiling(n/(2*log(n))) if binary_y = TRUE. If n_include >= p, all mediators are included with no screening. Note that if binary_y = FALSE, screening is performed based on the single-mediator outcome model p-values, and if binary_y = TRUE, screening is based on the mediator model p-values.
...: other arguments passed to hdi::hdi().
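The default screening size can be worked out by hand from the sample size. The snippet below is a minimal sketch of that rule as documented for n_include; default_n_include is a hypothetical helper written for illustration and is not a function exported by the package.

```r
# Hypothetical helper (not part of the package): mirrors the documented
# default for n_include and the "no screening when n_include >= p" rule.
default_n_include <- function(n, p, binary_y = FALSE) {
  k <- if (binary_y) ceiling(n / (2 * log(n))) else ceiling(n / log(n))
  min(k, p)  # never keep more mediators than are available
}

# n = 100 samples, p = 2000 candidate mediators
default_n_include(100, 2000)                   # continuous outcome
default_n_include(100, 2000, binary_y = TRUE)  # binary outcome
```

With n = 100, the continuous-outcome default keeps 22 mediators and the binary-outcome default keeps 11, so the binary case screens roughly twice as aggressively.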
The first step in HDMA is to perform sure independence
screening (SIS) to choose the n_include
mediators that are most
associated with the outcome (when Y is continuous) or the exposure
(when Y is binary), based on p-values from linear regression. The second step
is to fit the outcome model for the remaining mediators using de-sparsified
(a.k.a. de-biased) LASSO, which has asymptotic properties allowing for
computation of p-values by the hdi
package. HDMA then fits the
mediator models using linear regression among those mediators that have both
survived SIS (in step 1) and been identified by the LASSO (in step 2), obtaining
p-values for the mediation contributions by taking the maximum of the \(\alpha_a\)
and \(\beta_m\) p-values. The global indirect effect is estimated by summing the
mediation contributions, and the direct effect is estimated by subtracting
the global indirect effect from an estimate of the total effect. See References for
more detail.
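The flow of the steps above can be illustrated in base R on simulated data. This is a rough sketch under stated assumptions, not the package's implementation: ordinary least squares stands in for the de-biased LASSO (workable here only because few mediators are retained), no LASSO selection is applied before the mediator models, the single-mediator screening model is assumed to adjust for the exposure, and every object name is hypothetical.

```r
set.seed(1)
n <- 200; p <- 50
A <- rnorm(n)
# Simulated data: only the first three mediators lie on the causal path
alpha_true <- c(0.8, 0.6, 0.5, rep(0, p - 3))
beta_true  <- c(0.7, 0.5, 0.6, rep(0, p - 3))
M <- outer(A, alpha_true) + matrix(rnorm(n * p), n, p)
colnames(M) <- paste0("m", seq_len(p))
Y <- as.numeric(0.5 * A + M %*% beta_true + rnorm(n))

# Step 1 (SIS, continuous Y): p-value of each mediator in a
# single-mediator outcome model (assumed form: Y ~ M_j + A)
sis_p <- apply(M, 2, function(m) summary(lm(Y ~ m + A))$coefficients["m", 4])
n_include <- 10  # illustrative choice, kept small so OLS is usable in step 2
keep <- names(sort(sis_p))[seq_len(n_include)]

# Step 2 (stand-in): joint outcome model on the screened mediators.
# ASSUMPTION: plain OLS replaces the de-biased LASSO used by HDMA.
dat <- data.frame(Y = Y, A = A, M[, keep])
fit_y <- lm(Y ~ ., data = dat)
coef_y <- summary(fit_y)$coefficients
beta <- coef_y[keep, "Estimate"]
beta_p <- coef_y[keep, "Pr(>|t|)"]

# Step 3: one mediator model per retained mediator, M_j ~ A
alpha_fit <- t(sapply(keep, function(j) {
  summary(lm(M[, j] ~ A))$coefficients["A", c("Estimate", "Pr(>|t|)")]
}))

# Mediation contributions: alpha * beta, with the max of the two p-values
contrib <- data.frame(
  mediator   = keep,
  alpha      = alpha_fit[, "Estimate"],
  beta       = beta,
  alpha_beta = alpha_fit[, "Estimate"] * beta,
  pval       = pmax(alpha_fit[, "Pr(>|t|)"], beta_p)
)

# Global indirect effect = sum of contributions; direct = total - indirect
gie <- sum(contrib$alpha_beta)
total <- unname(coef(lm(Y ~ A))["A"])
de <- total - gie
```

Taking the maximum of the two p-values means a mediator is flagged only when both its exposure-to-mediator and mediator-to-outcome paths show evidence of an effect. In this OLS stand-in, the subtraction-based direct effect coincides with the coefficient of A in the joint outcome model; that identity is a property of least squares and need not hold for the de-biased LASSO fit.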
Gao, Y. et al. Testing Mediation Effects in High-Dimensional Epigenetic Studies. Front. Genet. 10, 1195 (2019).
Fan, J. & Lv, J. Sure independence screening for ultrahigh dimensional feature space. J. R. Stat. Soc. Ser. B 70, 849-911 (2008).
A <- med_dat$A
M <- med_dat$M
Y <- med_dat$Y
# Fit hdma with continuous outcomes
out <- mediate_hdma(A, M, Y)
head(out$contributions)
out$effects