hdmed (version 1.0.0)

mediate_hdma: High-Dimensional Mediation Analysis

Description

mediate_hdma fits a high-dimensional mediation model with the de-biased LASSO approach as proposed by Gao et al. (2019), estimating the mediation contributions of potential mediators.

Usage

mediate_hdma(
  A,
  M,
  Y,
  C1 = NULL,
  C2 = NULL,
  binary_y = FALSE,
  n_include = NULL,
  ...
)

Value

A list containing:

  • contributions: a data frame containing the estimates and p-values of the mediation contributions

  • effects: a data frame containing the estimated direct, global mediation, and total effects

Arguments

A

length n numeric vector containing exposure variable.

M

n x p numeric matrix of high-dimensional mediators.

Y

length n numeric vector containing continuous or binary outcome variable.

C1

optional numeric matrix of covariates to include in the outcome model.

C2

optional numeric matrix of covariates to include in the mediator model.

binary_y

logical flag for whether Y should be interpreted as a binary variable with 1/0 coding rather than as continuous. Default is FALSE.

n_include

integer specifying the number of top markers from sure independence screening to be included. Default is NULL, in which case n_include will be either ceiling(n/log(n)) if binary_y = FALSE, or ceiling(n/(2*log(n))) if binary_y = TRUE. If n_include >= p, all mediators are included with no screening. Note that if binary_y = FALSE, screening is based on the single-mediator outcome model p-values, and if binary_y = TRUE, screening is based on the mediator model p-values.

...

other arguments passed to hdi::hdi().
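
The default for n_include described above can be illustrated with a small base R sketch. The helper name default_n_include is hypothetical and not part of hdmed; it only mirrors the documented formula and the rule that no screening happens once n_include reaches p.

```r
# Hypothetical helper (not exported by hdmed): mirrors the documented default.
default_n_include <- function(n, p, binary_y = FALSE) {
  k <- if (binary_y) ceiling(n / (2 * log(n))) else ceiling(n / log(n))
  # If the default is at least p, every mediator is kept (no screening).
  min(k, p)
}

default_n_include(n = 100, p = 2000)                   # 22
default_n_include(n = 100, p = 2000, binary_y = TRUE)  # 11
default_n_include(n = 100, p = 10)                     # 10, all mediators kept
```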

Details

The first step in HDMA is to perform sure independence screening (SIS) to choose the n_include mediators that are most associated with the outcome (when Y is continuous) or the exposure (when Y is binary), based on p-values from linear regression. The second step is to fit the outcome model for the remaining mediators using the de-sparsified (a.k.a. de-biased) LASSO, which has asymptotic properties that allow the hdi package to compute p-values. HDMA then fits the mediator models using linear regression among those mediators that have both survived SIS (in step 1) and been identified by the LASSO (in step 2), obtaining p-values for the mediation contributions by taking the maximum of the \(\alpha_a\) and \(\beta_m\) p-values. The global indirect effect is estimated by summing the mediation contributions, and the direct effect is estimated by subtracting the global indirect effect from an estimate of the total effect. See References for more detail.
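
As a rough illustration of the steps above, the following base R sketch runs the screening step on simulated data and then combines path estimates for a few mediators. It is not the hdmed implementation: the simulated data, the single-mediator model form Y ~ M_j + A, the product-of-coefficients definition of a contribution, and the toy path estimates are all assumptions made for illustration, and the de-biased LASSO step (done through hdi in the package) is skipped entirely.

```r
set.seed(1)
n <- 100
p <- 50
A <- rnorm(n)
M <- matrix(rnorm(n * p), n, p)
M[, 1] <- M[, 1] + A                  # mediator 1 depends on the exposure
Y <- 0.5 * A + M[, 1] + rnorm(n)      # and also affects the outcome

# Step 1 (continuous Y): p-value of each mediator in a single-mediator
# outcome model (assumed here to be Y ~ M_j + A).
sis_p <- apply(M, 2, function(m) {
  summary(lm(Y ~ m + A))$coefficients["m", "Pr(>|t|)"]
})
n_include <- min(ceiling(n / log(n)), p)
keep <- order(sis_p)[seq_len(n_include)]   # indices of retained mediators

# Combining step, using made-up path estimates for three mediators.
alpha   <- c(0.80, 0.10, -0.30)   # exposure -> mediator coefficients
beta    <- c(0.50, 0.40,  0.20)   # mediator -> outcome coefficients
alpha_p <- c(0.001, 0.200, 0.030)
beta_p  <- c(0.010, 0.010, 0.040)
total_effect <- 1.00

contribution    <- alpha * beta             # assumed product-of-coefficients form
contribution_p  <- pmax(alpha_p, beta_p)    # max-p rule
global_indirect <- sum(contribution)
direct_effect   <- total_effect - global_indirect
```

With the max-p rule, a contribution is only significant when both of its paths are: the second toy mediator has a strong \(\beta_m\) p-value (0.01) but is assigned 0.20 because its \(\alpha_a\) path is weak.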

References

Gao, Y. et al. Testing Mediation Effects in High-Dimensional Epigenetic Studies. Front. Genet. 10, 1195 (2019).

Fan, J. & Lv, J. Sure independence screening for ultrahigh dimensional feature space. J. R. Stat. Soc. Ser. B 70, 849-911 (2008).

Examples

library(hdmed)

# med_dat is the example dataset used on this page
A <- med_dat$A
M <- med_dat$M
Y <- med_dat$Y

# Fit hdma with continuous outcomes
out <- mediate_hdma(A, M, Y)
head(out$contributions)
out$effects