mediate_spcma: Sparse Principal Component Mediation Analysis for High-Dimensional Mediators

Description

mediate_spcma applies sparse principal component mediation analysis to mediation settings in which the mediators are high-dimensional.

Usage

mediate_spcma(
  A,
  M,
  Y,
  var_per = 0.8,
  n_pc = NULL,
  sims = 1000,
  boot_ci_type = "bca",
  ci_level = 0.95,
  fused = FALSE,
  gamma = 0,
  per_jump = 0.7,
  eps = 1e-04,
  maxsteps = 2000,
  seed = 1
)

Value

A list containing:

loadings: a matrix of the sparse PC loadings.
pcs: a matrix of the PCs.
var_explained: the cumulative proportion of variance explained by the PCs.
contributions: a data frame containing the estimates, confidence intervals, and p-values of the mediation contributions.
effects: a data frame containing the estimated direct, global mediation, and total effects.

Arguments

A: length n numeric vector containing the exposure variable.
M: n x p numeric matrix of high-dimensional mediators.
Y: length n numeric vector containing continuous outcome variable.
var_per: a numeric variable with the desired proportion of variance explained. Default is 0.8.
n_pc: optional numeric variable with the desired number of PCs, in which case var_per is ignored. Default is NULL and the number of PCs is determined based on the desired proportion of variance explained.
sims: number of Monte Carlo draws for nonparametric bootstrap or quasi-Bayesian approximation (see mediation::mediate()). Default is 1000.
boot_ci_type: character string indicating the type of bootstrap confidence intervals for when boot = TRUE. If "bca", bias-corrected and accelerated (BCa) confidence intervals will be estimated. If "perc", percentile confidence intervals will be estimated (see mediation::mediate()). Default is "bca".
ci_level: the designated confidence level. Default is 0.95.
fused: logical variable for whether the fused LASSO should be used instead of the ordinary LASSO. Default is FALSE.
gamma: numeric variable >=0 indicating the ratio of the standard LASSO penalty to the fusion penalty (see genlasso::genlasso()). Ignored if fused = FALSE. Default is 0, meaning there is no standard penalty. Larger values result in more shrinkage and sparser PCs.
per_jump: numeric value used for tuning parameter selection: the quantile cut-off for the change in total variance across different lambda values in the LASSO. Default is 0.7. Larger values result in more shrinkage and sparser PCs.
eps: numeric variable indicating the multiplier for the ridge penalty in case X is rank deficient (see genlasso::genlasso()). Default is 1e-4.
maxsteps: an integer specifying the maximum number of steps for the algorithm before termination (see genlasso::genlasso()). Default is 2000.
seed: seed used for fitting single-mediator models after PCA. Default is 1.
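
The interplay between var_per and n_pc can be illustrated with ordinary PCA in base R. The sketch below is not the package's internal code: it uses simulated data and assumes the number of PCs is taken to be the smallest number whose cumulative proportion of variance reaches var_per.

```r
# Illustrative only: base R PCA on simulated data, not mediate_spcma internals.
set.seed(1)
n <- 50
p <- 10
M <- matrix(rnorm(n * p), nrow = n, ncol = p)  # simulated n x p mediator matrix

pca <- prcomp(M, center = TRUE, scale. = FALSE)

# Cumulative proportion of variance explained by the first k PCs
cum_var <- cumsum(pca$sdev^2) / sum(pca$sdev^2)

# Assumed rule: keep the smallest number of PCs whose cumulative
# proportion of variance reaches var_per
var_per <- 0.8
n_pc <- which(cum_var >= var_per)[1]

n_pc
round(cum_var[seq_len(n_pc)], 3)
```

Supplying n_pc directly skips this selection step, which is why var_per is ignored in that case.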

Details

mediate_spcma performs principal component mediation analysis, comparable to mediate_pcma, with the modification that the PC loadings are sparsified by a flexible LASSO penalty. This can make the PCs more interpretable because, unlike in ordinary PCA, each PC is a linear combination of only a subset of the mediators rather than all of them. The penalty is chosen with the fused argument: when fused = TRUE, a fused LASSO penalty is applied, which encourages consecutive mediators to have similar loadings; with the default, fused = FALSE, the standard LASSO penalty is used instead. Once the sparse PCs are computed, inference proceeds as in PCMA, and the PC mediators are evaluated with methods from the mediation package.
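
To make the interpretability point concrete, the base R sketch below contrasts an ordinary PC loading vector with a sparsified one. It is a toy illustration under stated assumptions: the data are simulated, soft_threshold is a hypothetical helper, and simple soft-thresholding stands in for the LASSO-based sparsification, so this is not the algorithm mediate_spcma uses.

```r
# Illustrative only: soft-thresholding stands in for the LASSO-based
# sparsification; it is not the algorithm used by mediate_spcma.
set.seed(1)
n <- 100
p <- 8
M <- matrix(rnorm(n * p), nrow = n, ncol = p)
M[, 1:3] <- M[, 1:3] + rnorm(n, sd = 2)  # mediators 1-3 share a common signal

M_centered <- scale(M, center = TRUE, scale = FALSE)
v_dense <- prcomp(M_centered)$rotation[, 1]  # ordinary first PC loadings

# Hypothetical helper: shrink loadings toward zero and zero out small ones
soft_threshold <- function(v, lambda) sign(v) * pmax(abs(v) - lambda, 0)

lambda <- 0.5 * max(abs(v_dense))  # arbitrary threshold chosen for illustration
v_sparse <- soft_threshold(v_dense, lambda)
v_sparse <- v_sparse / sqrt(sum(v_sparse^2))  # rescale to unit length

# PC scores: the dense PC uses every mediator, the sparse PC only those
# mediators with a nonzero loading
pc_dense <- M_centered %*% v_dense
pc_sparse <- M_centered %*% v_sparse

which(v_sparse != 0)  # mediators that contribute to the sparse PC
```

A sparse PC built this way can be described in terms of the few mediators with nonzero loadings, which is the interpretability gain the sparse approach aims for.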

References

Zhao, Y., Lindquist, M. A. & Caffo, B. S. Sparse principal component based high-dimensional mediation analysis. Comput. Stat. Data Anal. 142, 106835 (2020).

Examples

A <- med_dat$A
M <- med_dat$M
Y <- med_dat$Y

# Fit SPCMA with the fused LASSO penalty while choosing the number of PCs based
# on the variance they explain. In practice, var_per and sims should be higher.
out <- mediate_spcma(A, M, Y, var_per = 0.25, fused = TRUE, gamma = 2, sims = 10)
out$effects
