MI_LASSO: Multiple-Imputation LASSO (MI-LASSO)

Description

Fits a LASSO-type penalized regression jointly across D multiply-imputed datasets using iteratively reweighted ridge regressions (Equation (4) of the manuscript). For each tuning parameter in lamvec, it computes the pooled coefficient estimates, the BIC, and the selected variables.
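The iteratively reweighted ridge idea can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: the function name mi_ridge_sketch, the penalty weights lambda / sqrt(sum over d of beta_dj^2), the starting values, and the looser tolerance are all assumptions, and the manuscript's Equation (4) may scale the penalty differently.

```r
# Hypothetical sketch, NOT the package's MI_LASSO(): iteratively reweighted
# ridge for a single lambda. X is a D x n x p array, Y is a D x n matrix.
mi_ridge_sketch <- function(X, Y, lambda, maxiter = 200, eps = 1e-8) {
  D <- dim(X)[1]; n <- dim(X)[2]; p <- dim(X)[3]
  # One design matrix per imputation, with an intercept column in front
  Xd <- lapply(seq_len(D), function(d) cbind(1, matrix(X[d, , ], n, p)))
  B <- matrix(1, p + 1, D)  # (p+1) x D coefficients; row 1 is the intercept
  for (iter in seq_len(maxiter)) {
    # Norm of each predictor's D slopes (assumed group penalty);
    # the floor avoids division by zero once a group has shrunk away
    gnorm <- pmax(sqrt(rowSums(B[-1, , drop = FALSE]^2)), 1e-10)
    w <- c(0, lambda / gnorm)          # intercept is left unpenalized
    pen <- diag(w, nrow = length(w))
    # Ridge update for every imputation, sharing the same weights
    B_new <- sapply(seq_len(D), function(d)
      solve(crossprod(Xd[[d]]) + pen, crossprod(Xd[[d]], Y[d, ])))
    converged <- sum((B_new - B)^2) < eps
    B <- B_new
    if (converged) break
  }
  B
}
```

Because the weights are shared across imputations, a predictor's D slopes shrink together, which is what lets a variable be selected or dropped in all imputed datasets at once.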

Usage

MI_LASSO(
  X,
  Y,
  lamvec = (2^(seq(-1, 4, by = 0.05)))^2/2,
  maxiter = 200,
  eps = 1e-20,
  ncores = 1
)

Value

If length(lamvec) > 1, a list with elements:

best: List for the \(\lambda\) with minimal BIC, containing: coefficients (a (p+1)×D matrix holding the intercept followed by the p slopes, one column per imputation), bic (scalar BIC), varsel (logical vector of length p flagging the selected predictors), and lambda (the chosen penalty).
lambda_path: A length(lamvec)×2 matrix giving each lambda and its corresponding BIC.

If length(lamvec) == 1, a single list with the same elements as best, computed for that penalty.
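The relationship between lambda_path and best can be illustrated with a toy matrix. The numbers below are made up purely to show the lookup, and the column order (lambda first, BIC second) is taken from the description above; this is a sketch, not output from MI_LASSO.

```r
# Toy stand-in for lambda_path: column 1 = lambda, column 2 = BIC.
# All values are invented for illustration only.
lambda_path <- cbind(lambda = c(0.125, 1, 8, 64),
                     BIC    = c(412.3, 398.7, 405.1, 450.9))

best_row    <- which.min(lambda_path[, 2])      # row with the smallest BIC
best_lambda <- unname(lambda_path[best_row, 1]) # penalty chosen by BIC
```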

Arguments

X: An n×p matrix or a D×n×p array of imputed predictor sets. If a matrix is supplied, it is treated as a single imputation (D = 1).
Y: A vector of length n or a D×n matrix of outcomes. If a vector is supplied, it is reused across imputations.
lamvec: Numeric vector of penalty parameters \(\lambda\) to search. Default (2^(seq(-1,4,by=0.05)))^2/2, a grid of 101 log-spaced values from 0.125 to 128.
maxiter: Integer; maximum number of ridge-update iterations per lambda. Default 200.
eps: Numeric; convergence tolerance on coefficient change. Default 1e-20.
ncores: Integer; number of cores for parallelizing over lamvec. Default 1.
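
Imputation software often returns a list of D completed data matrices rather than an array. A minimal base R sketch of stacking such a list into the D×n×p layout described above, assuming every matrix has the same dimensions and column order; the object names (imps, y) and the random data are illustrative only:

```r
set.seed(1)
n <- 6; p <- 3; D <- 4

# Stand-in for D imputed predictor matrices (random numbers, illustration only)
imps <- lapply(seq_len(D), function(d) matrix(rnorm(n * p), n, p))

# simplify2array() stacks the list into an n x p x D array;
# aperm() then moves the imputation index to the first dimension (D x n x p)
X <- aperm(simplify2array(imps), c(3, 1, 2))

# A fully observed outcome repeated across imputations as a D x n matrix
y <- rnorm(n)
Y <- matrix(y, nrow = D, ncol = n, byrow = TRUE)
```

With this layout, X[d, , ] recovers the d-th imputed predictor matrix and Y[d, ] the matching outcome vector.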

Examples

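# Simulate a dataset with missing values and its multiple imputations (n_imp = 5)
sim <- sim_A(n = 100, p = 20, type = "MAR", SNP = 1.5, low_missing = TRUE, n_imp = 5, seed = 123)
X <- sim$data_MI$X
Y <- sim$data_MI$Y
# A single penalty value, so one result list is returned
fit <- MI_LASSO(X, Y, lamvec = c(0.1))
fit$varsel  # logical vector flagging the selected predictors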
