Fit a Bayesian multiple-imputation LASSO (BMI-LASSO) model across multiply-imputed datasets, using one of four priors: Multi-Laplace, Horseshoe, ARD, or Spike-Laplace. By default the function standardizes the data; it then runs the MCMC chains (optionally in parallel), selects variables by three-step projection predictive variable selection, and chooses a final submodel by BIC.
BMI_LASSO(
  X,
  Y,
  model,
  standardize = TRUE,
  SNC = TRUE,
  grid = seq(0, 1, 0.01),
  orthogonal = FALSE,
  nburn = 4000,
  npost = 4000,
  seed = NULL,
  nchain = 1,
  ncores = 1,
  verbose = TRUE,
  printevery = 1000,
  ...
)
A named list with elements:
posterior
List of length nchain of MCMC outputs (posterior draws).
select
List of length nchain of logical matrices showing which variables are selected at each grid value.
best_select
List of length nchain giving the single best selection (by BIC) for each chain.
posterior_best_models
List of length nchain of projected posterior draws for the best submodel.
bic_models
List of length nchain of BIC values and degrees of freedom for each candidate submodel.
summary_table_full
A data frame summarizing rank-normalized split-Rhat and other diagnostics for the full model.
summary_table_selected
A data frame summarizing diagnostics for the selected submodel after projection.
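As a rough illustration of how select, bic_models, and best_select relate within one chain, the sketch below picks the candidate with the smallest BIC and reads off the matching row of a selection matrix. The toy objects select_mat and bic_tab, their values, and the one-row-per-grid-value layout are assumptions made for illustration; the package's actual structures may differ.

```r
# Hypothetical stand-ins for one chain's output; the real layout may differ.
grid <- c(0.1, 0.5, 0.9)
select_mat <- rbind(
  c(TRUE,  FALSE, FALSE),  # selection at grid value 0.1
  c(TRUE,  TRUE,  FALSE),  # selection at grid value 0.5
  c(TRUE,  TRUE,  TRUE)    # selection at grid value 0.9
)
colnames(select_mat) <- c("x1", "x2", "x3")
bic_tab <- data.frame(grid = grid, BIC = c(310.2, 298.7, 305.4), df = c(1, 2, 3))

# Choose the candidate submodel with the smallest BIC.
best_row <- which.min(bic_tab$BIC)
best_vars <- colnames(select_mat)[select_mat[best_row, ]]
best_vars
```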
X
A numeric matrix or array of predictors. An n × p matrix is treated as a single imputation; in a D × n × p array, each slice along the first dimension is one imputed dataset.
Y
A numeric vector or matrix of outcomes. A vector of length n is recycled for each imputation; in a D × n matrix, each row is the response for one imputation.
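If the imputed datasets are held as a list of matrices, they need to be rearranged into the D × n × p and D × n layouts described above. A minimal base-R sketch follows; X_list and Y_list are hypothetical names for lists of imputed predictor matrices and response vectors.

```r
# Toy example: D = 3 imputed predictor matrices, each n = 4 by p = 2.
set.seed(1)
D <- 3; n <- 4; p <- 2
X_list <- lapply(seq_len(D), function(d) matrix(rnorm(n * p), nrow = n, ncol = p))
Y_list <- lapply(seq_len(D), function(d) rnorm(n))

# simplify2array() stacks to n x p x D; aperm() moves the imputation index first.
X_arr <- aperm(simplify2array(X_list), c(3, 1, 2))  # D x n x p
Y_mat <- do.call(rbind, Y_list)                     # D x n

dim(X_arr)
dim(Y_mat)
```

With this layout, X_arr[d, , ] recovers the d-th imputed predictor matrix and Y_mat[d, ] the matching response.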
model
Character; which prior to use. One of "Multi_Laplace", "Horseshoe", "ARD", or "Spike_Laplace".
standardize
Logical; whether to normalize X and center Y within each imputation before fitting. Default TRUE.
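To make the preprocessing concrete, the sketch below standardizes each imputed dataset separately. It assumes "normalize" means column-wise z-scoring (zero mean, unit standard deviation) and "center" means subtracting the mean of Y; the package's exact convention is not stated here, and standardize_imputations() is a hypothetical helper, not part of the package.

```r
# Sketch only: column-wise z-scoring of X and mean-centering of Y per imputation.
# The scaling convention is an assumption, not the package's documented behavior.
standardize_imputations <- function(X_arr, Y_mat) {
  D <- dim(X_arr)[1]
  for (d in seq_len(D)) {
    X_arr[d, , ] <- scale(X_arr[d, , ])          # zero mean, unit sd per column
    Y_mat[d, ] <- Y_mat[d, ] - mean(Y_mat[d, ])  # zero mean
  }
  list(X = X_arr, Y = Y_mat)
}

set.seed(1)
X_arr <- array(rnorm(2 * 10 * 3, mean = 5, sd = 2), dim = c(2, 10, 3))
Y_mat <- matrix(rnorm(2 * 10, mean = 1), nrow = 2)
std <- standardize_imputations(X_arr, Y_mat)
round(colMeans(std$X[1, , ]), 10)
round(rowMeans(std$Y), 10)
```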
SNC
Logical; if TRUE, use the scaled neighborhood criterion; otherwise apply thresholding or median-based selection. Default TRUE.
grid
Numeric vector; grid of scaled neighborhood criterion (or thresholding) values to explore. Default seq(0, 1, 0.01).
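One common definition of the scaled neighborhood criterion is the posterior probability that a coefficient lies within one posterior standard deviation of zero, with the variable dropped when that probability exceeds a threshold. Assuming that definition, and assuming the grid holds the candidate thresholds, selection over a grid can be sketched as below. snc_select() is a hypothetical helper, and the package's implementation may differ.

```r
# Sketch under the stated assumptions; snc_select() is a hypothetical helper.
snc_select <- function(draws, grid) {
  # Posterior probability that each coefficient lies within one posterior sd of zero.
  post_sd <- apply(draws, 2, sd)
  snc_prob <- colMeans(sweep(abs(draws), 2, post_sd, "<="))
  # One row per threshold: keep a variable when its probability does not exceed it.
  sel <- t(sapply(grid, function(tau) snc_prob <= tau))
  dimnames(sel) <- list(paste0("tau=", grid), colnames(draws))
  sel
}

# Simulated posterior draws: one clearly non-zero coefficient, one centered at zero.
set.seed(42)
draws <- cbind(signal = rnorm(2000, mean = 3, sd = 0.5),
               noise  = rnorm(2000, mean = 0, sd = 1))
sel <- snc_select(draws, grid = c(0.25, 0.5, 0.75))
sel
```

Each row of the result is one candidate submodel, which matches the description of select as a logical matrix over grid values.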
orthogonal
Logical; if TRUE, use orthogonal approximations for degrees-of-freedom estimation. Default FALSE.
nburn
Integer; number of burn-in MCMC iterations per chain. Default 4000.
npost
Integer; number of post-burn-in samples to retain per chain. Default 4000.
seed
Optional integer; base random seed. Each chain uses the base seed plus its chain index.
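The per-chain seeding rule can be sketched as follows. This assumes chain indices start at 1 and are added directly to the base seed; the package's internal seeding is not shown here.

```r
# Assumption: chain indices start at 1 and are simply added to the base seed.
seed <- 123
nchain <- 3
chain_seeds <- seed + seq_len(nchain)

# Seeding each chain's RNG separately gives distinct but reproducible streams.
draws <- lapply(chain_seeds, function(s) { set.seed(s); rnorm(2) })
chain_seeds
```

Distinct seeds keep the chains from producing identical draws, while a fixed base seed keeps the whole run reproducible.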
nchain
Integer; number of MCMC chains to run in parallel. Default 1.
ncores
Integer; number of parallel cores to use. Default 1.
verbose
Logical; print progress messages. Default TRUE.
printevery
Integer; print a status message every printevery iterations. Default 1000.
...
Additional model-specific hyperparameters:
For "Multi_Laplace": h (shape) and v (scale) of the Gamma hyperprior.
For "Spike_Laplace": a (shape) and b (scale) of the Gamma hyperprior.
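To get a feel for what a (shape, scale) pair implies, recall that a Gamma distribution with shape k and scale θ has mean kθ and variance kθ². The sketch below checks this numerically in base R. The values of h and v are arbitrary placeholders, not package defaults, and the sketch makes no claim about which model quantity the hyperprior is placed on.

```r
# Arbitrary illustrative values; not the package defaults.
h <- 2    # shape
v <- 0.5  # scale

prior_mean <- h * v
prior_var  <- h * v^2

# Cross-check the closed-form mean against numerical integration of the density.
num_mean <- integrate(function(x) x * dgamma(x, shape = h, scale = v), 0, Inf)$value
c(mean = prior_mean, var = prior_var, numeric_mean = num_mean)

# Central 95% prior interval.
qgamma(c(0.025, 0.975), shape = h, scale = v)
```

The same check applies to a and b for "Spike_Laplace", since they are described with the same shape and scale roles.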
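# Simulate a small multiply-imputed dataset with 5 imputations
sim <- sim_A(n = 100, p = 20, type = "MAR", SNP = 1.5, low_missing = TRUE, n_imp = 5, seed = 123)
X <- sim$data_MI$X
Y <- sim$data_MI$Y
# Short chains keep the example fast; the defaults are 4000 burn-in and 4000 retained draws
fit <- BMI_LASSO(X, Y, model = "Horseshoe",
                 nburn = 100, npost = 100,
                 nchain = 1, ncores = 1)
# Inspect the BIC-selected variables for each chain
str(fit$best_select)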