DfiMI_lasso: Distributed Full-information Multiple Imputation (DfiMI) using LASSO
Description
Imputes missing values of the response variable Y by LASSO regression on
the predictors. The procedure consists of R independent runs, each with
M stochastic imputations.
Usage
DfiMI_lasso(data, R, M)
Value
A named list containing:
Yhat
Numeric vector -- original Y values with missing values replaced by imputations.
betahat
Numeric vector -- final regression coefficients, averaged across imputations and runs.
Arguments
data
A data.frame where:
First column:
Response Y (may contain NA)
Remaining columns:
Numeric predictors
R
Positive integer -- number of independent runs; averaging over runs stabilizes the coefficient estimates.
M
Positive integer -- number of multiple imputations per run.
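As an illustration of the expected input layout, the following sketch builds a small simulated data.frame with the response in the first column (including some NA values) and numeric predictors in the remaining columns. The names dat, x1, x2 and x3 and the chosen values of R and M are hypothetical. The call itself is guarded, because the package that provides DfiMI_lasso must be attached first.

```r
set.seed(42)
n <- 50

# First column: response Y; remaining columns: numeric predictors.
dat <- data.frame(
  Y  = rnorm(n),
  x1 = rnorm(n),
  x2 = rnorm(n),
  x3 = rnorm(n)
)

# Make 10 responses missing; these are the values to be imputed.
dat$Y[sample(n, 10)] <- NA

# Basic checks on the layout described under Arguments.
stopifnot(is.data.frame(dat))
stopifnot(all(vapply(dat[-1], is.numeric, logical(1))))

R <- 10L  # number of runs (illustrative value)
M <- 5L   # imputations per run (illustrative value)

# Only call the function if it is available in the current session.
if (exists("DfiMI_lasso")) {
  res <- DfiMI_lasso(dat, R = R, M = M)
  str(res$Yhat)
  str(res$betahat)
}
```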
Details
This function extends the Distributed Full-information Multiple Imputation (DfiMI) approach
by using LASSO regression for imputing missing values in the response variable Y.
LASSO regression shrinks coefficients and sets some exactly to zero, which makes it useful for
high-dimensional predictor spaces and helps stabilize estimates when predictors are collinear. The function performs the following steps:
Initialize missing values in Y.
Fit LASSO regression models on complete cases.
Average coefficients across multiple imputations and runs.
Predict missing values using the final averaged coefficients.
The function requires the glmnet package for LASSO regression.
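The steps above can be sketched roughly as follows. This is a minimal illustration, not the package's implementation: the helper name sketch_lasso_impute, the mean initialization, the bootstrap resampling used as the source of stochastic variation, and the fixed penalty lambda are all assumptions made for the example.

```r
library(glmnet)

# Hypothetical helper that mirrors the four documented steps.
sketch_lasso_impute <- function(data, R = 5, M = 5, lambda = 0.1) {
  y    <- data[[1]]
  X    <- as.matrix(data[, -1, drop = FALSE])  # glmnet needs >= 2 columns
  miss <- is.na(y)
  obs  <- which(!miss)

  # Step 1: initialize missing Y (assumption: mean of observed values).
  # These placeholders are overwritten in step 4.
  yhat <- y
  yhat[miss] <- mean(y[obs])

  # Step 2: fit LASSO models on complete cases. A bootstrap resample of
  # the complete cases (assumption) makes each of the R * M fits differ.
  beta_sum <- numeric(ncol(X) + 1)  # intercept + slopes
  for (r in seq_len(R)) {
    for (m in seq_len(M)) {
      idx <- sample(obs, replace = TRUE)
      fit <- glmnet(X[idx, , drop = FALSE], y[idx],
                    alpha = 1, lambda = lambda)
      beta_sum <- beta_sum + as.vector(as.matrix(coef(fit)))
    }
  }

  # Step 3: average coefficients across imputations and runs.
  betahat <- beta_sum / (R * M)

  # Step 4: predict missing values with the averaged coefficients.
  yhat[miss] <- drop(cbind(1, X[miss, , drop = FALSE]) %*% betahat)

  list(Yhat = yhat, betahat = betahat)
}

# Usage on simulated data (all values illustrative).
set.seed(1)
n <- 100
p <- 5
X <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("x", 1:p)))
y <- drop(1 + X %*% c(2, -1, 0, 0, 0.5) + rnorm(n, sd = 0.5))
y[sample(n, 20)] <- NA
dat <- data.frame(Y = y, X)

res <- sketch_lasso_impute(dat, R = 3, M = 4)
```

In this sketch the observed entries of Y are returned unchanged and only the missing entries are filled in; in practice the penalty would more likely be chosen by cross-validation (for example with cv.glmnet) than fixed in advance.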