This function performs feature selection using one of four methods: LASSO, Elastic Net, Ridge regression, or Boruta. It returns the selected features and produces variable importance plots.
perform_feature_selection(
  group_info,
  features,
  id_col_group,
  id_col_features,
  group_col,
  outlier_col = NULL,
  outlier_vals = c("No"),
  group_vals = c("No", "Yes"),
  method = c("lasso", "elastic_net", "ridge", "boruta"),
  mixture = 0.5,
  penalty_vals = 50,
  seed = 1234,
  output_dir = "output"
)

Returns a tibble containing selected features and variable importances.
group_info: A data frame containing group information.
features: A data frame containing feature data.
id_col_group: Column name in `group_info` to join with `features`.
id_col_features: Column name in `features` to join with `group_info`.
group_col: Column name holding the group labels.
outlier_col: (Optional) Column name for identifying outliers, default is NULL.
outlier_vals: (Optional) Values indicating non-outliers, default is c("No").
group_vals: A vector of length 2 giving the two group values to compare, default is c("No", "Yes").
method: The feature selection method to use, one of "lasso", "elastic_net", "ridge", or "boruta".
mixture: (Optional) The mixture parameter for Elastic Net, default is 0.5.
penalty_vals: (Optional) Number of penalty values to try for tuning, default is 50.
seed: (Optional) Random seed for reproducibility, default is 1234.
output_dir: (Optional) Directory to save output files, default is "output".
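
The following is a minimal sketch of how the two input data frames and the column arguments might fit together. All data, column names (`sample_id`, `id`, `case`, `is_outlier`), and the preprocessing step are hypothetical illustrations. The reference above does not say how the function filters or joins internally, nor whether column names are passed as strings, so both are assumptions here.

```r
# Toy inputs; every column name and value here is hypothetical.
group_info <- data.frame(
  sample_id  = c("s1", "s2", "s3", "s4", "s5"),
  case       = c("No", "Yes", "No", "Yes", "Maybe"),
  is_outlier = c("No", "No", "Yes", "No", "No"),
  stringsAsFactors = FALSE
)
features <- data.frame(
  id = c("s1", "s2", "s3", "s4", "s5"),
  f1 = c(0.1, 1.2, 0.3, 1.5, 0.7),
  f2 = c(5.0, 4.1, 5.2, 3.9, 4.5),
  stringsAsFactors = FALSE
)

# Assumed preprocessing implied by the arguments: keep rows flagged as
# non-outliers and belonging to one of the two compared groups, then join
# the two tables on their respective ID columns.
keep <- group_info$is_outlier %in% c("No") &
  group_info$case %in% c("No", "Yes")
merged <- merge(group_info[keep, ], features,
                by.x = "sample_id", by.y = "id")

# The call itself, guarded so the script still runs when the function is
# not loaded. Passing column names as strings is an assumption.
if (exists("perform_feature_selection")) {
  selected <- perform_feature_selection(
    group_info      = group_info,
    features        = features,
    id_col_group    = "sample_id",
    id_col_features = "id",
    group_col       = "case",
    outlier_col     = "is_outlier",
    outlier_vals    = c("No"),
    group_vals      = c("No", "Yes"),
    method          = "lasso",
    seed            = 1234,
    output_dir      = "output"
  )
}
```

The toy tables are only large enough to show the shape of the inputs; a real run would need many more samples for penalty tuning to be meaningful.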