lares (version 4.8.4)

lasso_vars: Most Relevant Features Using Lasso Regression

Description

Use Lasso regression to identify the variables most relevant for predicting another variable. Compare the output with corr_var() results to complement the analysis. There is no need to standardize, center, or scale your data beforehand. Tidyverse friendly.

Usage

lasso_vars(
  df,
  variable,
  ignore = NA,
  nlambdas = 100,
  nfolds = 10,
  seed = 123,
  ...
)

Arguments

df

Dataframe. Any dataframe is valid, because `ohse` is applied to process categorical values and all values are standardized automatically.

variable

Variable. The variable to predict, passed unquoted as in the example below (Survived).

ignore

Character vector. Variables to exclude from study.

nlambdas

Integer. Number of lambdas to be used in a search.

nfolds

Integer. Number of folds for K-fold cross-validation (>= 2).

seed

Numeric. Seed for reproducible results.

...

Additional parameters passed to `ohse`.
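The `df` argument notes that categorical values are encoded and all values are standardized for you. As a rough illustration of what that kind of preprocessing looks like, here is a minimal base R sketch. It is a stand-in under stated assumptions, not the `ohse` implementation: the `toy` data frame and its values are made up, and `model.matrix()` plus `scale()` are used only to show the general idea.

```r
# Hypothetical toy data: one numeric and one categorical column
toy <- data.frame(
  age = c(22, 38, 26, 35, 54),
  class = factor(c("third", "first", "third", "first", "second"))
)

# One-hot encode: with the intercept dropped, the factor gets one
# indicator column per level
X <- model.matrix(~ . - 1, data = toy)

# Standardize every column to mean 0 and standard deviation 1
X_std <- scale(X)

print(colnames(X_std))
print(round(colMeans(X_std), 10))
```

Because lasso_vars performs this kind of step internally, the data frame can be passed as is.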

See Also

Other Machine Learning: ROC(), clusterKmeans(), conf_mat(), export_results(), gain_lift(), h2o_automl(), h2o_predict_API(), h2o_predict_MOJO(), h2o_predict_binary(), h2o_predict_model(), h2o_results(), h2o_selectmodel(), impute(), iter_seeds(), model_metrics(), msplit()

Other Exploratory: corr_cross(), corr_var(), crosstab(), df_str(), distr(), freqs_df(), freqs_list(), freqs_plot(), freqs(), missingness(), plot_cats(), plot_df(), plot_nums(), summer(), tree_var(), trendsRelated()

Examples

# NOT RUN {
options("lares.font" = NA) # Temporary
data(dft) # Titanic dataset

m <- lasso_vars(dft, Survived, ignore = c("Cabin"))
print(m$coef)
print(m$metrics)
m$plot
# }
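To see why Lasso regression is useful for picking out relevant variables, the following base R sketch fits a lasso by coordinate descent on simulated data. It is an illustration only and does not reproduce how lasso_vars fits its model: the helper names `soft_threshold` and `lasso_cd`, the simulated data, and the fixed `lambda = 0.3` are all assumptions made for this example, and no cross-validated search over lambdas is performed.

```r
# Soft-thresholding operator used by the lasso coordinate update
soft_threshold <- function(z, gamma) sign(z) * pmax(abs(z) - gamma, 0)

# Minimal coordinate-descent lasso (hypothetical helper).
# Minimizes (1 / (2n)) * sum((y - X %*% b)^2) + lambda * sum(abs(b))
# for a centered response y and standardized predictors X.
lasso_cd <- function(X, y, lambda, n_iter = 200) {
  n <- nrow(X)
  beta <- rep(0, ncol(X))
  for (iter in seq_len(n_iter)) {
    for (j in seq_len(ncol(X))) {
      # Partial residual: remove every predictor's contribution except j's
      r_j <- y - X[, -j, drop = FALSE] %*% beta[-j]
      rho <- sum(X[, j] * r_j) / n
      beta[j] <- soft_threshold(rho, lambda) / (sum(X[, j]^2) / n)
    }
  }
  setNames(beta, colnames(X))
}

# Simulated data: only x1 and x2 actually drive y
set.seed(123)
n <- 200
X <- matrix(rnorm(n * 5), n, 5, dimnames = list(NULL, paste0("x", 1:5)))
y <- 3 * X[, "x1"] - 2 * X[, "x2"] + rnorm(n, sd = 0.5)

X_std <- scale(X)     # center and scale each predictor
y_ctr <- y - mean(y)  # center the response

beta_hat <- lasso_cd(X_std, y_ctr, lambda = 0.3)
print(round(beta_hat, 3))
```

The L1 penalty shrinks the coefficients of the pure-noise predictors (x3 to x5) to exactly zero, while x1 and x2 keep nonzero, slightly shrunken coefficients. That sparsity is what makes the surviving coefficients a ranking of relevant variables.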