adapt4pv (version 0.2-3)

lasso_cv: wrapper function for cv.glmnet

Description

Fit a first lasso regression with cross-validation and return the selected covariates. Can handle very large sparse data matrices. Intended for binary response only (the option family = "binomial" is forced). Depends on the cv.glmnet function from the glmnet package.

Usage

lasso_cv(x, y, nfolds = 5, foldid = NULL, betaPos = TRUE, ...)

Value

An object with S3 class "log.lasso".

beta

Numeric vector of regression coefficients in the lasso. In the lasso_cv function, the regression coefficients are penalized. Length equal to nvars.

selected_variables

Character vector, names of the variable(s) selected with the lasso-cv approach. If betaPos = TRUE, this set contains the covariates with a positive regression coefficient in beta. Otherwise, it contains the covariates with a non-null regression coefficient in beta. Covariates are ordered according to the absolute value of their regression coefficients.
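
The selection rule described for selected_variables can be sketched in base R. This is only an illustration: the coefficient vector is made up, select_from_beta is a hypothetical helper that is not part of adapt4pv, and the decreasing sort direction is an assumption, since the text above only states that covariates are ordered by coefficient magnitude.

```r
# Hypothetical helper (not part of adapt4pv): mimics the documented rule
# for deriving selected_variables from a named coefficient vector.
select_from_beta <- function(beta, betaPos = TRUE) {
  # Keep positive coefficients if betaPos = TRUE, otherwise all non-null ones
  keep <- if (betaPos) beta > 0 else beta != 0
  kept <- beta[keep]
  # Assumption: largest absolute coefficient first
  names(kept)[order(abs(kept), decreasing = TRUE)]
}

# Made-up coefficients, for illustration only
beta <- c(drugs1 = 0.00, drugs2 = 0.42, drugs3 = -0.77, drugs4 = 0.10)

sel_pos <- select_from_beta(beta, betaPos = TRUE)   # "drugs2" "drugs4"
sel_all <- select_from_beta(beta, betaPos = FALSE)  # "drugs3" "drugs2" "drugs4"
```

With betaPos = TRUE the negatively associated drugs3 is dropped even though it has the largest coefficient in absolute value.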

Arguments

x

Input matrix, of dimension nobs x nvars. Each row is an observation vector. Can be in sparse matrix format (inherit from class "sparseMatrix" as in package Matrix).

y

Binary response variable, numeric.

nfolds

Number of folds - default is 5. Although nfolds can be as large as the sample size (leave-one-out CV), it is not recommended for large datasets. Smallest value allowable is nfolds=3.

foldid

An optional vector of values between 1 and nfolds identifying what fold each observation is in. If supplied, nfolds can be missing.

betaPos

Should the covariates selected by the procedure be positively associated with the outcome? Default is TRUE.

...

Other arguments that can be passed to cv.glmnet from package glmnet other than nfolds, foldid, and family.
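
As a minimal sketch of how a foldid vector could be built, the base R snippet below assigns each observation a fold label between 1 and nfolds at random, with folds as balanced as possible. The sizes (100 observations, 5 folds) and the seed are arbitrary choices for illustration, and the commented call assumes objects x and y shaped as described above.

```r
set.seed(1)   # arbitrary seed, for reproducibility of the fold assignment
nobs   <- 100 # number of observations (rows of x), assumed here
nfolds <- 5

# Repeat the labels 1..nfolds to length nobs, then shuffle them
foldid <- sample(rep(seq_len(nfolds), length.out = nobs))
table(foldid) # 20 observations per fold, since 100 is divisible by 5

# Supplying the same foldid to several calls keeps the folds identical, e.g.
# lcv <- lasso_cv(x = x, y = y, foldid = foldid)
```

Fixing the folds this way can be useful when results from several runs need to be compared on the same data splits.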

Author

Emeline Courtois
Maintainer: Emeline Courtois emeline.courtois@inserm.fr

Examples

set.seed(15)
drugs <- matrix(rbinom(100*20, 1, 0.2), nrow = 100, ncol = 20)
colnames(drugs) <- paste0("drugs",1:ncol(drugs))
ae <- rbinom(100, 1, 0.3)
lcv <- lasso_cv(x = drugs, y = ae, nfolds = 3)
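
Since x can be given in sparse matrix format, the same simulated data could also be passed as a sparse matrix. The sketch below assumes the Matrix package is installed; the object names drugs_sp and lcv_sp are introduced here for illustration, and the call to lasso_cv is only run when adapt4pv is available.

```r
library(Matrix)  # provides the "sparseMatrix" classes

set.seed(15)
drugs <- matrix(rbinom(100 * 20, 1, 0.2), nrow = 100, ncol = 20)
colnames(drugs) <- paste0("drugs", seq_len(ncol(drugs)))
ae <- rbinom(100, 1, 0.3)

# Convert the dense 0/1 matrix to a sparse matrix; column names are kept
drugs_sp <- Matrix(drugs, sparse = TRUE)
methods::is(drugs_sp, "sparseMatrix")

# Fit only if adapt4pv is installed
if (requireNamespace("adapt4pv", quietly = TRUE)) {
  lcv_sp <- adapt4pv::lasso_cv(x = drugs_sp, y = ae, nfolds = 3)
  lcv_sp$selected_variables
}
```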