Implements CISL and stability selection, depending on the subsampling options.
cisl(
x,
y,
r = 4,
nB = 100,
dfmax = 50,
nlambda = 250,
nMin = 0,
replace = TRUE,
betaPos = TRUE,
ncore = 1
)
Input matrix, of dimension nobs x nvars. Each row is an observation vector. Can be in sparse matrix format (inherit from class "sparseMatrix" as in package Matrix).
Binary response variable, numeric.
Number of controls in the CISL sampling. Default is 4. See details below for other implementations.
Number of sub-samples. Default is 100.
Corresponds to the maximum size of the models visited with the lasso (E in the paper). Default is 50.
Number of lambda values, as in the glmnet documentation. Default is 250.
Minimum number of events for a covariate to be considered. Default is 0, all the covariates from x are considered.
Should sampling be with replacement? Default is TRUE.
If betaPos = TRUE, variable selection is based on positive regression coefficients. Otherwise, variable selection is based on non-zero regression coefficients. Default is TRUE.
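The difference between the two selection rules can be illustrated on a made-up coefficient vector. This is only a sketch: `beta` and its values are hypothetical and do not come from a fitted model or from the package.

```r
# Hypothetical lasso coefficients for three covariates (illustrative values only)
beta <- c(drugs1 = 0.8, drugs2 = 0, drugs3 = -0.4)

# Rule corresponding to betaPos = TRUE: keep strictly positive coefficients
selected_pos <- beta > 0

# Rule corresponding to betaPos = FALSE: keep any non-zero coefficient
selected_nonzero <- beta != 0

print(rbind(positive = selected_pos, non_zero = selected_nonzero))
```

Here drugs3 has a negative coefficient, so it counts as selected only under the non-zero rule.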
The number of cores used for parallel computing. This has to be set to 1 if the parallel package is not available. Default is 1.
WARNING: parallel computing is not supported on Windows machines!
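One possible way to choose a value for ncore that respects both constraints is sketched below. The helper `pick_ncore` is hypothetical and not part of the package; leaving one core free is an arbitrary choice made for this illustration.

```r
# Hypothetical helper: returns a safe value to pass as `ncore`
pick_ncore <- function() {
  # Parallel computing is not supported on Windows: fall back to 1
  if (.Platform$OS.type == "windows") return(1L)
  # The parallel package must be available
  if (!requireNamespace("parallel", quietly = TRUE)) return(1L)
  n <- parallel::detectCores()
  # detectCores() may return NA when the count cannot be determined
  if (is.na(n) || n < 2L) return(1L)
  # Leave one core free for the rest of the system (arbitrary choice)
  as.integer(n - 1L)
}

ncore <- pick_ncore()
```

The result could then be passed along, for example as `cisl(x, y, ncore = ncore)`.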
An object with S3 class "cisl".
Matrix of dimension nvars x nB. Quantity computed by CISL for each covariate, for each subsample.
5% quantile of the CISL quantity for each covariate. Numeric, length equal to nvars.
10% quantile of the CISL quantity for each covariate. Numeric, length equal to nvars.
15% quantile of the CISL quantity for each covariate. Numeric, length equal to nvars.
20% quantile of the CISL quantity for each covariate. Numeric, length equal to nvars.
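As an illustration of how per-covariate quantiles relate to an nvars x nB matrix, the sketch below computes them with base R on simulated values. The matrix `cisl_quantity` is a hypothetical stand-in filled with random numbers, and the use of `quantile()` with its default type is an assumption, not a statement about how the package computes them.

```r
set.seed(42)
nvars <- 5
nB <- 50

# Hypothetical stand-in for the nvars x nB matrix (random values, not CISL output)
cisl_quantity <- matrix(
  runif(nvars * nB),
  nrow = nvars, ncol = nB,
  dimnames = list(paste0("drugs", 1:nvars), NULL)
)

# One row per covariate, one column per quantile level (5%, 10%, 15%, 20%)
probs <- c(0.05, 0.10, 0.15, 0.20)
q <- t(apply(cisl_quantity, 1, quantile, probs = probs))

print(dim(q))
```

Each column of `q` is a numeric vector of length nvars, matching the shape described for the quantile components.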
CISL is a variation of the stability method adapted to characteristics of pharmacovigilance databases.
The settings r = 4 and replace = TRUE are used to implement our CISL sampling. For instance, r = NULL and replace = FALSE can be used to implement the \(n/2\) sampling in Stability Selection.
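The two sampling schemes can be sketched in base R as follows. The \(n/2\) draw takes half of the observations without replacement. The class-imbalanced draw is written under an assumption that is not stated on this page: that r is the number of controls drawn per case, with as many cases drawn as in the original data. It is an illustration, not the package's code.

```r
set.seed(1)
y <- rbinom(100, 1, 0.3)   # simulated binary response
n <- length(y)

# n/2 sampling: half of the observations, drawn without replacement
idx_half <- sample.int(n, size = floor(n / 2), replace = FALSE)

# Class-imbalanced draw with replacement.
# ASSUMPTION: r controls are drawn for each case.
r <- 4
cases <- which(y == 1)
controls <- which(y == 0)
idx_cases <- cases[sample.int(length(cases), size = length(cases), replace = TRUE)]
idx_controls <- controls[sample.int(length(controls), size = r * length(cases), replace = TRUE)]
idx_cisl <- c(idx_cases, idx_controls)

# Class balance in each subsample
print(table(y[idx_half]))
print(table(y[idx_cisl]))
```

Indexing through `sample.int()` avoids the special behavior of `sample()` when the vector to draw from has length one.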
Ahmed, I., Pariente, A., & Tubert-Bitter, P. (2018). "Class-imbalanced subsampling lasso algorithm for discovering adverse drug reactions". Statistical Methods in Medical Research, 27(3), 785–797. doi:10.1177/0962280216643116
# NOT RUN {
set.seed(15)
drugs <- matrix(rbinom(100*20, 1, 0.2), nrow = 100, ncol = 20)
colnames(drugs) <- paste0("drugs",1:ncol(drugs))
ae <- rbinom(100, 1, 0.3)
lcisl <- cisl(x = drugs, y = ae, nB = 50)
# }