clogitLasso: fit lasso for conditional logistic regression for matched case-control studies

Description

Fit a sequence of conditional logistic regression models with a lasso penalty, for small to large samples.

Usage

clogitLasso(X, y, strata, fraction = NULL, nbfraction = 100,
  nopenalize = NULL, BACK = TRUE, standardize = FALSE, maxit = 100,
  maxitB = 500, thr = 1e-10, tol = 1e-10, epsilon = 1e-04,
  trace = TRUE, log = TRUE, adaptive = FALSE, separate = FALSE,
  ols = FALSE, p.fact = NULL, remove = FALSE)

Arguments

X

Input matrix, of dimension nobs x nvars; each row is an observation vector

y

Binary response variable, with 1 for cases and 0 for controls

strata

Vector with stratum membership of each observation

fraction

Sequence of lambda values

nbfraction

The number of lambda values - default is 100

nopenalize

Indices of the coefficients that are not penalized; indexing starts at 0

BACK

If TRUE, use backtracking line search. Default is TRUE

standardize

Logical flag for standardization of the X variables prior to fitting the model sequence. Default is FALSE

maxit

Maximum number of iterations of the outer loop. Default is 100

maxitB

Maximum number of iterations in the backtracking line search. Default is 500

thr

Threshold for convergence in lassoshooting. Default value is 1e-10. Iterations stop when max absolute parameter change is less than thr

tol

Threshold for convergence. Default is 1e-10

epsilon

Ratio of the smallest to the largest value of the regularisation parameter at which parameter estimates are computed. Default is 1e-04

trace

If TRUE, the algorithm prints information as iterations proceed. Default is TRUE

log

If TRUE, the values of fraction are spaced uniformly on the log scale. Default is TRUE

adaptive

If TRUE, the adaptive lasso is fitted. Default is FALSE

separate

If TRUE, the weights in the adaptive lasso are built separately using univariate models. Default is FALSE, in which case the weights are built using a multivariate model

ols

If TRUE, weights less than 1 in adaptive lasso are set to 1. Default is FALSE

p.fact

Weights for adaptive lasso

remove

If TRUE, covariates that do not vary are removed. Default is FALSE
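
The interplay of nbfraction, epsilon and log can be illustrated with a small base R sketch. It is not taken from the package source: lambda_max is a hypothetical largest penalty value, and the decreasing order of the grid is an assumption made for illustration only.

```r
# Illustrative sketch only; lambda_max is a hypothetical value, not computed by clogitLasso.
lambda_max <- 2.5
epsilon    <- 1e-04   # ratio of smallest to largest penalty value
nbfraction <- 100     # number of penalty values

# log = TRUE: values equally spaced on the log scale
log_grid <- exp(seq(log(lambda_max), log(lambda_max * epsilon),
                    length.out = nbfraction))

# log = FALSE: values equally spaced on the original scale
lin_grid <- seq(lambda_max, lambda_max * epsilon, length.out = nbfraction)

range(log_grid)
range(lin_grid)
```

Both grids share the same end points; the log-spaced grid places many more values near the small end of the range, where the fitted models change most quickly.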

Value

An object of type clogitLasso which is a list with the following components:

beta

nbfraction-by-ncol matrix of estimated coefficients, with one row per regularisation parameter. The first row contains only zeros

fraction

A sequence of regularisation parameters at which we obtained the fits

A vector of length nbfraction containing the number of nonzero parameter estimates for the fit at the corresponding regularisation parameter

arg

List of arguments
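
As a rough illustration of how the beta and fraction components relate, the sketch below uses a small mock list with made-up numbers instead of a real fit. The object mock_fit and all of its values are hypothetical; only the layout (one row of beta per value of fraction, first row all zeros) follows the description above.

```r
# Mock object with the documented layout; all numbers are invented for illustration.
mock_fit <- list(
  beta = rbind(c(0.0, 0.0,  0.0, 0),
               c(0.2, 0.0,  0.0, 0),
               c(0.5, 0.0, -0.1, 0),
               c(0.7, 0.1, -0.3, 0)),
  fraction = c(1, 0.5, 0.25, 0.125)
)

# Number of nonzero coefficients at each regularisation parameter
n_nonzero <- rowSums(mock_fit$beta != 0)

# Covariates selected at the third regularisation parameter
selected <- which(mock_fit$beta[3, ] != 0)

data.frame(fraction = mock_fit$fraction, n_nonzero = n_nonzero)
```

The same two lines (rowSums and which) should apply to a fitted object, provided its beta component has the layout described above.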

Details

The sequence of models implied by fraction is fit by the IRLS (iteratively reweighted least squares) algorithm, using coordinate descent with warm starts and sequential strong rules.
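
The coordinate descent step with warm starts can be sketched in base R as follows. This is a generic textbook version for a weighted least squares problem with a lasso penalty, not the package's implementation: the function names are hypothetical, strong rules are omitted, and in an IRLS setting the working response z and the weights w would be recomputed at each outer iteration.

```r
# Soft-thresholding operator used by lasso coordinate descent
soft_threshold <- function(a, lambda) sign(a) * pmax(abs(a) - lambda, 0)

# Minimise 0.5 * sum(w * (z - X %*% beta)^2) + lambda * sum(abs(beta))
# by cyclic coordinate descent, starting from `beta` (warm start).
weighted_lasso_cd <- function(X, z, w, lambda, beta = rep(0, ncol(X)),
                              maxit = 100, thr = 1e-10) {
  r <- drop(z - X %*% beta)                      # current residuals
  for (it in seq_len(maxit)) {
    max_change <- 0
    for (j in seq_len(ncol(X))) {
      xj  <- X[, j]
      rho <- sum(w * xj * (r + xj * beta[j]))    # partial residual correlation
      new_bj <- soft_threshold(rho, lambda) / sum(w * xj^2)
      delta  <- new_bj - beta[j]
      if (delta != 0) {
        r <- r - xj * delta                      # keep residuals in sync
        beta[j] <- new_bj
        max_change <- max(max_change, abs(delta))
      }
    }
    if (max_change < thr) break                  # stop when updates are tiny
  }
  beta
}

# Simulated data: only the first covariate has an effect
set.seed(1)
X <- matrix(rnorm(200 * 5), ncol = 5)
z <- 2 * X[, 1] + rnorm(200, sd = 0.1)
w <- rep(1, 200)

# Smallest penalty for which all coefficients are zero
lambda_max <- max(abs(crossprod(X, w * z)))
b_start <- weighted_lasso_cd(X, z, w, lambda_max)
# Warm start: reuse the previous solution for a smaller penalty
b_small <- weighted_lasso_cd(X, z, w, lambda_max * 1e-04, beta = b_start)
```

Starting each fit from the solution at the previous penalty value usually means only a few sweeps are needed per value, which is the motivation for warm starts along a path.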

References

Avalos, M., Pouyes, H., Grandvalet, Y., Orriols, L., & Lagarde, E. (2015). Sparse conditional logistic regression for analyzing large-scale matched data from epidemiological studies: a simple algorithm. BMC Bioinformatics, 16(6), S1. doi:10.1186/1471-2105-16-S6-S1.

Examples

# NOT RUN {
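# load the package
library(clogitLasso)

# generate data: 100 matched pairs, each with one case (1) and one control (0)
y <- rep(c(1, 0), 100)
X <- matrix(rnorm(20000, 0, 1), ncol = 100) # pure noise: 200 observations x 100 covariates
strata <- rep(1:100, each = 2)

# fit the lasso path for 1:1 matched data
fitLasso <- clogitLasso(X, y, strata, log = TRUE)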
# }
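
A variant with a real signal may be easier to interpret than pure noise. The sketch below simulates 1:1 matched pairs in which the case within each pair is drawn according to the conditional logistic model; beta_true and the other settings are arbitrary choices for illustration. The call to clogitLasso uses only arguments listed under Usage and is skipped when the package is not installed.

```r
set.seed(1)
n_strata  <- 100
p         <- 20
beta_true <- c(1.5, -1.5, rep(0, p - 2))   # arbitrary illustrative effects

# Two observations per stratum
X      <- matrix(rnorm(2 * n_strata * p), ncol = p)
strata <- rep(seq_len(n_strata), each = 2)
eta    <- drop(X %*% beta_true)

# Within each pair, the first member is the case with probability
# exp(eta1) / (exp(eta1) + exp(eta2))
first   <- seq(1, 2 * n_strata, by = 2)
p_first <- 1 / (1 + exp(eta[first + 1] - eta[first]))
first_is_case <- rbinom(n_strata, 1, p_first)

y <- integer(2 * n_strata)
y[first]     <- first_is_case
y[first + 1] <- 1L - first_is_case

# Fit only if the package is available
if (requireNamespace("clogitLasso", quietly = TRUE)) {
  fit <- clogitLasso::clogitLasso(X, y, strata, log = TRUE, trace = FALSE)
  dim(fit$beta)
}
```

With effects of this size, the first two covariates would be expected to enter the path before most of the noise covariates, although this depends on the random draw.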
