oscar: OSCAR Penalty

Description

Creates an object of the penalty class that represents the OSCAR penalty (Bondell & Reich, 2008).

Usage

oscar (lambda = NULL, ...)

Arguments

lambda

two-dimensional tuning parameter. The first component is the regularization parameter $\lambda$, which controls how strongly the OSCAR penalty enters likelihood inference. The second component is $c$ (see Details below). Both must be nonnegative.

...

further arguments

Value

An object of the class penalty. This is a list with the elements:
penalty (character): the penalty name.
lambda (double): the (nonnegative) regularization parameter.
first.derivative (function): This returns the J-dimensional vector of the first derivative of the J penalty terms with respect to $|\mathbf{a}^\top_j\boldsymbol{\beta}|$.
a.coefs (function): This returns the p-dimensional coefficient vector $\mathbf{a}_j$ of the J penalty terms.

Details

Bondell & Reich (2008) propose a shrinkage method for linear models called OSCAR that simultaneously selects variables and groups them into predictive clusters. The OSCAR penalty is defined as $$P_{\tilde{\lambda}}^{osc}(\boldsymbol{\beta}) = \lambda\left( \sum_{k=1}^p |\beta_k| + c \sum_{j < k} \max\{|\beta_j|, |\beta_k|\} \right), \quad \tilde{\lambda} = (\lambda, c)$$ where $c \geq 0$ and $\lambda > 0$ are tuning parameters, with $c$ controlling the relative weighting of the $L_\infty$-norms and $\lambda$ controlling the magnitude of penalization. The $L_1$-norm entails sparsity, while the pairwise maximum ($L_\infty$-)norm encourages equality of coefficients.
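
As a small worked illustration of this definition, with values chosen arbitrarily here (they are not taken from Bondell & Reich, 2008), let $p = 3$, $\boldsymbol{\beta} = (2, -1, 0)^\top$, $\lambda = 1$ and $c = 0.5$. Then $$\sum_{k=1}^{3} |\beta_k| = 2 + 1 + 0 = 3, \qquad \sum_{j < k} \max\{|\beta_j|, |\beta_k|\} = \max\{2, 1\} + \max\{2, 0\} + \max\{1, 0\} = 5,$$ so that $$P_{\tilde{\lambda}}^{osc}(\boldsymbol{\beta}) = 1 \cdot (3 + 0.5 \cdot 5) = 5.5.$$ The largest absolute coefficient enters two of the three pairwise maxima, which is why large coefficients are penalized more heavily than under the $L_1$-norm alone.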

Following equation (3) in Bondell & Reich (2008), we use the equivalent formulation $$P_{\tilde{\lambda}}^{osc}(\boldsymbol{\beta}) = \lambda \sum_{j=1}^p \{c(j-1) + 1\}|\beta|_{(j)},$$ where $|\beta|_{(1)} \leq |\beta|_{(2)} \leq \ldots \leq |\beta|_{(p)}$ denote the ordered absolute values of the coefficients. This formulation can cause difficulties in the LQA algorithm, because it requires an ordering of the regressors, and that ordering can differ between two adjacent iterations. In the worst case, this leads to oscillations and hence to non-convergence of the algorithm. For the OSCAR penalty it is therefore recommended to use $\gamma < 1$, e.g. $\gamma = 0.01$, when applying lqa.update2 to fit the GLM, in order to facilitate convergence.
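
The agreement between the pairwise-maximum definition and the ordered formulation can be checked numerically. The following base R code is only a minimal sketch: the helper names oscar_pairwise and oscar_ordered and the argument name c_weight are hypothetical and are not part of the package, and the example values are arbitrary.

```r
# Pairwise form: lambda * (sum_k |b_k| + c * sum_{j<k} max(|b_j|, |b_k|)).
# (Hypothetical helper, not part of the package.)
oscar_pairwise <- function(beta, lambda, c_weight) {
  a <- abs(beta)
  p <- length(a)
  pair_max <- 0
  if (p > 1) {
    for (j in 1:(p - 1)) {
      for (k in (j + 1):p) {
        pair_max <- pair_max + max(a[j], a[k])
      }
    }
  }
  lambda * (sum(a) + c_weight * pair_max)
}

# Ordered form: lambda * sum_j {c (j - 1) + 1} * |b|_(j),
# with absolute values sorted in ascending order.
# (Hypothetical helper, not part of the package.)
oscar_ordered <- function(beta, lambda, c_weight) {
  a <- sort(abs(beta))
  w <- c_weight * (seq_along(a) - 1) + 1
  lambda * sum(w * a)
}

beta <- c(0.5, -2, 0, 1.5)   # arbitrary example coefficients
lambda <- 0.7                # arbitrary example value
c_weight <- 0.3              # arbitrary example value

p_pair <- oscar_pairwise(beta, lambda, c_weight)
p_ord  <- oscar_ordered(beta, lambda, c_weight)
print(all.equal(p_pair, p_ord))
```

In the ordered form, the weight on the j-th smallest absolute coefficient grows linearly in j, so the weights attached to individual regressors change whenever two coefficients swap rank. This is the source of the ordering instability described above.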

References

Bondell, H. D. & B. J. Reich (2008) Simultaneous regression shrinkage, variable selection, and supervised clustering of predictors with OSCAR. Biometrics 64, 115–123.