bic.cvplogistic: Tuning parameter selection by BIC criteria for a concave penalized logistic regression

Description

Tuning parameter (kappa, lambda) selection using the BIC criterion for a concave penalized logistic regression. Only the models with df

Usage

bic.cvplogistic(y, x, penalty = "mcp", approach = "mmcd",
path = "kappa", nkappa = 10, maxkappa = 0.249,
nlambda = 100, minlambda = 0.01,
epsilon = 1e-3, maxit = 1e+3)

Arguments

Value

A list of four elements is returned.

sbic: the BIC value corresponding to the chosen tuning parameters.

slambda: the chosen penalty parameter.

skappa: the chosen regularization parameter.

scoef: the coefficients of variables in x corresponding to the chosen tuning parameter.
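As a minimal sketch of how the returned list might be inspected, the snippet below collects the four documented elements into a compact summary. The helper name `summarize_selection` is hypothetical and not part of the package. The call to `bic.cvplogistic` only runs if the cvplogistic package is installed, and the nonzero count assumes that `scoef` is a numeric vector.

```r
# Hypothetical helper (not part of cvplogistic): summarise the list
# returned by bic.cvplogistic(), using only the documented element names.
summarize_selection <- function(out) {
  list(
    bic = out$sbic,                   # BIC at the chosen tuning parameters
    lambda = out$slambda,             # chosen penalty parameter
    kappa = out$skappa,               # chosen regularization parameter
    n_nonzero = sum(out$scoef != 0)   # assumes scoef is a numeric vector
  )
}

# Only attempt the fit when the package is available.
if (requireNamespace("cvplogistic", quietly = TRUE)) {
  set.seed(10000)
  n <- 100
  p <- 50
  y <- rbinom(n, 1, 0.4)
  x <- matrix(rnorm(n * p), n, p)
  out <- cvplogistic::bic.cvplogistic(y, x, penalty = "mcp", approach = "mmcd",
                                      path = "kappa", nkappa = 5,
                                      maxkappa = 0.249, nlambda = 20)
  str(summarize_selection(out))
}
```

Counting the nonzero entries of `scoef` gives a quick view of the selected model size, which is the quantity the Details section uses when comparing the solution paths.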

Rdversion

2.0

Details

The package implements the majorization minimization by coordinate descent (MMCD) algorithm for computing the solution surface of a concave penalized logistic regression model with high-dimensional data. The MMCD algorithm seeks a closed-form solution for each coordinate and majorizes the loss function to avoid computing scaling factors. The algorithm is efficient and stable for high-dimensional data with p>>n.

The package provides three ways to compute the solution surface for a concave penalized logistic model.

The first computes along the regularization parameter kappa: for a given penalty parameter lambda, the Lasso solution (kappa=0) is used to initialize the computation of the MCP or SCAD solutions. The second computes along the penalty parameter lambda: for a given regularization parameter kappa, the MCP or SCAD solutions are computed along lambda. The solution surface computed along kappa tends to perform better in terms of model size and false discovery rate, so the solution surface along kappa is recommended.

The third is the hybrid algorithm, designed specifically for applications that aim to identify the leading causal predictors. In most cases it achieves the same predictive performance as the solution surface along kappa, and it can be viewed as a variant of that approach. The hybrid algorithm also uses the Lasso solution (kappa=0) as the initial values, but it applies the MMCD algorithm only to the variables selected by Lasso. Using Lasso to pre-screen the variables greatly reduces the computational burden. However, if Lasso misses a variable, that variable is necessarily excluded from the final model.

The tuning parameters, namely the regularization parameter kappa and the penalty parameter lambda, are determined by the BIC criterion. The solution corresponding to the chosen tuning parameters is output as the solution for the model. We only consider the models with df
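To illustrate the idea of BIC-based selection in general terms, here is a small sketch that scores a set of candidate logistic models and keeps the one with the smallest BIC. It is not the package's procedure: the candidates are unpenalized nested models fitted with base R's `glm` rather than points on a (kappa, lambda) grid, the helper `bic_logistic` is hypothetical, and the BIC used is the standard definition, -2 log-likelihood + log(n) * df, which may differ from the package's internal definition.

```r
set.seed(1)
n <- 100
p <- 5
x <- matrix(rnorm(n * p), n, p)
# Simulated response: only the first two predictors carry signal.
y <- rbinom(n, 1, plogis(x[, 1] - x[, 2]))

# Hypothetical helper: standard BIC of an unpenalized logistic fit
# that uses the columns of x listed in `cols`.
bic_logistic <- function(y, x, cols) {
  dat <- data.frame(y = y, x[, cols, drop = FALSE])
  fit <- glm(y ~ ., data = dat, family = binomial())
  ll <- as.numeric(logLik(fit))
  df <- length(coef(fit))          # intercept plus selected predictors
  -2 * ll + log(length(y)) * df
}

# Candidate models: the first k predictors, for k = 1, ..., p.
candidates <- lapply(seq_len(p), seq_len)
bic_values <- vapply(candidates,
                     function(cols) bic_logistic(y, x, cols),
                     numeric(1))

# The selected model is the candidate with the smallest BIC.
best <- which.min(bic_values)
print(bic_values)
print(best)
```

In the package the candidates would instead be the penalized fits on the grid of kappa and lambda values, with the chosen pair reported as `skappa` and `slambda`.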

References

Dingfeng Jiang, Jian Huang. Majorization Minimization by Coordinate Descent for Concave Penalized Generalized Linear Models.

Zou, H., Li, R. (2008). One-step Sparse Estimates in Nonconcave Penalized Likelihood Models. Ann Stat, 36(4): 1509-1533.

Breheny, P., Huang, J. (2011). Coordinate Descent Algorithms for Nonconvex Penalized Regression, with Application to Biological Feature Selection. Ann Appl Stat, 5(1): 232-253.

Jiang, D., Huang, J., Zhang, Y. (2011). The Cross-validated AUC for MCP-Logistic Regression with High-dimensional Data. Stat Methods Med Res, online first, Nov 28, 2011.

Examples

set.seed(10000)
n=100
y=rbinom(n,1,0.4)
p=50
x=matrix(rnorm(n*p),n,p)
nkappa=5
maxkappa=0.249
nlambda=20
## MCP penalty
penalty="mcp"
approach="mmcd"
path="kappa"
out=bic.cvplogistic(y, x, penalty, approach, path, nkappa, maxkappa,
nlambda)
path="lambda"
out=bic.cvplogistic(y, x, penalty, approach, path, nkappa, maxkappa, nlambda)
path="hybrid"
out=bic.cvplogistic(y, x, penalty, approach, path, nkappa, maxkappa, nlambda)
approach="adaptive"
path="kappa"
out=bic.cvplogistic(y, x, penalty, approach, path, nkappa, maxkappa, nlambda)
path="lambda"
out=bic.cvplogistic(y, x, penalty, approach, path, nkappa, maxkappa, nlambda)
path="hybrid"
out=bic.cvplogistic(y, x, penalty, approach, path, nkappa, maxkappa, nlambda)
## using LLA approach, path option has no effect.
approach="llacda"
maxkappa=0.99
out=bic.cvplogistic(y, x, penalty, approach, path, nkappa, maxkappa, nlambda)
## SCAD penalty
maxkappa=0.19
penalty="scad"
path="kappa"
out=bic.cvplogistic(y, x, penalty, approach, path, nkappa, maxkappa, nlambda)
path="lambda"
out=bic.cvplogistic(y, x, penalty, approach, path, nkappa, maxkappa, nlambda)
path="hybrid"
out=bic.cvplogistic(y, x, penalty, approach, path, nkappa, maxkappa, nlambda)