
cvplogistic (version 1.0-0)

auc.cvplogistic: Tuning parameter selection by the cross-validated AUC (CV-AUC) criterion for concave penalized logistic regression

Description

Tuning parameter (kappa, lambda) selection using the k-fold cross-validated AUC (CV-AUC) criterion for concave penalized logistic regression. Only the models with df

Usage

auc.cvplogistic(cv=5,y,x,penalty="mcp",path="kappa",nkappa=20,maxkappa=0.249,
nlambda=100,minlambda=ifelse(n>p,0.0001,0.01),
epsilon=1e-3,maxit=1e+4,seed=1000)
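
To make the roles of the penalty parameter lambda and the regularization parameter kappa more concrete, here is a minimal sketch of the MCP penalty itself. It assumes that kappa is the reciprocal of the usual MCP concavity parameter, which is consistent with kappa=0 giving the Lasso penalty. The function name mcp_penalty is hypothetical and is not part of the package.

```r
# Sketch only: MCP penalty under the ASSUMED parameterization kappa = 1/gamma.
# mcp_penalty is an illustrative name, not a cvplogistic function.
mcp_penalty <- function(t, lambda, kappa) {
  a <- abs(t)
  if (kappa == 0) {
    return(lambda * a)  # kappa = 0 reduces to the Lasso (L1) penalty
  }
  # Quadratically relaxed L1 penalty up to lambda/kappa, constant beyond it
  ifelse(a <= lambda / kappa,
         lambda * a - kappa * a^2 / 2,
         lambda^2 / (2 * kappa))
}

# Larger kappa makes the penalty flatten sooner, so large coefficients
# are shrunk less than under Lasso.
mcp_penalty(c(0.5, 2, 10), lambda = 1, kappa = 0)
mcp_penalty(c(0.5, 2, 10), lambda = 1, kappa = 0.2)
```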

Arguments

Value

A list of five elements is returned.

tuning.CVAUC: The CV-AUC value corresponding to the chosen tuning parameter.

tuning.lambda: The chosen penalty parameter.

tuning.kappa: The chosen regularization parameter.

tuning.intercept: The intercept coefficient corresponding to the chosen tuning parameter.

tuning.covariates: The coefficients of the variables in x corresponding to the chosen tuning parameter.

Rdversion

1.1

Details

The package implements the majorization minimization by coordinate descent (MMCD) algorithm for computing the solution surface of a concave penalized logistic regression model for high-dimensional data. The MMCD algorithm seeks a closed-form solution for each coordinate and majorizes the loss function to avoid computing scaling factors. The algorithm is efficient and stable for high-dimensional data with p>>n.

The package provides three ways to compute solution surfaces for a concave penalized logistic model.

The first computes along the regularization parameter kappa: for a given penalty parameter lambda, the Lasso solution (kappa=0) is used to initialize the computation of the MCP or SCAD solutions.

The second computes along the penalty parameter lambda: for a given regularization parameter kappa, the MCP or SCAD solutions are computed along lambda. The solution surface computed along kappa tends to perform better in terms of model size and false discovery rate, so the solution surface along kappa is recommended.

The third is the hybrid algorithm, designed specifically for applications that aim to identify the leading causal predictors. In most cases it achieves the same predictive performance as the solution surface along kappa, and it can be viewed as a variant of that approach. The hybrid algorithm also uses the Lasso solution (kappa=0) as initial values, but it applies the MMCD algorithm only to the variables selected by Lasso. Using Lasso to pre-screen the variables greatly reduces the computational burden. However, if Lasso misses a variable, that variable is necessarily excluded from the final model.

The k-fold cross-validated AUC (CV-AUC) criterion is the average of the predictive AUCs over the k validation datasets generated by the cross-validation process. The solutions for both the full dataset and the cross-validation samples are computed in the way the user specifies. The tuning parameters that maximize the cross-validated AUC are chosen as the optimal tuning parameters. We only consider the models with df
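
The CV-AUC criterion described above can be illustrated with a short base-R sketch. This is not the package's implementation: an unpenalized glm fit stands in for the penalized fit at a single tuning-parameter value, and the helper names auc and cv_auc are hypothetical. In a tuning-parameter search, a quantity like this would be computed for each candidate (kappa, lambda) pair and the pair with the largest value kept.

```r
# AUC via the rank-sum (Mann-Whitney) identity; tied scores get midranks.
auc <- function(score, label) {
  n1 <- sum(label == 1)
  n0 <- sum(label == 0)
  r <- rank(score)
  (sum(r[label == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Average validation AUC over k folds.
# ASSUMPTION: glm() is a stand-in for the penalized logistic fit.
cv_auc <- function(y, x, k = 5, seed = 1000) {
  set.seed(seed)
  fold <- sample(rep(seq_len(k), length.out = length(y)))  # random fold labels
  dat <- data.frame(y = y, x)
  aucs <- sapply(seq_len(k), function(i) {
    train <- fold != i
    fit <- glm(y ~ ., data = dat[train, ], family = binomial)
    score <- predict(fit, newdata = dat[!train, ])  # linear predictor
    auc(score, y[!train])
  })
  mean(aucs)
}

# Small simulated example with one informative predictor
set.seed(1)
n <- 200
x <- matrix(rnorm(n * 3), n, 3)
y <- rbinom(n, 1, plogis(x[, 1]))
cv_auc(y, x, k = 5)
```

Each validation fold must contain both classes for its AUC to be defined, which this sketch does not check.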

References

Dingfeng Jiang, Jian Huang. Majorization Minimization by Coordinate Descent for Concave Penalized Generalized Linear Models.

Dingfeng Jiang, Jian Huang, Ying Zhang. The Cross-Validated AUC for MCP-Logistic Regression with High-dimensional Data. Statistical Methods in Medical Research.

See Also

cvplogistic, aic.cvplogistic, bic.cvplogistic

Examples

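library(cvplogistic)

## Simulated data: n observations, p predictors, binary response
seed=10000   # defined here but not passed below; the calls use the default seed
n=100
y=rbinom(n,1,0.4)
p=50
x=matrix(rnorm(n*p),n,p)

## MCP penalty: small grids keep the example fast
penalty="mcp"
nkappa=5
maxkappa=0.249
nlambda=20
cv=5
## solution surface along kappa
path="kappa"
out=auc.cvplogistic(cv,y,x,penalty,path,nkappa,maxkappa,nlambda)
## solution surface along lambda
path="lambda"
out=auc.cvplogistic(cv,y,x,penalty,path,nkappa,maxkappa,nlambda)
## hybrid algorithm
path="hybrid"
out=auc.cvplogistic(cv,y,x,penalty,path,nkappa,maxkappa,nlambda)

## SCAD penalty: uses a smaller maximum kappa
penalty="scad"
maxkappa=0.19
path="kappa"
out=auc.cvplogistic(cv,y,x,penalty,path,nkappa,maxkappa,nlambda)
path="lambda"
out=auc.cvplogistic(cv,y,x,penalty,path,nkappa,maxkappa,nlambda)
path="hybrid"
out=auc.cvplogistic(cv,y,x,penalty,path,nkappa,maxkappa,nlambda)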
