tuneParam: Tune parameters w and lamda using the CL penalty

Description

Does k-fold cross-validation with the function optimPenalLik and returns the values of lamda and w that maximize the area under the ROC.

Usage

tuneParam(Data, nfolds = nfolds, grid, algorithm = c("hjk", "QN"))

Arguments

Data

a data frame, as a first column should have the response variable y and the other columns the predictors

nfolds

the number of folds used for cross-validation. OBS! nfolds>=2

grid

a grid (data frame) with values of lamda and w that will be used for tuning to tune the model. It is created by expand.grid see example below

algorithm

choose between BFGS ("QN") and hjk (Hooke-Jeeves optimization free) to be used for optmization

Value

A matrix with the following: the average (over folds) cross-validated AUC, the totalVariables selected on the training set, and the standard deviation of the AUC over the nfolds.

Details

It supports the BFGS optimization method ('QN') from the optim stats function, the Hooke-Jeeves derivative-free minimization algorithm ('hjk') The value of lamda and w that yield the maximum AUC on the cross-validating data set is selected.

Examples

Run this code

# NOT RUN {
set.seed(14)
beta    <- c(3, 2, -1.6, -4)
noise   <- 5
simData <- SimData(N=100, beta=beta, noise=noise, corr=TRUE)

nfolds  <- 3
grid <- expand.grid(w = c( 0.3, 0.7),
                   lamda = c(1.5))

before <- Sys.time()
paramCV <- tuneParam(simData, nfolds, grid, algorithm=c("QN"))
(totalTime <- Sys.time() - before)


maxAUC    <- paramCV[which.max(paramCV$AUC),]$AUC
allmaxAUC <- paramCV[which(paramCV$AUC==maxAUC),] # checks if the value of AUC
# is unique; if is not unique then it will take the combination of lamda and
# w where lamda has the largest value- thus achieving higher sparsity

runQN   <- optimPenaLik(simData, lamda= allmaxAUC[nrow(allmaxAUC),]$lamda,
                         w= allmaxAUC[nrow(allmaxAUC),]$w,
                         algorithms=c("QN"))
(coefQN  <- runQN$varQN)


# check the robustness of the choice of lamda

runQN2   <- optimPenaLik(simData, lamda= allmaxAUC[1,]$lamda,
                         w= allmaxAUC[1,]$w,
                         algorithms=c("QN"))
(coefQN2  <- runQN2$varQN)

# }

Run the code above in your browser using DataLab