roclab (version 0.1.4)

cv.kroclearn: Cross-validation for kernel models

Description

Perform k-fold cross-validation over a sequence of \(\lambda\) values and select the optimal model based on AUC.
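As a reminder of what the selection criterion measures, the AUC of a set of decision scores can be computed with the rank-based (Mann-Whitney) formula. The following is a minimal base-R sketch, not roclab's internal code; `empirical_auc` is a hypothetical helper name and the scores are made up for illustration.

```r
# Empirical AUC: fraction of (positive, negative) pairs in which the
# positive case receives the higher score; ties count as 1/2.
# `empirical_auc` is a hypothetical helper, not part of roclab.
empirical_auc <- function(score, y) {
  pos <- score[y == 1]
  neg <- score[y == -1]
  stopifnot(length(pos) > 0, length(neg) > 0)
  gaps <- outer(pos, neg, "-")            # all positive-minus-negative gaps
  mean((gaps > 0) + 0.5 * (gaps == 0))
}

score <- c(2.1, 0.3, -0.4, 1.2, -1.5)     # illustrative decision scores
y     <- c(1,   1,   -1,   -1,  -1)       # labels in {-1, 1}
empirical_auc(score, y)                   # 5 of 6 pairs are ordered correctly
```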

Usage

cv.kroclearn(
  X,
  y,
  lambda.vec = NULL,
  lambda.length = 30,
  kernel = "radial",
  param.kernel = NULL,
  loss = "hinge",
  approx = NULL,
  intercept = TRUE,
  nfolds = 10,
  target.perf = list(),
  param.convergence = list()
)

Value

An object of class "cv.kroclearn" with:

  • optimal.lambda — selected \(\lambda\).

  • optimal.fit — model trained at optimal.lambda.

  • lambda.vec — grid of penalty values considered.

  • auc.mean, auc.sd — mean and sd of cross-validated AUC.

  • auc.result — fold-by-lambda AUC matrix.

  • time.mean, time.sd — mean and sd of training time.

  • time.result — fold-by-lambda training time matrix.

  • nfolds, loss, kernel — settings.
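The summary components can be related to the fold-by-lambda matrix with a few lines of base R. This sketch assumes that rows of auc.result index folds and columns index lambda values, and that the selected lambda is the one with the highest mean AUC; neither detail is confirmed above, and the numbers are invented for illustration.

```r
# Hypothetical 3-fold x 4-lambda AUC matrix (rows = folds, cols = lambdas).
# Values are made up; a real matrix would come from a fitted cv object.
lambda.vec <- c(1, 0.1, 0.01, 0.001)
auc.result <- rbind(
  c(0.70, 0.82, 0.85, 0.80),
  c(0.68, 0.80, 0.88, 0.79),
  c(0.72, 0.84, 0.83, 0.81)
)

auc.mean <- colMeans(auc.result)        # mean AUC per lambda
auc.sd   <- apply(auc.result, 2, sd)    # sd of AUC per lambda
best     <- which.max(auc.mean)         # assumed rule: highest mean AUC
lambda.vec[best]
```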

Arguments

X

Predictor matrix or data.frame (categorical variables are automatically one-hot encoded).
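To see what one-hot encoding does to the number of predictors, the base-R sketch below expands a factor column with model.matrix. It only illustrates the idea; how roclab encodes factors internally (for example, whether a reference level is dropped) is not stated here, and the data frame is hypothetical.

```r
# Hypothetical data with one numeric and one categorical predictor.
df <- data.frame(
  x1  = c(0.5, 1.2, -0.3, 2.0),
  grp = factor(c("a", "b", "c", "a"))
)

# Full indicator coding for `grp` (one column per level), no intercept column.
X_num <- model.matrix(
  ~ . - 1,
  data = df,
  contrasts.arg = list(grp = contrasts(df$grp, contrasts = FALSE))
)
colnames(X_num)
ncol(X_num)   # number of predictors after encoding
```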

y

Response vector with class labels in {-1, 1}. Labels given as {0, 1} or as a two-level factor/character are automatically converted to this format.
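The conversions described above can be pictured with a small base-R sketch. `to_pm1` is a hypothetical helper, not a roclab function, and the choice of which factor level maps to +1 is an assumption made here for illustration only.

```r
# Hypothetical helper mirroring the documented label conversions.
to_pm1 <- function(y) {
  if (is.factor(y) || is.character(y)) {
    y <- factor(y)
    stopifnot(nlevels(y) == 2)
    # Assumption: the second level (alphabetical for characters) becomes +1.
    return(ifelse(as.integer(y) == 2, 1, -1))
  }
  if (all(y %in% c(0, 1))) {
    return(ifelse(y == 1, 1, -1))         # {0, 1} -> {-1, 1}
  }
  stopifnot(all(y %in% c(-1, 1)))         # already in the target format
  y
}

to_pm1(c(0, 1, 1, 0))
to_pm1(c("no", "yes", "no"))
to_pm1(c(-1, 1, -1))
```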

lambda.vec

Optional numeric vector of regularization parameters (\(\lambda\) values). If NULL (default), a decreasing sequence is generated automatically.

lambda.length

Number of \(\lambda\) values to generate if lambda.vec is NULL. Default is 30.
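A custom grid for lambda.vec is often built on a log scale, as in the example at the end of this page. The sketch below builds a decreasing log-spaced grid of length 30; the endpoints 5 and 0.01 are assumptions chosen for illustration, and the range and spacing of the automatically generated sequence may differ.

```r
lambda.length <- 30

# Assumed endpoints; the automatic grid picks its own range.
lambda.max <- 5
lambda.min <- 0.01

# Decreasing sequence, equally spaced on the log scale.
lambda.vec <- exp(seq(log(lambda.max), log(lambda.min),
                      length.out = lambda.length))
range(lambda.vec)
```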

kernel

Kernel type: "radial" (default), "polynomial", "linear", or "laplace".

param.kernel

Kernel-specific parameter:

  • \(\sigma\) for "radial" and "laplace" kernels (default \(1/p\), where \(p\) is the number of predictors after preprocessing, i.e., after categorical variables are one-hot encoded).

  • Degree for "polynomial" kernel (default 2).

  • Ignored for "linear" kernel.
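To illustrate the default \(\sigma = 1/p\), the sketch below computes a radial kernel matrix in base R. It assumes the parametrization \(k(x, z) = \exp(-\sigma \lVert x - z \rVert^2)\); the exact form used by roclab is not given on this page, and `rbf_kernel` is a hypothetical helper.

```r
# Hypothetical helper; assumes k(x, z) = exp(-sigma * ||x - z||^2).
rbf_kernel <- function(X, sigma = 1 / ncol(X)) {
  d2 <- as.matrix(dist(X))^2              # squared Euclidean distances
  exp(-sigma * d2)
}

set.seed(1)
X <- matrix(rnorm(20), nrow = 5, ncol = 4)  # p = 4, so sigma defaults to 0.25
K <- rbf_kernel(X)
dim(K)
```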

loss

Surrogate loss function type. One of: "hinge" (default), "hinge2" (squared hinge), "logistic", or "exponential".

approx

Logical; whether to use a scalable approximation to accelerate training. If NULL (default), it is set to TRUE when nrow(X) >= 1000 and FALSE otherwise. For details on how the approximation is applied, see the Details section of kroclearn.

intercept

Logical; include an intercept in the model (default TRUE).

nfolds

Number of cross-validation folds (default 10).
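For intuition, k-fold splitting can be sketched in base R as below. This is a generic random assignment with balanced fold sizes; whether roclab stratifies folds by class or assigns them differently is not stated on this page.

```r
set.seed(42)
n      <- 25
nfolds <- 10

# Balanced random assignment: fold sizes differ by at most one.
fold_id <- sample(rep(seq_len(nfolds), length.out = n))

# Each fold serves once as the held-out set.
for (k in seq_len(nfolds)) {
  test_idx  <- which(fold_id == k)
  train_idx <- which(fold_id != k)
  # fit on train_idx and evaluate AUC on test_idx here
}
table(fold_id)
```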

target.perf

List with target sensitivity and specificity used when estimating the intercept (defaults to 0.9 each).

param.convergence

List of convergence controls (e.g., maxiter, eps). Default is list(maxiter = 5e4, eps = 1e-4).

See Also

kroclearn

Examples

library(roclab)

set.seed(123)
n <- 100
r <- sqrt(runif(n, 0.05, 1))
theta <- runif(n, 0, 2*pi)
X <- cbind(r * cos(theta), r * sin(theta))
y <- ifelse(r < 0.5, 1, -1)

cvfit <- cv.kroclearn(
  X, y,
  lambda.vec = exp(seq(log(0.01), log(5), length.out = 3)),
  kernel = "radial",
  approx = TRUE, nfolds = 2
)
cvfit$optimal.lambda
