cv.dcsvm: Cross-Validation for Sparse Density-Convoluted SVM

Description

Performs cross-validation for the sparse density-convoluted SVM to estimate the optimal tuning parameter lambda.

Usage

cv.dcsvm(x, y, lambda = NULL, hval = 1, 
  pred.loss = c("misclass", "loss"), nfolds = 5, foldid, ...)

Value

A cv.dcsvm object is returned, which includes the cross-validation fit:

lambda: The lambda sequence used in dcsvm.
cvm: A vector of length length(lambda) for the mean cross-validated error.
cvsd: A vector of length length(lambda) for estimates of standard error of cvm.
cvupper: The upper curve: cvm + cvsd.
cvlower: The lower curve: cvm - cvsd.
nzero: Number of non-zero coefficients at each lambda.
name: "Mis-classification error", for plotting purposes.
dcsvm.fit: A fitted dcsvm object using the full data.
lambda.min: The lambda incurring the minimum cross-validation error cvm.
lambda.1se: The largest value of lambda such that the cross-validation error is within one standard error of the minimum.
cv.min: The minimum cross-validation error.
cv.1se: The cross-validation error associated with lambda.1se.
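
As a rough illustration of how the selection fields above relate to one another, the following base-R sketch applies the usual one-standard-error rule to made-up numbers. The values of lambda, cvm, and cvsd are invented for this example and are not output from cv.dcsvm; the package's internal computation may differ in detail.

```r
# Made-up cross-validation summary for five lambda values (illustration only).
lambda <- c(1.00, 0.50, 0.25, 0.10, 0.05)
cvm    <- c(0.40, 0.20, 0.18, 0.23, 0.25)   # mean CV error per lambda
cvsd   <- c(0.05, 0.05, 0.03, 0.04, 0.05)   # standard error per lambda

# Upper and lower curves, as described for cvupper and cvlower.
cvupper <- cvm + cvsd
cvlower <- cvm - cvsd

# lambda.min: the lambda with the smallest mean CV error.
idx.min    <- which.min(cvm)
lambda.min <- lambda[idx.min]

# lambda.1se: the largest lambda whose error does not exceed
# the minimum error plus one standard error.
threshold  <- cvupper[idx.min]
lambda.1se <- max(lambda[cvm <= threshold])

# Errors associated with the two choices.
cv.min <- cvm[idx.min]
cv.1se <- cvm[lambda == lambda.1se]
```

With these numbers lambda.1se is larger than lambda.min, so it corresponds to a more heavily penalized fit whose error is still within one standard error of the best one.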

Arguments

x: A matrix of predictors, i.e., the x matrix used in dcsvm.
y: A vector of binary class labels, i.e., the y used in dcsvm.
lambda: Default is NULL, in which case the sequence generated by dcsvm is used. Users can also supply their own lambda sequence for cross-validation.
hval: The bandwidth parameter for kernel smoothing. Default is 1.
pred.loss: "misclass" for classification error, "loss" for the density-convoluted SVM loss.
nfolds: The number of folds. Default is 5. The allowable range is from 3 to the sample size. Larger nfolds increases computational time.
foldid: An optional vector with values between 1 and nfolds, giving the fold index of each observation. If supplied, nfolds can be missing.
...: Other arguments that can be passed to dcsvm.
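
One way to build a foldid vector by hand is sketched below in base R. The sample size n is a hypothetical value chosen for the example; in practice it would be the number of rows of x.

```r
set.seed(1)
n      <- 20   # hypothetical sample size
nfolds <- 5

# Repeat the labels 1..nfolds up to length n, then shuffle them so that
# fold membership is random but fold sizes stay balanced.
foldid <- sample(rep(seq_len(nfolds), length.out = n))

# Each fold holds n / nfolds = 4 observations in this example.
table(foldid)
```

A vector built this way could then be supplied through the foldid argument, for example cv.dcsvm(x, y, foldid = foldid), which keeps the fold assignment identical across repeated runs.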

Details

Conducts a k-fold cross-validation for dcsvm and returns the suggested values of the L1 parameter lambda.

This function refits dcsvm with each fold excluded in turn, evaluates each fit on the held-out fold, and then computes the mean cross-validation error and its standard error for every value of lambda. It is adapted from the cv functions in the gcdnet and glmnet packages.
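
The fold-by-fold procedure can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: fit_fun and predict_fun are hypothetical stand-ins (a nearest-centroid classifier) used in place of dcsvm, the data are simulated, only a single model is evaluated rather than a whole lambda sequence, and the standard error is computed from fold-level errors, which is one common convention and may differ from what cv.dcsvm does internally.

```r
set.seed(1)
n <- 60
x <- matrix(rnorm(n * 2), n, 2)
y <- ifelse(x[, 1] + rnorm(n, sd = 0.5) > 0, 1, -1)   # toy labels in {-1, 1}

# Hypothetical stand-in for the model fit: store the two class centroids.
fit_fun <- function(x, y) {
  list(pos = colMeans(x[y == 1, , drop = FALSE]),
       neg = colMeans(x[y == -1, , drop = FALSE]))
}

# Hypothetical stand-in for prediction: assign each row to the nearer centroid.
predict_fun <- function(fit, newx) {
  d_pos <- rowSums(sweep(newx, 2, fit$pos)^2)
  d_neg <- rowSums(sweep(newx, 2, fit$neg)^2)
  ifelse(d_pos < d_neg, 1, -1)
}

nfolds <- 5
foldid <- sample(rep(seq_len(nfolds), length.out = n))

# Exclude each fold in turn, fit on the rest, and score the held-out fold.
fold_err <- numeric(nfolds)
for (k in seq_len(nfolds)) {
  held_out <- foldid == k
  fit  <- fit_fun(x[!held_out, , drop = FALSE], y[!held_out])
  pred <- predict_fun(fit, x[held_out, , drop = FALSE])
  fold_err[k] <- mean(pred != y[held_out])   # misclassification rate
}

# Mean cross-validated error and its standard error across folds.
cvm  <- mean(fold_err)
cvsd <- sd(fold_err) / sqrt(nfolds)
```

Repeating the same loop for every value in a lambda sequence would give vectors analogous to cvm and cvsd in the returned object.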

Examples

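data(colon)
colon$x <- colon$x[, 1:100]  # Use only the first 100 columns for this example
n <- nrow(colon$x)
set.seed(1)
id <- sample(n, trunc(n / 3))  # Hold out about one third of the rows for testing
# Cross-validate on the remaining rows; lam2 is passed through ... to dcsvm
cvfit <- cv.dcsvm(colon$x[-id, ], colon$y[-id], lam2 = 1, nfolds = 5)
plot(cvfit)
# Predict the held-out rows at the lambda with minimum cross-validation error
predict(cvfit, newx = colon$x[id, ], s = "lambda.min")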