SwarmSVM (version 0.1)

dcSVM: Divide-and-Conquer kernel SVM (DC-SVM)

Description

Implementation of Divide-and-Conquer kernel SVM (DC-SVM) by Cho-Jui Hsieh, Si Si, and Inderjit S. Dhillon
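
The divide-and-conquer idea can be illustrated with a simplified, single-level sketch: split the training data into clusters, fit one local kernel SVM per cluster, and predict each new point with the model of its nearest cluster (the "early prediction" strategy). This is not the SwarmSVM implementation. It assumes the e1071 package is installed, uses the built-in iris data, and the helper names fit_local and predict_early are hypothetical.

```r
# Simplified single-level divide-and-conquer sketch (NOT the dcSVM internals).
library(e1071)

set.seed(0)
x <- as.matrix(iris[, 1:4])
y <- iris$Species

# Divide: partition the training rows into k clusters.
k <- 2
km <- kmeans(x, centers = k)

# Conquer: fit one Gaussian-kernel SVM per cluster.
# A cluster that contains a single class is stored as a constant label.
fit_local <- function(xs, ys) {
  if (length(unique(ys)) == 1) return(as.character(ys[1]))
  svm(xs, droplevels(ys), kernel = "radial")
}
local_models <- lapply(seq_len(k), function(j) {
  idx <- km$cluster == j
  fit_local(x[idx, , drop = FALSE], y[idx])
})

# Early prediction: route each row to its nearest cluster center
# and use that cluster's local model.
predict_early <- function(newx) {
  d <- matrix(0, nrow(newx), k)
  for (j in seq_len(k)) {
    d[, j] <- rowSums(sweep(newx, 2, km$centers[j, ])^2)
  }
  cl <- apply(d, 1, which.min)
  pred <- character(nrow(newx))
  for (j in unique(cl)) {
    idx <- cl == j
    m <- local_models[[j]]
    pred[idx] <- if (is.character(m)) m else
      as.character(predict(m, newx[idx, , drop = FALSE]))
  }
  factor(pred, levels = levels(y))
}

pred <- predict_early(x)
mean(pred == y)  # training accuracy of the local models
```

The full method repeats this over several levels and uses the local solutions to warm-start the next level, which the sketch leaves out.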

Usage

dcSVM(x, y, k = 4, m, kernel = 3, max.levels, early = 0, final.training = FALSE, pre.scale = FALSE, seed = NULL, verbose = TRUE, valid.x = NULL, valid.y = NULL, valid.metric = NULL, cluster.method = "kmeans", cluster.fun = NULL, cluster.predict = NULL, ...)

Arguments

x
the n x p training data matrix. Can be a dense matrix or a sparse matrix object.
y
a response vector for prediction tasks with one value for each of the n rows of x. For classification, the values correspond to class labels and can be a 1xn matrix, a simple vector or a factor. For regression, the values correspond to the values to predict, and can be a 1xn matrix or a simple vector.
k
the number of sub-problems the data is divided into
m
the number of samples used for kernel k-means
kernel
the kernel type: 1 for linear, 2 for polynomial, 3 for Gaussian
max.levels
the maximum number of levels
early
whether to use early prediction
final.training
whether to train the SVM over the entire data again. Usually not needed.
pre.scale
either a logical value indicating whether to scale the data, or an integer vector specifying the columns to scale. The data is not scaled again separately inside the SVM.
seed
the random seed. Set it to NULL to randomize the model.
verbose
a logical value indicating whether to print information of training.
valid.x
the mxp validation data matrix.
valid.y
if provided, it will be used to calculate the validation score with valid.metric
valid.metric
the metric function for the validation result. By default it is the accuracy for classification. A custom metric function is also accepted.
cluster.method
The clustering algorithm to use. Possible choices are
  • "kmeans" Algorithm from stats::kmeans
  • "mlKmeans" Algorithm from RcppMLPACK::mlKmeans
  • "kernkmeans" Algorithm from kernlab::kkmeans

If cluster.fun and cluster.predict are both provided, cluster.method is ignored.

cluster.fun
The function that computes cluster labels for the data, given the number of centers. A custom function is accepted, as long as the resulting list contains two fields named cluster and centers.
cluster.predict
The function that predicts cluster labels for the data from a trained clustering object. A custom function is accepted, as long as the resulting list contains two fields named cluster and centers.
...
other parameters passed to e1071::svm
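
A custom clustering pair for cluster.fun and cluster.predict might look like the sketch below. It wraps stats::kmeans and assigns new rows to the nearest center. The function names, the argument names (x, centers, cluster.object) and the call signatures are assumptions, not taken from the package. The return shape of the predict function follows the argument description above and should be checked against the package source before the pair is passed to dcSVM.

```r
# Hypothetical custom clustering pair; names and signatures are assumptions.

# Train: cluster the rows of x into `centers` groups and return a list
# with the two fields the documentation asks for.
my.cluster.fun <- function(x, centers, ...) {
  fit <- stats::kmeans(x, centers = centers, ...)
  list(cluster = fit$cluster, centers = fit$centers)
}

# Predict: assign each row of x to the nearest trained center
# (squared Euclidean distance).
my.cluster.predict <- function(x, cluster.object) {
  ctr <- cluster.object$centers
  d <- sapply(seq_len(nrow(ctr)),
              function(j) rowSums(sweep(x, 2, ctr[j, ])^2))
  d <- matrix(d, nrow = nrow(x))  # keep an n x k matrix even when n or k is 1
  list(cluster = max.col(-d, ties.method = "first"), centers = ctr)
}

# Toy data: two well-separated groups of 20 points in 2 dimensions.
set.seed(1)
toy <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
             matrix(rnorm(40, mean = 5), ncol = 2))
fit <- my.cluster.fun(toy, centers = 2)
pred <- my.cluster.predict(toy, fit)
table(pred$cluster)
```

Keeping the trained centers in the returned list is what lets the predict function work on new data without access to the original clustering call.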

Value

  • svm: a list of svm models if early prediction is used, or a single svm object otherwise.
  • early: whether the early prediction strategy is used
  • cluster.tree: a matrix containing the clustering labels at each level
  • cluster.fun: the clustering training function
  • cluster.predict: the clustering prediction function
  • scale: a list containing scaling information
  • valid.pred: the validation predictions
  • valid.score: the validation score
  • valid.metric: the validation metric
  • time: a list recording the time spent on each step.

Examples

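library(SwarmSVM)

# svmguide1 is a list: element 1 is the training set, element 2 the test set.
data(svmguide1)
svmguide1.t = as.matrix(svmguide1[[2]])
svmguide1 = as.matrix(svmguide1[[1]])

# The first column holds the labels; the remaining columns are the features.
dcsvm.model = dcSVM(x = svmguide1[,-1], y = svmguide1[,1],
                    k = 4, max.levels = 4, seed = 0, cost = 32, gamma = 2,
                    kernel = 3, early = 0, m = 800,
                    valid.x = svmguide1.t[,-1], valid.y = svmguide1.t[,1])

# Compare the validation predictions with the true labels.
preds = dcsvm.model$valid.pred
table(preds, svmguide1.t[,1])
dcsvm.model$valid.score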
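
To relate the confusion table to an accuracy score, the following base R sketch computes accuracy by hand. The label vectors are made-up toy values, not output from dcSVM, and the variable names are illustrative only.

```r
# Toy labels (hypothetical values, for illustration only).
truth <- c(0, 0, 1, 1, 1, 0)
preds <- c(0, 1, 1, 1, 0, 0)

# Rows are predictions, columns are true labels.
tab <- table(preds, truth)

# Accuracy is the share of observations on the diagonal.
# This requires both vectors to contain the same set of labels.
accuracy <- sum(diag(tab)) / sum(tab)

# Equivalent and more robust when a label is missing from one vector.
accuracy.direct <- mean(preds == truth)
```

The second form is the safer choice when a class might never be predicted, because the table would then no longer be square.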