Learn R Programming

SAM (version 1.0.2)

l1svm: Training function of L1 norm SVM

Description

The classifier is learned using training data.

Usage

l1svm(x, y, lambda = NULL, rho = 1, thol = 1e-04, maxIter = 1e4, rescale = T)

Arguments

x
The training dataset represented in a n by d matrix, where n is sample size and d is dimension.
y
The labels of the training dataset, represented as a vector of length n, where n is the sample size. The labels MUST be encoded as either 1 or -1.
lambda
A sequence of decreasing positive numbers controlling the regularization. Typical usage is to leave lambda = NULL and let the program compute its own sequence, lambda = log(d)*c(20:1)/100/n. Users can also supply their own decreasing sequence to override the default.
rho
The penalty parameter used in the optimization algorithm. Use it with care: setting rho too large or too small may cause the algorithm to stop early or to diverge.
thol
Stopping criterion: the maximum relative change of the primal and dual parameters.
maxIter
The maximum number of iterations.
rescale
Rescale all variables in the training dataset to the 0-1 scale if rescale = TRUE. We strongly recommend that users NOT turn it off.
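
The default lambda sequence and the 0-1 rescaling can be sketched in base R. The exact rescaling formula used by the package is an assumption here (min-max scaling, consistent with the x.min and x.max values the function returns):

```r
## Default regularization sequence, as given above:
## lambda = log(d) * c(20:1) / 100 / n
n <- 100; d <- 10
lambda <- log(d) * c(20:1) / 100 / n
head(lambda, 3)  # 20 decreasing positive values

## Min-max rescaling to [0, 1] (assumed form, consistent with
## the x.min/x.max entries returned by l1svm)
x <- matrix(rnorm(n * d), n, d)
x.min <- apply(x, 2, min)
x.max <- apply(x, 2, max)
x01 <- sweep(sweep(x, 2, x.min, "-"), 2, x.max - x.min, "/")
range(x01)  # all values now lie in [0, 1]
```

At test time, the same x.min and x.max from training (not those of the test set) should be applied, which is why both are stored in the fitted object.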

Value

  • rescale: The indicator of whether rescaling was applied.
  • x.min: A vector with each entry corresponding to the minimum of each input variable. (Used for rescaling in testing.)
  • x.max: A vector with each entry corresponding to the maximum of each input variable. (Used for rescaling in testing.)
  • lambda: The sequence of regularization parameters used in training.
  • w: The solution path matrix (d by length of lambda), with each column corresponding to a regularization parameter.
  • b: The solution path of the intercept.
  • df: The degrees of freedom along the solution path (the number of non-zero parameters).
  • lab: Predicted labels, represented as an n by length(lambda) matrix, with each column corresponding to a regularization parameter.
  • dec: Decision values, represented as an n by length(lambda) matrix, with each column corresponding to a regularization parameter.
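
For example, the df entry simply counts the non-zero coefficients in each column of w. A minimal base-R illustration on a toy solution-path matrix (the values here are made up for demonstration, not real l1svm output):

```r
## Toy solution path: 4 variables, 3 regularization parameters
w <- cbind(c(0.8, 0.0, 0.0, 0.0),
           c(0.9, 0.3, 0.0, 0.0),
           c(1.1, 0.4, -0.2, 0.0))
df <- colSums(w != 0)  # degrees of freedom per lambda
df                     # sparsity decreases as lambda shrinks
```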

Details

We adopt the linearized alternating direction method of multipliers (ADMM). The computation is further accelerated by the "warm-start" and "active-set" tricks. In our experience, the optimal regularization parameters may change dramatically across settings, so the default lambda sequence is not guaranteed to give good performance.
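
Because the default sequence may not suit a given problem, one common remedy is to pick lambda by held-out validation. A minimal sketch in base R, assuming `lab` is the n by length(lambda) predicted-label matrix described above (filled here with toy values rather than real l1svm predictions):

```r
## Toy setup: 5 validation labels, predictions under 3 lambdas
y.val <- c(1, -1, 1, 1, -1)
lab <- cbind(c(1, -1, 1, 1, -1),   # lambda 1: perfect
             c(1, 1, 1, 1, -1),    # lambda 2: one error
             c(-1, 1, -1, 1, -1))  # lambda 3: three errors
err  <- colMeans(lab != y.val)     # misclassification rate per lambda
best <- which.min(err)             # index of the best lambda
```

The same idea extends to cross-validation; the key point is that lambda should be selected on data not used for training.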

References

T. Zhao and H. Liu. "Sparse Additive Machine", International Conference on Artificial Intelligence and Statistics, 2012.
P. Bradley and O. Mangasarian. "Feature selection via concave minimization and support vector machines", International Conference on Machine Learning, 1998.

See Also

SAM, spam, plot.l1svm, print.l1svm, predict.l1svm

Examples

## generating training data
x = rbind(0.5+matrix(rnorm(100),50,2),-0.5+matrix(rnorm(100),50,2))
x = cbind(x,matrix(rnorm(800),100,8))

## generating labels
y = c(rep(1,50),rep(-1,50))

## Training
fit = l1svm(x,y)
fit

## plotting solution path
plot(fit)

## generating testing data
xt = rbind(0.5+matrix(rnorm(100),50,2),-0.5+matrix(rnorm(100),50,2))
xt = cbind(xt,matrix(rnorm(800),100,8))

## predicting labels
out = predict(fit,newdata=xt)
