heuristicC
implements a heuristic proposed by Thorsten Joachims to quickly
estimate a suitable value for the C constant used by support vector
machines. This implementation only works for linear support vector machines.
heuristicC(data)
data: an n x p data matrix. Each row stands for an example (sample, point) and each column stands for a dimension (feature, variable).
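A value for the C constant is returned, computed as follows: \(\frac{1}{\frac{1}{n}\sum_{i=1}^{n}\sqrt{G[i,i]}}\), where G = data %*% t(data) is the Gram matrix of the rows of data. Since G[i,i] is the squared Euclidean norm of row i, the returned value is the reciprocal of the mean Euclidean norm of the examples.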
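The formula above can be illustrated with a minimal sketch in base R. The helper name heuristic_c_sketch is hypothetical and is not part of the package; it is not the package's actual source, only a direct transcription of the stated formula that avoids building the full n x n matrix G by computing row norms directly.

```r
# Hypothetical helper (not part of LiblineaR): transcribes the formula
# C = 1 / mean(sqrt(diag(G))) with G = data %*% t(data).
heuristic_c_sketch <- function(data) {
  data <- as.matrix(data)
  # diag(G)[i] equals the sum of squares of row i, so the full
  # Gram matrix never needs to be formed.
  row_norms <- sqrt(rowSums(data^2))
  1 / mean(row_norms)
}

# Two examples with Euclidean norms 5 and 10; the mean norm is 7.5.
x <- matrix(c(3, 4,
              6, 8), nrow = 2, byrow = TRUE)
c_value <- heuristic_c_sketch(x)

# Same quantity computed literally from the Gram matrix, for comparison.
c_from_gram <- 1 / mean(sqrt(diag(x %*% t(x))))
```

On this toy matrix both computations give 1/7.5. The row-norm form is preferable for large n, since the Gram matrix grows quadratically with the number of examples.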
T. Joachims, SVM light (2002), http://svmlight.joachims.org
# NOT RUN {
library(LiblineaR)

data(iris)
x = iris[, 1:4]
y = factor(iris[, 5])
train = sample(1:dim(iris)[1], 100)

xTrain = x[train, ]
xTest = x[-train, ]
yTrain = y[train]
yTest = y[-train]

# Center and scale data
s = scale(xTrain, center = TRUE, scale = TRUE)

# Sparse Logistic Regression
t = 6
co = heuristicC(s)
m = LiblineaR(data = s, labels = yTrain, type = t, cost = co, bias = TRUE, verbose = FALSE)

# Scale the test data with the training center and scale
s2 = scale(xTest, attr(s, "scaled:center"), attr(s, "scaled:scale"))

# Make prediction
p = predict(m, s2)

# Display confusion matrix (rows: predictions, columns: true labels)
res = table(p$predictions, yTest)
print(res)

# Compute Balanced Classification Rate (mean of per-class recall)
BCR = mean(c(res[1, 1] / sum(res[, 1]), res[2, 2] / sum(res[, 2]), res[3, 3] / sum(res[, 3])))
print(BCR)
# }