
quanteda.textmodels (version 0.9.2)

textmodel_svmlin: (faster) Linear SVM classifier for texts

Description

Fit a fast linear SVM classifier for sparse text matrices, using the svmlin C++ code written by Vikas Sindhwani and S. Sathiya Keerthi. This implements the modified finite Newton L2-SVM method (L2-SVM-MFN) described in Sindhwani and Keerthi (2006). Currently, textmodel_svmlin() only works for two-class problems.
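
For orientation, the L2-SVM objective that this family of methods minimizes is usually written as below. This is a sketch of the general form rather than a statement of what the svmlin code computes line by line: scaling constants differ between presentations, and the mapping of the symbols to this function's arguments (lambda to the regularization weight, cp and cn to the per-example costs c_i) is an assumption based on the argument descriptions. Here x_i is the feature vector of document i, y_i is its label coded as -1 or +1, and w is the weight vector being estimated.

```latex
w^{*} = \arg\min_{w} \;
  \frac{\lambda}{2}\,\lVert w \rVert^{2}
  + \frac{1}{2} \sum_{i} c_i \,\max\!\bigl(0,\; 1 - y_i\, w^{\top} x_i\bigr)^{2},
\qquad
c_i =
\begin{cases}
  c_p & \text{if } y_i = +1,\\
  c_n & \text{if } y_i = -1.
\end{cases}
```

The squared hinge loss makes the objective differentiable, which is what allows a Newton-type solver to be applied.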

Usage

textmodel_svmlin(
  x,
  y,
  intercept = TRUE,
  lambda = 1,
  cp = 1,
  cn = 1,
  scale = FALSE,
  center = FALSE
)

Arguments

x

the dfm on which the model will be fit. Does not need to contain only the training documents.

y

vector of training labels associated with each document in x. (These will be converted to a factor if not already a factor.)
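
A minimal base R sketch of how a label vector with missing values can mark which documents are training documents. The label values are hypothetical, and the rule that only documents with a non-missing label are used for fitting is an assumption inferred from the descriptions of x and y, not a confirmed detail of the implementation.

```r
# Hypothetical labels for five documents; NA marks a document with no label.
y <- c("Govt", "Opp", NA, NA, "Govt")

# The argument description says labels are converted to a factor.
y_factor <- factor(y)

# Assumption: rows with a non-missing label form the training set,
# while the remaining rows of x are simply not used for fitting.
train_rows <- which(!is.na(y_factor))
train_rows
```

Under this reading, x and y must have one entry per document, in the same order.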

intercept

logical; if TRUE, add an intercept to the data

lambda

numeric; regularization parameter lambda (default 1)

cp

numeric; Relative cost for "positive" examples (the second factor level)

cn

numeric; Relative cost for "negative" examples (the first factor level)
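
Because cp and cn are tied to factor level order, it can help to check which class ends up as "positive". The base R sketch below uses hypothetical labels and cost values; the per-document cost vector at the end only illustrates the idea of class-specific costs and is not taken from the package source.

```r
# Hypothetical labels; factor() sorts levels alphabetically by default.
y <- factor(c("Opp", "Govt", "Opp", "Govt"))
levels(y)

negative_class <- levels(y)[1]  # first level: the class that cn applies to
positive_class <- levels(y)[2]  # second level: the class that cp applies to

# To treat "Govt" as the positive class, set the level order explicitly.
y_relevel <- factor(y, levels = c("Opp", "Govt"))

# Illustration only: expanding class costs to one cost per document.
cp <- 2
cn <- 1
costs <- ifelse(y == positive_class, cp, cn)
costs
```

Setting the levels explicitly before fitting avoids depending on alphabetical order when assigning unequal costs.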

scale

logical; if TRUE, normalize the feature counts

center

logical; if TRUE, centre the feature counts

Value

a fitted model object of class textmodel_svmlin

References

Vikas Sindhwani and S. Sathiya Keerthi (2006). Large Scale Semi-supervised Linear SVMs. Proceedings of ACM SIGIR. August 6–11, 2006, Seattle.

Vikas Sindhwani and S. Sathiya Keerthi (2006). Newton Methods for Fast Solution of Semi-supervised Linear SVMs. Book chapter in Large Scale Kernel Machines, MIT Press, 2006.

See Also

predict.textmodel_svmlin()

Examples

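# attach the package that provides textmodel_svmlin() and the example corpus
library("quanteda.textmodels")

# use Lenihan for govt class and Bruton for opposition;
# the remaining 12 documents are left unlabelled (NA)
quanteda::docvars(data_corpus_irishbudget2010, "govtopp") <-
    c("Govt", "Opp", rep(NA, 12))
# tokenize first, since recent quanteda versions do not build a dfm directly from a corpus
dfmat <- quanteda::dfm(quanteda::tokens(data_corpus_irishbudget2010))

# fit on the labelled documents, then predict a class for every document
tmod <- textmodel_svmlin(dfmat, y = quanteda::docvars(dfmat, "govtopp"))
predict(tmod)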
