quanteda.textmodels (version 0.9.1)

textmodel_svmlin: (faster) Linear SVM classifier for texts

Description

This function has been retained for testing purposes only; we recommend that you use textmodel_svm instead. That function is more efficient, and implements prediction for more than two classes.

Usage

textmodel_svmlin(x, y, intercept = TRUE, ...)

Arguments

x

the dfm on which the model will be fit. Does not need to contain only the training documents.

y

vector of training labels associated with each document in x. (These will be converted to factors if not already factors.)

intercept

logical; if TRUE, add an intercept to the data

...

additional arguments passed to svmlin
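
To make the y and intercept arguments concrete, here is a minimal base R sketch. It assumes that "adding an intercept to the data" means appending a constant column of ones to the feature matrix, which is a common convention for linear models but is not confirmed here as what textmodel_svmlin does internally. The matrix X, its values, and the name X_int are hypothetical and purely illustrative.

```r
# Toy document-feature counts: 3 documents, 2 features (made-up values)
X <- matrix(c(2, 0,
              1, 3,
              0, 1),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("d1", "d2", "d3"), c("tax", "budget")))

# Assumed convention: a constant column of ones lets a linear
# decision function absorb a bias term as one more weight
X_int <- cbind(X, intercept = 1)

# Character labels become a factor; NA marks a document without a label
y <- factor(c("Govt", "Opp", NA))

print(X_int)
print(levels(y))
```

With intercept = FALSE, the analogous step would simply be skipped, so the fitted decision boundary is forced through the origin of the feature space.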

Value

textmodel_svmlin() returns (for now) an object with the same structure as the object returned by svmlin.

Details

Fit a fast linear SVM classifier for texts, using the R interface to the svmlin code by Vikas Sindhwani and S. Sathiya Keerthi for fast linear transductive SVMs. The call is passed through to svmlin as implemented by the RSSL package.

References

Vikas Sindhwani and S. Sathiya Keerthi (2006). Large Scale Semi-supervised Linear SVMs. Proceedings of ACM SIGIR.

Vikas Sindhwani and S. Sathiya Keerthi (2006). Newton Methods for Fast Solution of Semi-supervised Linear SVMs. In Large Scale Kernel Machines, MIT Press.

See Also

svmlin, textmodel_svm

Examples

library("quanteda.textmodels")

# use Lenihan for govt class and Bruton for opposition;
# the remaining 12 speeches are left unlabeled (NA)
quanteda::docvars(data_corpus_irishbudget2010, "govtopp") <-
    c("Govt", "Opp", rep(NA, 12))
dfmat <- quanteda::dfm(data_corpus_irishbudget2010)

# fit with an intercept (the default); pos_frac is passed through to svmlin via ...
tmod <- textmodel_svmlin(dfmat, y = quanteda::docvars(dfmat, "govtopp"),
                         pos_frac = 5/14)
predict(tmod)

# same model, fit without an intercept
predict(textmodel_svmlin(dfmat, y = quanteda::docvars(dfmat, "govtopp"),
                         intercept = FALSE, pos_frac = 5/14))
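
Because the Description recommends textmodel_svm over this function, the following sketch runs the same labeling task with textmodel_svm. It assumes that quanteda and quanteda.textmodels are installed and that data_corpus_irishbudget2010 is available from quanteda.textmodels; the explicit tokens() step and the object name corp are choices made for this sketch, intended to keep it working across quanteda versions.

```r
library("quanteda")
library("quanteda.textmodels")

# Work on a local copy of the corpus (corp is a name chosen for this sketch)
corp <- data_corpus_irishbudget2010

# Label the first two speeches; leave the other 12 unlabeled (NA)
docvars(corp, "govtopp") <- c("Govt", "Opp", rep(NA, 12))

# Tokenize explicitly, then build the document-feature matrix
dfmat <- dfm(tokens(corp))

# Fit the recommended classifier and predict a class for the documents
tmod <- textmodel_svm(dfmat, y = docvars(dfmat, "govtopp"))
predict(tmod)
```

Unlike the textmodel_svmlin calls above, no pos_frac argument is supplied here, since that argument belongs to svmlin.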
