R/Weka Classifier Functions
R interfaces to Weka regression and classification function learners.
LinearRegression(formula, data, subset, na.action, control = Weka_control(), options = NULL)
Logistic(formula, data, subset, na.action, control = Weka_control(), options = NULL)
SMO(formula, data, subset, na.action, control = Weka_control(), options = NULL)
- formula: a symbolic description of the model to be fit.
- data: an optional data frame containing the variables in the model.
- subset: an optional vector specifying a subset of observations to be used in the fitting process.
- na.action: a function which indicates what should happen when the data contain NAs.
- control: an object of class Weka_control giving options to be passed to the Weka learner. Available options can be obtained on-line using the Weka Option Wizard WOW.
- options: a named list of further options, or NULL (default). See Details.
LinearRegression builds suitable linear regression models,
using the Akaike criterion for model selection.
Logistic builds multinomial logistic regression models based on
ridge estimation (le Cessie and van Houwelingen, 1992).
SMO implements John C. Platt's sequential minimal optimization
algorithm for training a support vector classifier using polynomial or
RBF kernels. Multi-class problems are solved using pairwise classification.
The model formulae should only use the + and - operators to indicate the variables to be included or not used, respectively.
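As an illustration of the + and - operators, the following sketch uses only base R (no Weka call is made) to show which predictors a formula selects. The helper name selected_terms is hypothetical and not part of RWeka; it assumes the learners see the same variables that the standard R formula machinery selects.

```r
# Base R only: list the predictors that a model formula selects.
# `selected_terms` is a hypothetical helper, not an RWeka function.
selected_terms <- function(formula, data) {
  attr(terms(formula, data = data), "term.labels")
}

# Include two named predictors with `+`.
selected_terms(mpg ~ wt + hp, mtcars)

# Start from all other columns (`.`) and exclude `cyl` with `-`.
selected_terms(mpg ~ . - cyl, mtcars)
```

A formula checked this way can then be passed unchanged as the formula argument of LinearRegression, Logistic, or SMO.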
options allows further customization. Currently, the options model and
instances (or partial matches for these) are used: if set to TRUE, the
model frame or the corresponding Weka instances, respectively, are
included in the fitted model object, possibly speeding up subsequent
computations on the object. By default, neither is included.
- A list inheriting from classes Weka_functions and Weka_classifiers with components including:
  - classifier: a reference (of class jobjRef) to a Java object obtained by applying the Weka buildClassifier method to build the specified model using the given control options.
  - predictions: a numeric vector or factor with the model predictions for the training instances (the results of calling the Weka classifyInstance method for the built classifier and each instance).
  - call: the matched call.
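The components listed above can be inspected with ordinary list access. The sketch below is a minimal illustration, not part of the package: summarize_fit is a hypothetical helper, and the fitting step is only run if RWeka (with a working Java installation) is available.

```r
# Hypothetical helper (not part of RWeka): summarise the documented
# components of a fitted object returned by an R/Weka function learner.
summarize_fit <- function(fit) {
  list(
    classifier_class = class(fit$classifier),   # class of the Java reference
    n_predictions    = length(fit$predictions), # one per training instance
    call             = fit$call                 # the matched call
  )
}

# Run the fit only when RWeka can be loaded.
if (requireNamespace("RWeka", quietly = TRUE)) {
  fit <- RWeka::LinearRegression(mpg ~ ., data = mtcars)
  str(summarize_fit(fit))
}
```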
J. C. Platt (1998). Fast training of Support Vector Machines using Sequential Minimal Optimization. In B. Schoelkopf, C. Burges, and A. Smola (eds.), Advances in Kernel Methods --- Support Vector Learning. MIT Press.
I. H. Witten and E. Frank (2005). Data Mining: Practical Machine Learning Tools and Techniques. 2nd Edition, Morgan Kaufmann, San Francisco.
## Linear regression:
## Using standard data set 'mtcars'.
LinearRegression(mpg ~ ., data = mtcars)
## Compare to R:
step(lm(mpg ~ ., data = mtcars), trace = 0)

## Using standard data set 'chickwts'.
LinearRegression(weight ~ feed, data = chickwts)
## (Note the interactions!)

## Logistic regression:
## Using standard data set 'infert'.
STATUS <- factor(infert$case, labels = c("control", "case"))
Logistic(STATUS ~ spontaneous + induced, data = infert)
## Compare to R:
glm(STATUS ~ spontaneous + induced, data = infert, family = binomial())

## Sequential minimal optimization algorithm for training a support
## vector classifier, using an RBF kernel with a non-default gamma
## parameter (argument '-G') instead of the default polynomial kernel
## (from a question on r-help):
SMO(Species ~ ., data = iris,
    control = Weka_control(K = list("weka.classifiers.functions.supportVector.RBFKernel", G = 2)))
## In fact, by some hidden magic it also "works" to give the "base" name
## of the Weka kernel class:
SMO(Species ~ ., data = iris,
    control = Weka_control(K = list("RBFKernel", G = 2)))