Weka_classifier_rules: R/Weka Rule Learners

Description

R interfaces to Weka rule learners.

Usage

JRip(formula, data, subset, na.action,
     control = Weka_control(), options = NULL)
M5Rules(formula, data, subset, na.action,
        control = Weka_control(), options = NULL)
OneR(formula, data, subset, na.action,
     control = Weka_control(), options = NULL)
PART(formula, data, subset, na.action,
     control = Weka_control(), options = NULL)

Arguments

formula

a symbolic description of the model to be fit.

data

an optional data frame containing the variables in the model.

subset

an optional vector specifying a subset of observations to be used in the fitting process.

na.action

a function which indicates what should happen when the data contain NAs. See model.frame for details.

control

an object of class Weka_control giving options to be passed to the Weka learner. Available options can be listed interactively with the Weka Option Wizard WOW, or looked up in the Weka documentation.

options

a named list of further options, or NULL (default). See Details.
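
As a minimal sketch of the control argument, the snippet below lists the options of one learner with WOW and then passes one of them through Weka_control. It assumes that RWeka and a working Java installation are available, and that the -M option of Weka's PART (minimum number of instances per rule) is the one of interest; the value 5 is purely illustrative.

```r
library(RWeka)

# Print the options that the Weka PART learner accepts.
WOW("PART")

# Pass the -M option (minimum number of instances per rule) to Weka.
# The value 5 is an arbitrary choice for illustration.
m_part <- PART(Species ~ ., data = iris,
               control = Weka_control(M = 5))
m_part
```

Option names are given without the leading dash, so Weka_control(M = 5) corresponds to "-M 5" on the Weka side.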

Value

A list inheriting from classes Weka_rules and Weka_classifiers with components including

classifier

a reference (of class jobjRef) to a Java object obtained by applying the Weka buildClassifier method to build the specified model using the given control options.

predictions

a numeric vector or factor with the model predictions for the training instances (the results of calling the Weka classifyInstance method for the built classifier and each instance).

call

the matched call.

Details

A predict method is available for predicting from the fitted models, along with a summary method based on evaluate_Weka_classifier.
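
The following sketch shows how these methods might be used on a fitted rule learner. It assumes RWeka is installed; the choice of JRip, the three selected rows, and the cross-validation settings (numFolds = 10, seed = 1) are illustrative only.

```r
library(RWeka)

# Fit a RIPPER rule set on the iris data.
m <- JRip(Species ~ ., data = iris)

# Hypothetical "new" data: one row from each species.
new_rows <- iris[c(1, 51, 101), ]

# Predicted class labels (the default) and class probabilities.
pred_class <- predict(m, newdata = new_rows)
pred_prob  <- predict(m, newdata = new_rows, type = "probability")

# Evaluation statistics on the training data.
summary(m)

# Calling the evaluator directly allows, for example, cross-validation.
evaluate_Weka_classifier(m, numFolds = 10, seed = 1)
```

Without newdata, predict returns predictions for the training instances.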

JRip implements a propositional rule learner, “Repeated Incremental Pruning to Produce Error Reduction” (RIPPER), as proposed by Cohen (1995).

M5Rules generates a decision list for regression problems using separate-and-conquer. In each iteration it builds a model tree using M5 and makes the “best” leaf into a rule. See Hall, Holmes and Frank (1999) for more information.

OneR builds a simple 1-R classifier; see Holte (1993).

PART generates PART decision lists using the approach of Frank and Witten (1998).

The model formulae should only use the + and - operators to indicate the variables to be included or not used, respectively.
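
A short sketch of formulae in this restricted form, assuming RWeka is installed; the particular variables added or dropped are chosen only for illustration.

```r
library(RWeka)

# Use all remaining columns of the data frame as predictors.
m_all <- OneR(Species ~ ., data = iris)

# Name the predictors explicitly with '+'.
m_two <- PART(Species ~ Sepal.Length + Sepal.Width, data = iris)

# Start from all columns and mark one as not used with '-'.
m_sub <- PART(Species ~ . - Petal.Width, data = iris)
m_sub
```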

Argument options allows further customization. Currently, options model and instances (or partial matches for these) are used: if set to TRUE, the model frame or the corresponding Weka instances, respectively, are included in the fitted model object, possibly speeding up subsequent computations on the object. By default, neither is included.
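
As a sketch of the options argument, the code below fits the same model with and without the extra components and compares the names of the two fitted objects. It assumes RWeka is installed; the names under which the additional components are stored are not stated here, so the comparison simply prints whatever was added.

```r
library(RWeka)

# Default fit: neither the model frame nor the Weka instances are kept.
m_plain <- PART(Species ~ ., data = iris)

# Ask for both to be stored in the fitted object.
m_full <- PART(Species ~ ., data = iris,
               options = list(model = TRUE, instances = TRUE))

# Show which components the second fit carries in addition to the first.
setdiff(names(m_full), names(m_plain))
```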

References

W. W. Cohen (1995). Fast effective rule induction. In A. Prieditis and S. Russell (eds.), Proceedings of the 12th International Conference on Machine Learning, pages 115--123. Morgan Kaufmann. ISBN 1-55860-377-8. http://citeseer.ist.psu.edu/cohen95fast.html

E. Frank and I. H. Witten (1998). Generating accurate rule sets without global optimization. In J. Shavlik (ed.), Machine Learning: Proceedings of the Fifteenth International Conference. Morgan Kaufmann Publishers: San Francisco, CA. http://www.cs.waikato.ac.nz/~eibe/pubs/ML98-57.ps.gz

M. Hall, G. Holmes, and E. Frank (1999). Generating rule sets from model trees. Proceedings of the Twelfth Australian Joint Conference on Artificial Intelligence, Sydney, Australia, pages 1--12. Springer-Verlag. http://citeseer.ist.psu.edu/holmes99generating.html

R. C. Holte (1993). Very simple classification rules perform well on most commonly used datasets. Machine Learning, 11, 63--91. doi:10.1023/A:1022631118932.

I. H. Witten and E. Frank (2005). Data Mining: Practical Machine Learning Tools and Techniques. 2nd Edition, Morgan Kaufmann, San Francisco.

Examples

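library(RWeka)

# Regression rules for a numeric response
M5Rules(mpg ~ ., data = mtcars)

# Decision list for a factor response
m <- PART(Species ~ ., data = iris)
m            # print the rules
summary(m)   # evaluation on the training data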