RWeka (version 0.2-6)

Weka_classifier_trees: R/Weka Classifier Trees

Description

R interfaces to Weka regression and classification tree learners.

Usage

J48(formula, data, subset, na.action, control = Weka_control())
LMT(formula, data, subset, na.action, control = Weka_control())
M5P(formula, data, subset, na.action, control = Weka_control())
DecisionStump(formula, data, subset, na.action, control = Weka_control())

Arguments

formula
a symbolic description of the model to be fit.
data
an optional data frame containing the variables in the model.
subset
an optional vector specifying a subset of observations to be used in the fitting process.
na.action
a function which indicates what should happen when the data contain NAs.
control
an object of class Weka_control giving the options to be passed to the Weka learner. Available options can be obtained on-line using the Weka Option Wizard (WOW), or from the Weka documentation.
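
As a minimal sketch of how the control argument is typically used: WOW lists the options a learner accepts, and Weka_control passes them on by name. The option letters below (M for the minimum number of instances per leaf, C for the pruning confidence) are J48's Weka options; the values chosen are arbitrary illustrations, not recommendations.

```r
library(RWeka)

## List the options that the Weka J48 learner understands.
WOW("J48")

## Build a control object: at least 5 instances per leaf (-M 5) and a
## pruning confidence of 0.1 (-C 0.1). Both values are illustrative only.
ctrl <- Weka_control(M = 5, C = 0.1)

## Fit a J48 tree on the iris data using these options.
data("iris")
m <- J48(Species ~ ., data = iris, control = ctrl)
m
```

Logical flags are switched on with TRUE, for example Weka_control(U = TRUE) to request an unpruned J48 tree.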

Value

A list inheriting from classes Weka_tree and Weka_classifiers with components including

classifier
a reference (of class jobjRef) to a Java object obtained by applying the Weka buildClassifier method to build the specified model using the given control options.
predictions
a numeric vector or factor with the model predictions for the training instances (the results of calling the Weka classifyInstance method for the built classifier and each instance).
call
the matched call.

Details

There is a predict method for predicting from the fitted models.
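
The following sketch shows one way the predict method might be used on held-out data. It assumes the method accepts a newdata argument and a type argument with the values "class" and "probability", as in current RWeka releases; the train/test split and the seed are arbitrary choices made for illustration.

```r
library(RWeka)

data("iris")

## Hold out 50 observations for testing (arbitrary split for illustration).
set.seed(1)
idx   <- sample(nrow(iris), 100)
train <- iris[idx, ]
test  <- iris[-idx, ]

fit <- J48(Species ~ ., data = train)

## Predicted classes for the held-out observations.
pred_class <- predict(fit, newdata = test)

## Class membership probabilities, one column per class level
## (assumes type = "probability" is supported by the installed version).
pred_prob <- predict(fit, newdata = test, type = "probability")

## Confusion table of observed versus predicted classes.
table(observed = test$Species, predicted = pred_class)
```

Calling predict without newdata, as in the Examples section, returns the predictions for the training instances.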

There is also a plot method for fitted binary Weka_trees via the facilities provided by package party. This converts the Weka_tree to a BinaryTree and then simply calls the plot method of this class (see plot.BinaryTree) with slight modifications to the default arguments.

Provided the Weka classification tree learner implements the Drawable interface (i.e., provides a graph method), write_to_dot can be used to create a DOT representation of the tree for visualization via GraphViz.
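
A possible workflow for the DOT export is sketched below. It assumes that write_to_dot takes a connection as its second argument (con, defaulting to standard output), as in current RWeka releases, and that the GraphViz dot program is installed and on the search path; the rendering step is skipped otherwise. The file names are temporary files created for illustration.

```r
library(RWeka)

data("iris")
fit <- J48(Species ~ ., data = iris)

## Write the DOT representation to a temporary file instead of the console.
dot_file <- tempfile(fileext = ".dot")
con <- file(dot_file, open = "w")
write_to_dot(fit, con)
close(con)

## Render to PNG only if the GraphViz 'dot' executable can be found.
if (nzchar(Sys.which("dot"))) {
    png_file <- sub("\\.dot$", ".png", dot_file)
    system2("dot", c("-Tpng", shQuote(dot_file), "-o", shQuote(png_file)))
}
```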

J48 generates unpruned or pruned C4.5 decision trees (Quinlan, 1993).

LMT implements Logistic Model Trees (Landwehr, 2003; Landwehr et al., 2005).

M5P (where the P stands for prime) generates M5 model trees using the M5' algorithm, which was introduced in Wang & Witten (1997) and enhances the original M5 algorithm by Quinlan (1992).

DecisionStump implements decision stumps (trees with a single split only), which are frequently used as base learners for meta learners such as Boosting.
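
To illustrate the use of decision stumps as base learners, the sketch below compares a single stump with a boosted ensemble of stumps. It assumes the AdaBoostM1 interface from RWeka's meta learners is available in the installed version and that the base learner can be named through the W option of Weka_control.

```r
library(RWeka)

data("iris")

## A single decision stump: one split only, so at most two of the
## three species can be separated well.
stump <- DecisionStump(Species ~ ., data = iris)
table(iris$Species, predict(stump))

## Boosting with decision stumps as the base learner
## (assumes AdaBoostM1 and the W option are available).
boosted <- AdaBoostM1(Species ~ ., data = iris,
                      control = Weka_control(W = "DecisionStump"))
table(iris$Species, predict(boosted))
```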

References

N. Landwehr (2003). Logistic Model Trees. Master's thesis, Institute for Computer Science, University of Freiburg, Germany. http://www.informatik.uni-freiburg.de/~ml/thesis_landwehr2003.html

N. Landwehr, Mark Hall and Eibe Frank (2005). Logistic Model Trees. Machine Learning, 59, 161--205.

R. Quinlan (1993). C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers, San Mateo, CA.

R. Quinlan (1992). Learning with continuous classes. Proceedings of the Australian Joint Conference on Artificial Intelligence, 343--348. World Scientific, Singapore.

Y. Wang and I. H. Witten (1997). Induction of model trees for predicting continuous classes. Proceedings of the European Conference on Machine Learning. University of Economics, Faculty of Informatics and Statistics, Prague.

I. H. Witten and Eibe Frank (2005). Data Mining: Practical Machine Learning Tools and Techniques. 2nd Edition, Morgan Kaufmann, San Francisco.

Examples

data("iris")
m1 <- J48(Species ~ ., data = iris)
m1
table(iris$Species, predict(m1))
write_to_dot(m1)
if(require("party", quietly = TRUE)) plot(m1)

## Using some Weka data sets ...

## J48
DF2 <- read.arff(system.file("arff", "contact-lenses.arff",
                             package = "RWeka"))
m2 <- J48(`contact-lenses` ~ ., data = DF2)
m2
table(DF2$`contact-lenses`, predict(m2))
if(require("party", quietly = TRUE)) plot(m2)

## M5P
DF3 <- read.arff(system.file("arff", "cpu.arff", package = "RWeka"))
m3 <- M5P(class ~ ., data = DF3)
m3
if(require("party", quietly = TRUE)) plot(m3)

## Logistic Model Tree.
DF4 <- read.arff(system.file("arff", "weather.arff", package = "RWeka"))
m4 <- LMT(play ~ ., data = DF4)
m4
table(DF4$play, predict(m4))

## Larger scale example.
if(require("mlbench", quietly = TRUE)
   && require("party", quietly = TRUE)) {
    ## Predict diabetes status for Pima Indian women
    data("PimaIndiansDiabetes", package = "mlbench")
    ## Fit J48 tree with reduced error pruning
    m5 <- J48(diabetes ~ ., data = PimaIndiansDiabetes,
              control = Weka_control(R = TRUE))
    plot(m5)
    ## (Make sure that the plotting device is big enough for the tree.)
}
