# Weka_classifier_trees

##### R/Weka Classifier Trees

R interfaces to Weka regression and classification tree learners.

- Keywords: models, regression, classif, tree

##### Usage

```
J48(formula, data, subset, na.action,
    control = Weka_control(), options = NULL)
LMT(formula, data, subset, na.action,
    control = Weka_control(), options = NULL)
M5P(formula, data, subset, na.action,
    control = Weka_control(), options = NULL)
DecisionStump(formula, data, subset, na.action,
    control = Weka_control(), options = NULL)
```

##### Arguments

- `formula`: a symbolic description of the model to be fit.
- `data`: an optional data frame containing the variables in the model.
- `subset`: an optional vector specifying a subset of observations to be used in the fitting process.
- `na.action`: a function which indicates what should happen when the data contain `NA`s. See `model.frame` for details.
- `control`: an object of class `Weka_control` giving options to be passed to the Weka learner. Available options can be obtained on-line using the Weka Option Wizard `WOW`.
- `options`: a named list of further options, or `NULL` (default). See **Details**.
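As a quick sketch of how `control` is used (the `M` option below is J48's minimum-instances-per-leaf setting; any option listed by `WOW` can be passed the same way):

```r
library(RWeka)

## List the command-line options J48 accepts, via the Weka Option Wizard
WOW("J48")

## Pass an option through Weka_control, e.g. require at least
## 5 instances per leaf when growing the tree
m <- J48(Species ~ ., data = iris, control = Weka_control(M = 5))
m
```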

##### Details

There is a `predict` method for predicting from the fitted models, and a `summary` method based on `evaluate_Weka_classifier`.

There is also a `plot` method for fitted binary `Weka_tree`s via the facilities provided by package `party`. This converts the `Weka_tree` to a `BinaryTree` and then simply calls the plot method of this class (see `plot.BinaryTree`) with slight modifications to the default arguments.

Provided the Weka classification tree learner implements the `graph` method, `write_to_dot` can be used to create a DOT representation of the tree for visualization via Graphviz or the Rgraphviz package.

`J48` generates unpruned or pruned C4.5 decision trees (Quinlan, 1993).

`LMT` implements logistic model trees (Landwehr, 2003; Landwehr et al., 2005).

`M5P` (where the `P` stands for prime) generates M5 model trees using the M5' algorithm, which was introduced in Wang and Witten (1997) and enhances the original M5 algorithm by Quinlan (1992).

`DecisionStump` implements decision stumps (trees with a single split only), which are frequently used as base learners for meta learners such as Boosting.

The model formulae should only use the `+` and `-` operators
to indicate the variables to be included or not used, respectively.
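For instance, a predictor can be dropped from the right-hand side with `-` (a minimal sketch using the built-in `iris` data):

```r
library(RWeka)

## Fit a J48 tree for Species using all predictors except Petal.Width;
## `.` expands to all remaining columns, `-` removes one of them
m <- J48(Species ~ . - Petal.Width, data = iris)
m
```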

Argument `options` allows further customization. Currently, options `model` and `instances` (or partial matches for these) are used: if set to `TRUE`, the model frame or the corresponding Weka instances, respectively, are included in the fitted model object, possibly speeding up subsequent computations on the object. By default, neither is included.
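A short sketch of the `options` argument in use (again with the built-in `iris` data):

```r
library(RWeka)

## Keep both the model frame and the Weka instances in the fitted object,
## so later computations on it need not rebuild them
m <- J48(Species ~ ., data = iris,
         options = list(model = TRUE, instances = TRUE))

## e.g. summary() can now reuse the stored instances
summary(m)
```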

##### Value

A list inheriting from classes `Weka_tree` and `Weka_classifiers`, with components including:

- `classifier`: a reference (of class `jobjRef`) to a Java object obtained by applying the Weka `buildClassifier` method to build the specified model using the given control options.
- `predictions`: a numeric vector or factor with the model predictions for the training instances (the results of calling the Weka `classifyInstance` method for the built classifier and each instance).
- `call`: the matched call.

##### References

N. Landwehr (2003).
*Logistic Model Trees*.
Master's thesis, Institute for Computer Science, University of
Freiburg, Germany.

N. Landwehr, M. Hall, and E. Frank (2005).
Logistic Model Trees.
*Machine Learning*, **59**, 161--205.

R. Quinlan (1993).
*C4.5: Programs for Machine Learning*.
Morgan Kaufmann Publishers, San Mateo, CA.

R. Quinlan (1992).
Learning with continuous classes.
*Proceedings of the Australian Joint Conference on Artificial
Intelligence*, 343--348.
World Scientific, Singapore.

Y. Wang and I. H. Witten (1997).
Induction of model trees for predicting continuous classes.
*Proceedings of the European Conference on Machine
Learning*.
University of Economics, Faculty of Informatics and Statistics,
Prague.

I. H. Witten and E. Frank (2005).
*Data Mining: Practical Machine Learning Tools and Techniques*.
2nd Edition, Morgan Kaufmann, San Francisco.

##### Examples

```
m1 <- J48(Species ~ ., data = iris)

## print and summary
m1
summary(m1) # calls evaluate_Weka_classifier()
table(iris$Species, predict(m1)) # by hand

## visualization
## use party package
if(require("party", quietly = TRUE)) plot(m1)
## or Graphviz
write_to_dot(m1)
## or Rgraphviz
library("Rgraphviz")
ff <- tempfile()
write_to_dot(m1, ff)
plot(agread(ff))

## Using some Weka data sets ...

## J48
DF2 <- read.arff(system.file("arff", "contact-lenses.arff",
                             package = "RWeka"))
m2 <- J48(`contact-lenses` ~ ., data = DF2)
m2
table(DF2$`contact-lenses`, predict(m2))
if(require("party", quietly = TRUE)) plot(m2)

## M5P
DF3 <- read.arff(system.file("arff", "cpu.arff", package = "RWeka"))
m3 <- M5P(class ~ ., data = DF3)
m3
if(require("party", quietly = TRUE)) plot(m3)

## Logistic Model Tree
DF4 <- read.arff(system.file("arff", "weather.arff", package = "RWeka"))
m4 <- LMT(play ~ ., data = DF4)
m4
table(DF4$play, predict(m4))

## Larger scale example
if(require("mlbench", quietly = TRUE)
   && require("party", quietly = TRUE)) {
  ## Predict diabetes status for Pima Indian women
  data("PimaIndiansDiabetes", package = "mlbench")
  ## Fit J48 tree with reduced error pruning
  m5 <- J48(diabetes ~ ., data = PimaIndiansDiabetes,
            control = Weka_control(R = TRUE))
  plot(m5)
  ## (Make sure that the plotting device is big enough for the tree.)
}
```

*Documentation reproduced from package RWeka, version 0.4-18, License: GPL-2*