# Weka_classifier_trees

##### R/Weka Classifier Trees

R interfaces to Weka regression and classification tree learners.

- Keywords
- models, regression, classif, tree

##### Usage

```
J48(formula, data, subset, na.action,
    control = Weka_control(), options = NULL)
LMT(formula, data, subset, na.action,
    control = Weka_control(), options = NULL)
M5P(formula, data, subset, na.action,
    control = Weka_control(), options = NULL)
DecisionStump(formula, data, subset, na.action,
              control = Weka_control(), options = NULL)
```

##### Arguments

- formula
a symbolic description of the model to be fit.

- data
an optional data frame containing the variables in the model.

- subset
an optional vector specifying a subset of observations to be used in the fitting process.

- na.action
a function which indicates what should happen when the data contain `NA`s. See `model.frame` for details.

- control
an object of class `Weka_control` giving options to be passed to the Weka learner. Available options can be obtained on-line using the Weka Option Wizard `WOW`, or the Weka documentation.

- options
a named list of further options, or `NULL` (default). See **Details**.
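
As a brief illustration of the `control` argument, the sketch below lists a learner's options with `WOW` and then passes two of them through `Weka_control`. The option letters `M` (minimum number of instances per leaf) and `U` (unpruned tree) are taken from Weka's J48 option list; the values are arbitrary and chosen only for illustration.

```r
library("RWeka")

## List the options that the Weka J48 learner understands.
WOW("J48")

## Require at least 5 instances per leaf (value chosen arbitrarily).
m_min5 <- J48(Species ~ ., data = iris, control = Weka_control(M = 5))

## Grow an unpruned tree instead.
m_unpruned <- J48(Species ~ ., data = iris, control = Weka_control(U = TRUE))

m_min5
m_unpruned
```

Checking the output of `WOW` first is the safer route, since option letters differ between learners.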

##### Details

There is a `predict` method for predicting from the fitted models,
and a `summary` method based on `evaluate_Weka_classifier`.

There is also a `plot` method for fitted binary `Weka_tree`s
via the facilities provided by package partykit. This converts
the `Weka_tree` to a `party` object and then simply calls
the plot method of this class (see `plot.party`).

Provided the Weka classification tree learner implements the
“Drawable” interface (i.e., provides a `graph` method),
`write_to_dot` can be used to create a DOT representation
of the tree for visualization via Graphviz or the Rgraphviz
package.

`J48` generates unpruned or pruned C4.5 decision trees (Quinlan,
1993).

`LMT` implements “Logistic Model Trees” (Landwehr, 2003;
Landwehr et al., 2005).

`M5P` (where the `P` stands for ‘prime’) generates M5
model trees using the M5' algorithm, which was introduced in Wang &
Witten (1997) and enhances the original M5 algorithm by Quinlan
(1992).

`DecisionStump` implements decision stumps (trees with a single
split only), which are frequently used as base learners for meta
learners such as Boosting.
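
The following sketch shows a decision stump on its own and as the base learner of a boosting meta learner. It assumes the `AdaBoostM1` interface from the same package (RWeka), with the base learner named through the `W` control option; treat it as an illustration rather than a recommended setup.

```r
library("RWeka")

## A single decision stump: one split on one attribute.
stump <- DecisionStump(Species ~ ., data = iris)
stump

## The same learner used as the base classifier of AdaBoost.M1.
## Weka_control(W = "DecisionStump") names the base learner.
boosted <- AdaBoostM1(Species ~ ., data = iris,
                      control = Weka_control(W = "DecisionStump"))

## Compare training-set confusion tables.
table(iris$Species, predict(stump))
table(iris$Species, predict(boosted))
```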

The model formulae should only use the `+` and `-` operators
to indicate the variables to be included or not used, respectively.
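
A minimal sketch of the formula forms described above, using the built-in `iris` data; the choice of variables is arbitrary.

```r
library("RWeka")

## Use all available predictors.
m_all <- J48(Species ~ ., data = iris)

## Use all predictors except Sepal.Width.
m_drop <- J48(Species ~ . - Sepal.Width, data = iris)

## Use only the two petal measurements.
m_petal <- J48(Species ~ Petal.Length + Petal.Width, data = iris)

m_petal
```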

Argument `options` allows further customization. Currently,
options `model` and `instances` (or partial matches for
these) are used: if set to `TRUE`, the model frame or the
corresponding Weka instances, respectively, are included in the fitted
model object, possibly speeding up subsequent computations on the
object. By default, neither is included.
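
The sketch below fits the same tree with and without these options and compares the component names of the two fitted objects. The names of the extra components are not spelled out here, so the code inspects them with `names()` rather than assuming them.

```r
library("RWeka")

## Default: neither the model frame nor the Weka instances are stored.
m_default <- J48(Species ~ ., data = iris)

## Keep both in the fitted object.
m_full <- J48(Species ~ ., data = iris,
              options = list(model = TRUE, instances = TRUE))

## Compare which components each fitted object carries.
names(m_default)
setdiff(names(m_full), names(m_default))
```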

`parse_Weka_digraph` can parse the graph associated with a Weka
tree classifier (and obtained by invoking its `graph()` method in
Weka), returning a simple list with nodes and edges.
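
One way this might be used is sketched below. It rests on several assumptions not stated on this page: that the fitted object stores its Java reference in a component named `classifier`, that the Java `graph()` method can be invoked through `rJava::.jcall` with a string return type, and that `parse_Weka_digraph` accepts that string as its first argument.

```r
library("RWeka")
library("rJava")  # assumption: used only to call the Java graph() method

m <- J48(Species ~ ., data = iris)

## Ask the underlying Weka classifier for its graph description
## (assumes the component is named `classifier`).
dot <- .jcall(m$classifier, "S", "graph")

## Parse the description into an R list and inspect its structure.
g <- parse_Weka_digraph(dot)
str(g)
```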

##### Value

A list inheriting from classes `Weka_tree` and
`Weka_classifiers` with components including

- classifier
a reference (of class `jobjRef`) to a Java object
obtained by applying the Weka `buildClassifier` method to build
the specified model using the given control options.

- predictions
a numeric vector or factor with the model
predictions for the training instances (the results of calling the
Weka `classifyInstance` method for the built classifier and
each instance).

- call
the matched call.
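
A short sketch of inspecting a fitted object follows. It assumes that the three components described above are named `classifier`, `predictions`, and `call`, as in the RWeka package documentation.

```r
library("RWeka")

m <- J48(Species ~ ., data = iris)

class(m)              # class vector of the fitted object
class(m$classifier)   # reference to the Java classifier object
head(m$predictions)   # predictions for the training instances
m$call                # the matched call
```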

##### References

N. Landwehr (2003).
*Logistic Model Trees*.
Master's thesis, Institute for Computer Science, University of
Freiburg, Germany.
http://www.cs.uni-potsdam.de/ml/landwehr/diploma_thesis.pdf

N. Landwehr, M. Hall, and E. Frank (2005).
Logistic Model Trees.
*Machine Learning*, **59**, 161–205.

R. Quinlan (1993).
*C4.5: Programs for Machine Learning*.
Morgan Kaufmann Publishers, San Mateo, CA.

R. Quinlan (1992).
Learning with continuous classes.
*Proceedings of the Australian Joint Conference on Artificial Intelligence*, 343–348.
World Scientific, Singapore.

Y. Wang and I. H. Witten (1997).
Induction of model trees for predicting continuous classes.
*Proceedings of the European Conference on Machine Learning*.
University of Economics, Faculty of Informatics and Statistics,
Prague.

I. H. Witten and E. Frank (2005).
*Data Mining: Practical Machine Learning Tools and Techniques*.
2nd Edition, Morgan Kaufmann, San Francisco.

##### Examples

```
library("RWeka")

m1 <- J48(Species ~ ., data = iris)

## print and summary
m1
summary(m1) # calls evaluate_Weka_classifier()
table(iris$Species, predict(m1)) # by hand

## visualization
## use partykit package
if(require("partykit", quietly = TRUE)) plot(m1)
## or Graphviz
write_to_dot(m1)
## or Rgraphviz
## Not run: 
library("Rgraphviz")
ff <- tempfile()
write_to_dot(m1, ff)
plot(agread(ff))
## End(Not run)

## Using some Weka data sets ...

## J48
DF2 <- read.arff(system.file("arff", "contact-lenses.arff",
                             package = "RWeka"))
m2 <- J48(`contact-lenses` ~ ., data = DF2)
m2
table(DF2$`contact-lenses`, predict(m2))
if(require("partykit", quietly = TRUE)) plot(m2)

## M5P
DF3 <- read.arff(system.file("arff", "cpu.arff", package = "RWeka"))
m3 <- M5P(class ~ ., data = DF3)
m3
if(require("partykit", quietly = TRUE)) plot(m3)

## Logistic Model Tree.
DF4 <- read.arff(system.file("arff", "weather.arff", package = "RWeka"))
m4 <- LMT(play ~ ., data = DF4)
m4
table(DF4$play, predict(m4))

## Larger scale example.
if(require("mlbench", quietly = TRUE)
   && require("partykit", quietly = TRUE)) {
    ## Predict diabetes status for Pima Indian women
    data("PimaIndiansDiabetes", package = "mlbench")
    ## Fit J48 tree with reduced error pruning
    m5 <- J48(diabetes ~ ., data = PimaIndiansDiabetes,
              control = Weka_control(R = TRUE))
    plot(m5)
    ## (Make sure that the plotting device is big enough for the tree.)
}
```

*Documentation reproduced from package RWeka, version 0.4-40, License: GPL-2*