# Weka_clusterers

##### R/Weka Clusterers

R interfaces to Weka clustering algorithms.

- Keywords
- cluster

##### Usage

```
Cobweb(x, control = NULL)
FarthestFirst(x, control = NULL)
SimpleKMeans(x, control = NULL)
XMeans(x, control = NULL)
DBScan(x, control = NULL)
```

##### Arguments

- x
an R object with the data to be clustered.

- control
an object of class `Weka_control`, or a character vector of control options, or `NULL` (default). Available options can be obtained on-line using the Weka Option Wizard `WOW`, or the Weka documentation.
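As a rough sketch of the two ways to supply `control`, the snippet below fits `SimpleKMeans` once with a `Weka_control` object and once with what should be the equivalent character vector. The `N` option (number of clusters) is the one used in the Examples section; the variable names are illustrative, and the code assumes RWeka and a working Java installation.

```r
library(RWeka)

## List the options the underlying Weka class accepts.
WOW("SimpleKMeans")

## The same option expressed as a Weka_control object ...
ctrl <- Weka_control(N = 3)
cl_obj <- SimpleKMeans(iris[, -5], control = ctrl)

## ... and as a character vector of command-line style options.
cl_chr <- SimpleKMeans(iris[, -5], control = c("-N", "3"))

## Cross-tabulate the two cluster assignments to compare them.
table(predict(cl_obj), predict(cl_chr))
```

The `Weka_control` form is usually easier to read; the character vector form is convenient when options are copied from Weka's own documentation.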

##### Details

There is a `predict` method for predicting class ids or memberships from the fitted clusterers.
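The following is a minimal sketch of how the `predict` method might be used for both kinds of output. The argument names `newdata` and `type`, and the values `"class_ids"` and `"memberships"`, are assumptions based on the RWeka help for the clusterer `predict` method and should be checked there; memberships are requested with explicit `newdata` here.

```r
library(RWeka)

cl <- SimpleKMeans(iris[, -5], Weka_control(N = 3))

## Cluster ids for the training data (newdata omitted).
ids <- predict(cl, type = "class_ids")

## Cluster ids for new observations with the same columns.
new_ids <- predict(cl, newdata = iris[1:5, -5], type = "class_ids")

## Cluster memberships for the same observations.
mem <- predict(cl, newdata = iris[1:5, -5], type = "memberships")
dim(mem)
```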

`Cobweb` implements the Cobweb (Fisher, 1987) and Classit (Gennari et al., 1989) clustering algorithms.

`FarthestFirst` provides the “farthest first traversal algorithm” by Hochbaum and Shmoys, which works as a fast, simple approximate clusterer modeled after simple \(k\)-means.

`SimpleKMeans` provides clustering with the \(k\)-means algorithm.

`XMeans` provides \(k\)-means extended by an “Improve-Structure part” and automatically determines the number of clusters.

`DBScan` provides the “density-based clustering algorithm” by Ester, Kriegel, Sander, and Xu. Note that noise points are assigned to `NA`.
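Because noise points come back as `NA`, downstream code needs to handle missing cluster ids. The sketch below illustrates one way to do so. It assumes the Weka package optics_dbScan is installed, and the option names `E` (epsilon) and `M` (minimum points) and their values are assumptions to be verified with `WOW("DBScan")`.

```r
library(RWeka)

## Assumed option names and illustrative values; verify with WOW("DBScan").
cl <- DBScan(iris[, -5], Weka_control(E = 0.2, M = 5))
ids <- predict(cl)

## Number of instances treated as noise.
sum(is.na(ids))

## Keep NA visible in the cross-tabulation instead of silently dropping it.
table(ids, iris$Species, useNA = "ifany")
```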

##### Value

A list inheriting from class `Weka_clusterers` with components including

- clusterer
a reference (of class `jobjRef`) to a Java object obtained by applying the Weka `buildClusterer` method to the training instances using the given control options.

- class_ids
a vector of integers indicating the class to which each training instance is allocated (the results of calling the Weka `clusterInstance` method for the built clusterer and each instance).
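A short sketch of inspecting the returned list follows. It assumes the two components are named `clusterer` and `class_ids`, and that the rJava package (which RWeka builds on) is available for calling the Weka `numberOfClusters` method on the Java reference.

```r
library(RWeka)

cl <- SimpleKMeans(iris[, -5], Weka_control(N = 3))

## Inspect the class vector and the component names of the fitted object.
class(cl)
names(cl)

## Cluster ids of the first few training instances (assumed component name).
head(cl$class_ids)

## Ask the underlying Java object how many clusters it built.
k <- rJava::.jcall(cl$clusterer, "I", "numberOfClusters")
k
```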

##### Note

`XMeans` requires Weka package XMeans to be installed.

`DBScan` requires Weka package optics_dbScan to be installed.
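One possible way to install these optional packages from within R is the Weka package manager interface `WPM` provided by RWeka. This is a sketch that assumes network access to the Weka package repository; the command strings should be checked against the `WPM` help page.

```r
library(RWeka)

## Refresh the local cache of available Weka package metadata.
WPM("refresh-cache")

## Install the packages needed by XMeans() and DBScan().
WPM("install-package", "XMeans")
WPM("install-package", "optics_dbScan")

## Show which Weka packages are currently installed.
WPM("list-packages", "installed")
```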

##### References

M. Ester, H.-P. Kriegel, J. Sander, and X. Xu (1996).
A Density-Based Algorithm for Discovering Clusters in Large Spatial
Databases with Noise.
*Proceedings of the Second International Conference on Knowledge
Discovery and Data Mining (KDD'96)*,
Portland, OR, 226–231.
AAAI Press.

D. H. Fisher (1987).
Knowledge acquisition via incremental conceptual clustering.
*Machine Learning*, **2**(2), 139–172.

J. Gennari, P. Langley, and D. H. Fisher (1989).
Models of incremental concept formation.
*Artificial Intelligence*, **40**, 11–62.

D. S. Hochbaum and D. B. Shmoys (1985).
A best possible heuristic for the \(k\)-center problem.
*Mathematics of Operations Research*, **10**(2), 180–184.

D. Pelleg and A. W. Moore (2000).
X-means: Extending K-means with Efficient Estimation of the Number of
Clusters.
In: *Seventeenth International Conference on Machine Learning*,
727–734.
Morgan Kaufmann.

I. H. Witten and E. Frank (2005).
*Data Mining: Practical Machine Learning Tools and Techniques*.
2nd Edition, Morgan Kaufmann, San Francisco.

##### Examples

```
library(RWeka)

## Cluster the four numeric iris measurements into 3 groups with k-means.
cl1 <- SimpleKMeans(iris[, -5], Weka_control(N = 3))
cl1
## Cross-tabulate the cluster ids against the known species.
table(predict(cl1), iris$Species)

# NOT RUN {
## Requires Weka package 'XMeans' to be installed.
## Use XMeans with a KDTree.
cl2 <- XMeans(iris[, -5],
              c("-L", 3, "-H", 7, "-use-kdtree",
                "-K", "weka.core.neighboursearch.KDTree -P"))
cl2
table(predict(cl2), iris$Species)
# }
```

*Documentation reproduced from package RWeka, version 0.4-40, License: GPL-2*