streamMOA (version 1.1-3)

DSC_ClusTree: ClusTree Data Stream Clusterer

Description

Implements the ClusTree clustering algorithm for data streams.

Usage

DSC_ClusTree(horizon = 1000, maxHeight = 8, lambda = NULL)

Arguments

horizon

Range of the (time) window.

maxHeight

The maximum height of the tree.

lambda

Number used to override the computed lambda (decay). If NULL (the default), the computed value is used.
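
A minimal sketch of how these arguments might be set. The values are arbitrary illustrations rather than recommendations, and the object names ct and ct_default are made up for this example.

```r
library(streamMOA)

# Illustrative values only: a shorter window and a shallower tree
# than the documented defaults (horizon = 1000, maxHeight = 8).
ct <- DSC_ClusTree(horizon = 500, maxHeight = 5)

# All arguments left at their defaults; lambda stays NULL,
# so the decay is not overridden.
ct_default <- DSC_ClusTree()
```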

Value

An object of class DSC_ClusTree (subclass of DSC, DSC_MOA, DSC_Micro).
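
The class hierarchy stated above could be checked with base R's inherits(), as in this small sketch. The object name clusterer is only for illustration.

```r
library(streamMOA)

clusterer <- DSC_ClusTree()

# Inspect the class vector of the returned object.
class(clusterer)

# Check membership in each class named in the Value section.
inherits(clusterer, "DSC_ClusTree")
inherits(clusterer, "DSC_Micro")
inherits(clusterer, "DSC_MOA")
inherits(clusterer, "DSC")
```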

Details

This is an interface to the MOA implementation of ClusTree.

References

Philipp Kranen, Ira Assent, Corinna Baldauf, and Thomas Seidl. 2009. Self-Adaptive Anytime Stream Clustering. In Proceedings of the 2009 Ninth IEEE International Conference on Data Mining (ICDM '09). IEEE Computer Society, Washington, DC, USA, 249-258. DOI=10.1109/ICDM.2009.47 http://dx.doi.org/10.1109/ICDM.2009.47

Bifet A, Holmes G, Pfahringer B, Kranen P, Kremer H, Jansen T, Seidl T (2010). MOA: Massive Online Analysis, a Framework for Stream Classification and Clustering. In Journal of Machine Learning Research (JMLR).

See Also

DSC, DSC_Micro, DSC_MOA

Examples

library(streamMOA)

# data with 3 clusters and 5% noise
stream <- DSD_Gaussians(k=3, d=2, noise=0.05)

clustree <- DSC_ClusTree(maxHeight=3)
update(clustree, stream, 500)
clustree

# plot micro-clusters
plot(clustree, stream)

# recluster with k-means
kmeans <- DSC_Kmeans(k=3)
recluster(kmeans, clustree)
plot(kmeans, stream, type="both")

# create a two-stage clustering using ClusTree and reachability reclustering
CTxReach <- DSC_TwoStage(
  micro=DSC_ClusTree(maxHeight=3),
  macro=DSC_Reachability(epsilon = .15)
)
CTxReach
update(CTxReach, stream, 500)
plot(CTxReach, stream, type="both")