stream (version 1.2-3)

DSC_Hierarchical: Hierarchical Micro-Cluster Reclusterer

Description

Implementation of hierarchical clustering to recluster a set of micro-clusters.

Usage

DSC_Hierarchical(k = NULL, h = NULL, method = "complete", min_weight = NULL, description = NULL)

Arguments

k
The number of desired clusters.
h
Height at which to cut the dendrogram.
method
The agglomeration method to be used. This should be (an unambiguous abbreviation of) one of "ward", "single", "complete", "average", "mcquitty", "median" or "centroid".
min_weight
Micro-clusters with a weight less than this will be ignored for reclustering.
description
Optional character string to describe the clustering method.

Value

A list of class DSC, DSC_R, DSC_Macro, and DSC_Hierarchical.

Details

Please refer to hclust for more details on the behavior of the algorithm.
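
To illustrate how the arguments relate to hclust, the following is a minimal base R sketch, not the package's actual implementation. It assumes reclustering amounts to running hclust on the micro-cluster centers and cutting the tree with cutree. The centers and weights are made-up values, and the variable names are hypothetical.

```r
# Hypothetical micro-cluster centers and weights (made up for illustration)
centers <- data.frame(
  x = c(0.10, 0.12, 0.15, 0.80, 0.82, 0.85, 0.50),
  y = c(0.10, 0.14, 0.11, 0.80, 0.78, 0.83, 0.50)
)
weights <- c(10, 12, 8, 9, 11, 10, 1)

# Ignore micro-clusters lighter than the threshold (the role of min_weight)
min_weight <- 2
centers_kept <- centers[weights >= min_weight, ]

# Build the dendrogram on the remaining centers (the role of method)
tree <- hclust(dist(centers_kept), method = "single")

# Cut the tree either by number of clusters (k) or by height (h)
assignment_k <- cutree(tree, k = 2)
assignment_h <- cutree(tree, h = 0.2)
```

With these toy values both cuts give the same two groups, because the two groups of centers are much farther apart than 0.2 while the centers within each group are much closer than that.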

Note that this clustering cannot be updated incrementally: each time it is used for (re)clustering, the previous clustering is discarded.

See Also

DSC, DSC_Macro

Examples

library(stream)

# Cassini dataset
stream <- DSD_mlbenchGenerator("cassini")

# Use hierarchical clustering to recluster micro-clusters
dbstream <- DSC_DBSTREAM(r=.05)
update(dbstream, stream, 500)

# reclustering using single-link and specifying k
hc <- DSC_Hierarchical(k=3, method="single")
recluster(hc, dbstream)
hc
plot(hc, stream, type="both")

# reclustering by specifying height
hc <- DSC_Hierarchical(h=.2, method="single")
recluster(hc, dbstream)
hc
plot(hc, stream, type="both")

# For comparison we use hierarchical clustering directly on the data 
# Note: hierarchical clustering is not a data stream clustering algorithm!
hc <- DSC_Hierarchical(k=3, method="single")
update(hc, stream, 500)
plot(hc, stream)