stream (version 2.0-1)

DSC_DBSCAN: DBSCAN Macro-clusterer

Description

Macro Clusterer. Implements the DBSCAN algorithm for reclustering micro-clusterings.

Usage

DSC_DBSCAN(
  formula = NULL,
  eps,
  MinPts = 5,
  weighted = TRUE,
  description = NULL
)

Value

An object of class DSC_DBSCAN (a subclass of DSC, DSC_R, DSC_Macro).

Arguments

formula

NULL to use all features in the stream or a model formula of the form ~ X1 + X2 to specify the features used for clustering. Only ., + and - are currently supported in the formula.

eps

radius of the eps-neighborhood.

MinPts

minimum number of points required in the eps-neighborhood.

weighted

logical indicating if a weighted version of DBSCAN should be used.

description

optional character string to describe the clustering method.

Author

Michael Hahsler

Details

This DBSCAN implementation is a weighted, extended version of the implementation in fpc, where each micro-cluster center is considered a pseudo point. For weighting, the MinPts comparison uses the sum of the weights of the micro-clusters in the eps-neighborhood instead of their number.

DBSCAN first finds core points based on the number of other points in their eps-neighborhoods. Then core points are joined into clusters using reachability (overlapping eps-neighborhoods).
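
The two steps can be illustrated with a minimal base R sketch. It is not the package's code: the function name weighted_dbscan_sketch() and the toy data are hypothetical, and the sketch assumes that a center's own weight counts toward its neighborhood weight and that 0 marks noise.

```r
# Hypothetical helper for illustration only; not part of the stream package.
weighted_dbscan_sketch <- function(centers, weights, eps, MinPts = 5) {
  centers <- as.matrix(centers)
  n <- nrow(centers)

  # neighbors[i, j] is TRUE if center j lies in the eps-neighborhood of center i
  # (assumption: a center is its own neighbor)
  neighbors <- as.matrix(dist(centers)) <= eps

  # Step 1: a center is a core point if the summed weight in its
  # eps-neighborhood reaches MinPts
  density <- vapply(seq_len(n),
    function(i) sum(weights[neighbors[i, ]]), numeric(1))
  core <- density >= MinPts

  # Step 2: join core points through overlapping eps-neighborhoods
  assignment <- integer(n)  # 0 = noise / unassigned
  cluster_id <- 0L
  for (i in which(core)) {
    if (assignment[i] != 0L) next
    cluster_id <- cluster_id + 1L
    queue <- i
    while (length(queue) > 0L) {
      p <- queue[1]
      queue <- queue[-1]
      if (assignment[p] != 0L) next
      assignment[p] <- cluster_id
      # only core points expand the cluster further
      if (core[p]) {
        queue <- c(queue, which(neighbors[p, ] & assignment == 0L))
      }
    }
  }
  assignment
}

# Toy micro-clusters: two dense groups and one light, isolated center
centers <- rbind(c(0, 0), c(0.1, 0), c(0.2, 0), c(5, 5), c(5.1, 5), c(10, 10))
weights <- c(3, 3, 3, 4, 4, 1)
assignment <- weighted_dbscan_sketch(centers, weights, eps = 0.15, MinPts = 5)
print(assignment)
```

With weights, a few heavy micro-clusters can form a core region even when fewer than MinPts centers fall inside the eps-neighborhood.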

update() and recluster() invisibly return the assignment of the data points to clusters.

Note that this clustering cannot be updated iteratively and every time it is used for (re)clustering, the old clustering is deleted.
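
A short sketch of what this means in practice, assuming the stream package is installed and that recluster(macro, micro) and nclusters() behave as in stream 2.0; the eps value is arbitrary. Each call to recluster() builds the macro-clustering from scratch from the current micro-clusters.

```r
library(stream)

set.seed(1000)
stream <- DSD_Gaussians(k = 3, d = 2, noise = 0.05)

# micro-clusterer: a sliding window over the most recent 100 points
micro <- DSC_Window(horizon = 100)
macro <- DSC_DBSCAN(eps = 0.05)

# first pass: fill the window, then recluster its contents
update(micro, stream, 500)
recluster(macro, micro)
n_first <- nclusters(macro)

# second pass: the window content changes; reclustering discards the
# previous macro-clustering and computes a new one from the current window
update(micro, stream, 500)
recluster(macro, micro)
n_second <- nclusters(macro)

c(first = n_first, second = n_second)
```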

References

Martin Ester, Hans-Peter Kriegel, Joerg Sander, Xiaowei Xu (1996). A density-based algorithm for discovering clusters in large spatial databases with noise. In Evangelos Simoudis, Jiawei Han, Usama M. Fayyad. Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96). AAAI Press. pp. 226-231.

See Also

Other DSC_Macro: DSC_EA(), DSC_Hierarchical(), DSC_Kmeans(), DSC_Macro(), DSC_Reachability(), DSC_SlidingWindow()

Examples

# 3 clusters with 5% noise
stream <- DSD_Gaussians(k = 3, d = 2, noise = 0.05)

# Use a moving window for "micro-clusters" and recluster with DBSCAN (macro-clusters)
cl <- DSC_TwoStage(
  micro = DSC_Window(horizon = 100),
  macro = DSC_DBSCAN(eps = .05)
)

update(cl, stream, 500)
cl

plot(cl, stream)
