dbscan (version 1.2.4)

jpclust: Jarvis-Patrick Clustering

Description

Fast C++ implementation of Jarvis-Patrick clustering, which first builds a shared nearest neighbor graph (k nearest neighbor sparsification) and then places two points in the same cluster if they are in each other's nearest neighbor lists and share at least kt nearest neighbors.
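
The rule described above can be sketched in a few lines of base R. This is only an illustration on toy data, not the package's C++ implementation: the data, the brute-force neighbor search via dist(), and the relabeling loop are all assumptions made for the example.

```r
# Toy data (hypothetical): two well-separated groups of four points on a line
x <- matrix(c(0, 1, 2, 3, 100, 101, 102, 103), ncol = 1)
k  <- 2  # neighborhood size
kt <- 2  # required number of shared neighbors

n <- nrow(x)
d <- as.matrix(dist(x))  # brute-force Euclidean distances

# Neighbor list of each point: the point itself plus its k nearest neighbors
nn <- lapply(seq_len(n), function(i) order(d[i, ])[seq_len(k + 1)])

# Start with every point in its own cluster, then merge linked pairs
labels <- seq_len(n)
for (i in seq_len(n - 1)) {
  for (j in (i + 1):n) {
    mutual <- (j %in% nn[[i]]) && (i %in% nn[[j]])
    shared <- length(intersect(nn[[i]], nn[[j]]))
    if (mutual && shared >= kt) {
      # give all points in j's cluster the label of i's cluster
      labels[labels == labels[j]] <- labels[i]
    }
  }
}

# Renumber the cluster labels as 1, 2, ...
cluster <- as.integer(factor(labels))
cluster
```

Because links are transitive, points that are not direct neighbors (here the first and fourth point of each group) can still end up in the same cluster through a chain of linked pairs.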

Usage

jpclust(x, k, kt, ...)

Arguments

Value

An object of class general_clustering with the following components:

cluster

An integer vector with cluster assignments. Zero indicates noise points.

type

name of used clustering algorithm.

metric

the distance metric used for clustering.

param

list of used clustering parameters.
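
A brief sketch of how these components might be inspected. The random data and the parameter values are made up for illustration; the sketch assumes the dbscan package is installed and that the components are accessed with `$` as for an ordinary list.

```r
library(dbscan)

# Hypothetical data: 200 random points in two dimensions
set.seed(42)
x <- matrix(rnorm(400), ncol = 2)

cl <- jpclust(x, k = 10, kt = 5)

# Number of points assigned to each cluster label
table(cl$cluster)

# Bookkeeping stored with the clustering
cl$type
cl$metric
cl$param
```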

Details

Following the original paper, the shared nearest neighbor list is constructed as the k neighbors plus the point itself (as neighbor zero). Therefore, the threshold kt needs to be in the range \([1, k]\).
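
The effect of counting the point itself can be seen in a small base R sketch. It only illustrates the sentence above on made-up data and does not reproduce the package's internals. For two points that are in each other's neighbor lists, adding the points themselves raises the shared count by two, since both points then appear in both lists.

```r
# Hypothetical data: four points on a line
x <- matrix(c(0, 1, 2, 3), ncol = 1)
k <- 2
d <- as.matrix(dist(x))
n <- nrow(x)

# k nearest neighbors without the point itself
nn_plain <- lapply(seq_len(n), function(i) setdiff(order(d[i, ]), i)[seq_len(k)])

# Neighbor list with the point itself prepended as neighbor zero
nn_self <- lapply(seq_len(n), function(i) c(i, nn_plain[[i]]))

# Shared neighbors of points 1 and 2 under both conventions
shared_plain <- length(intersect(nn_plain[[1]], nn_plain[[2]]))
shared_self  <- length(intersect(nn_self[[1]], nn_self[[2]]))
c(plain = shared_plain, with_self = shared_self)
```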

Fast nearest neighbor search with kNN() is only used if x is a matrix. In this case, Euclidean distance is used.

References

R. A. Jarvis and E. A. Patrick. 1973. Clustering Using a Similarity Measure Based on Shared Near Neighbors. IEEE Trans. Comput. 22, 11 (November 1973), 1025-1034. doi:10.1109/T-C.1973.223640

See Also

Other clustering functions: dbscan(), extractFOSC(), hdbscan(), ncluster(), optics(), sNNclust()

Examples

library("dbscan")

data("DS3")

# use a shared neighborhood of 20 points and require 12 shared neighbors
cl <- jpclust(DS3, k = 20, kt = 12)
cl

clplot(DS3, cl)
# Note: JP clustering does not consider noise and thus,
# the sine wave points chain clusters together.

# use a precomputed kNN object instead of the original data.
nn <- kNN(DS3, k = 30)
nn

cl <- jpclust(nn, k = 20, kt = 12)
cl

# cluster with noise removed (use low pointdensity to identify noise)
d <- pointdensity(DS3, eps = 25)
hist(d, breaks = 20)
DS3_noiseless <- DS3[d > 110,]

cl <- jpclust(DS3_noiseless, k = 20, kt = 10)
cl

clplot(DS3_noiseless, cl)
