
dbscan (version 1.0-0)

jpclust: Jarvis-Patrick Clustering

Description

Fast C++ implementation of Jarvis-Patrick clustering. The algorithm first builds a shared nearest neighbor graph (k nearest neighbor sparsification) and then places two points in the same cluster if each is in the other's nearest neighbor list and they share at least kt nearest neighbors.
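The two steps described above can be sketched in base R. This is a minimal illustration, not the package's C++ code: the function name jp_sketch and the toy data are hypothetical, ties between equally distant neighbors are broken by index order, and clusters are taken to be the connected components of the resulting graph.

```r
# Base-R sketch of Jarvis-Patrick clustering (illustrative only; jp_sketch is a
# hypothetical helper, not part of dbscan).
jp_sketch <- function(x, k, kt) {
  d <- as.matrix(dist(x))              # pairwise Euclidean distances
  n <- nrow(d)
  stopifnot(n >= 2, k >= 1, k < n)
  diag(d) <- Inf                       # a point is not its own k-nearest neighbor

  # Row i holds point i itself (neighbor zero) followed by its k nearest neighbors.
  nn <- t(vapply(seq_len(n),
                 function(i) c(i, order(d[i, ])[seq_len(k)]),
                 integer(k + 1)))

  # Union-find over points; linked points end up with the same root.
  parent <- seq_len(n)
  find <- function(i) {
    while (parent[i] != i) i <- parent[i]
    i
  }

  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      mutual <- j %in% nn[i, ] && i %in% nn[j, ]
      if (mutual && length(intersect(nn[i, ], nn[j, ])) >= kt) {
        ri <- find(i)
        rj <- find(j)
        if (ri != rj) parent[ri] <- rj
      }
    }
  }

  roots <- vapply(seq_len(n), find, integer(1))
  match(roots, unique(roots))          # relabel clusters as 1, 2, ...
}

# Two well-separated groups of four points each.
x <- rbind(cbind(c(0, 0.1, 0.2, 0.3), 0),
           cbind(c(10, 10.1, 10.2, 10.3), 0))
cl_sketch <- jp_sketch(x, k = 3, kt = 2)
cl_sketch  # 1 1 1 1 2 2 2 2
```

The double loop is quadratic in the number of points and is only meant to make the linking rule explicit; the package function should be used for real data.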

Usage

jpclust(x, k, kt, ...)

Arguments

x
a data matrix/data.frame (Euclidean distance is used), a precomputed dist object or a kNN object created with kNN().
k
Neighborhood size for nearest neighbor sparsification. If x is a kNN object then k may be missing.
kt
threshold on the number of shared nearest neighbors (including the points themselves) to form clusters.
...
additional arguments are passed on to the k nearest neighbor search algorithm. See kNN for details on how to control the search strategy.
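As a hedged illustration of the ... argument, the snippet below forwards search = "linear" to the nearest neighbor search. It assumes that the installed dbscan version provides a search argument in kNN(). The helper labels_of is hypothetical and only guards against the result being either a plain vector or a list with a cluster element, which may depend on the package version.

```r
# Sketch: forward a search-strategy argument to the kNN search via `...`.
# Runs only if dbscan is installed; `search` is assumed to be a kNN() argument.
if (requireNamespace("dbscan", quietly = TRUE)) {
  x <- as.matrix(iris[, 1:4])

  # Hypothetical helper: extract labels whether the result is a vector or a list.
  labels_of <- function(cl) if (is.list(cl)) cl$cluster else cl

  cl_default <- dbscan::jpclust(x, k = 10, kt = 6)
  cl_linear  <- dbscan::jpclust(x, k = 10, kt = 6, search = "linear")

  # Cross-tabulate the two labelings; ties among neighbors may cause differences.
  print(table(default = labels_of(cl_default), linear = labels_of(cl_linear)))
}
```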

Value

A vector with cluster assignments.

Details

Note: Following the original paper, the shared nearest neighbor list is constructed as the k neighbors plus the point itself (as neighbor zero). Therefore, the threshold kt can be in the range [1, k].
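A small worked example may help to see how the point itself enters the shared neighbor count. The sketch below uses base R only; the data, k = 2, and the names nn_list and shared are illustrative assumptions, and ties are broken by index order.

```r
# Five points on a line; each neighbor list is the point itself (neighbor zero)
# followed by its k = 2 nearest neighbors.
p <- c(0, 1, 2, 3, 10)
k <- 2
d <- as.matrix(dist(p))
diag(d) <- Inf                         # exclude self from the kNN search

nn_list <- lapply(seq_along(p), function(i) c(i, order(d[i, ])[seq_len(k)]))

# Number of entries two neighbor lists have in common.
shared <- function(i, j) length(intersect(nn_list[[i]], nn_list[[j]]))

nn_list[[1]]   # 1 2 3
nn_list[[2]]   # 2 1 3
shared(1, 2)   # 3: points 1 and 2 themselves plus point 3
shared(2, 3)   # 2: points 2 and 3 themselves
```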

Fast nearest neighbors search with kNN() is only used if x is a matrix. In this case Euclidean distance is used.

References

R. A. Jarvis and E. A. Patrick. 1973. Clustering Using a Similarity Measure Based on Shared Near Neighbors. IEEE Trans. Comput. 22, 11 (November 1973), 1025-1034.

See Also

kNN

Examples

library(dbscan)

data(iris)
iris <- as.matrix(iris[, 1:4])  # only use numeric attributes

# use a shared neighborhood of 10 points and require 6 shared neighbors
cl <- jpclust(iris, k = 10, kt = 6)
pairs(iris, col = cl)

# use a precomputed kNN object instead of the original data.
nn <- kNN(iris, k = 15)
nn

cl <- jpclust(nn, k = 10, kt = 6)
