dbscan (version 1.1-3)

sNN: Shared Nearest Neighbors

Description

Calculates the number of shared nearest neighbors.

Usage

sNN(x, k, kt = NULL, sort = TRUE, search = "kdtree", bucketSize = 10,
  splitRule = "suggest", approx = 0)

Arguments

x

a data matrix, a dist object or a kNN object.

k

number of neighbors to consider to calculate the shared nearest neighbors.

kt

threshold on the number of shared nearest neighbors, used to create the shared nearest neighbor graph. Edges are only preserved if kt or more neighbors are shared.

search

nearest neighbor search strategy (one of "kdtree", "linear" or "dist").

sort

sort the neighbors by distance? Note that this is expensive and sort = FALSE is much faster. kNN objects can be sorted using sort().

bucketSize

maximum size of the kd-tree leaves.

splitRule

rule to split the kd-tree. One of "STD", "MIDPT", "FAIR", "SL_MIDPT", "SL_FAIR" or "SUGGEST" (SL stands for sliding). "SUGGEST" uses ANN's best guess.

approx

use approximate nearest neighbors. All NN up to a distance of a factor of (1 + approx) eps may be used. Some actual NN may be omitted, leading to spurious clusters and noise points. In exchange, the search is significantly faster.

Value

An object of class sNN (subclass of kNN and NN) containing a list with the following components:

id

a matrix with the ids of the nearest neighbors.

dist

a matrix with the distances to the nearest neighbors.

shared

a matrix with the number of shared nearest neighbors.

k

the value of k used.

Details

The number of shared nearest neighbors of two points is the size of the intersection of their kNN neighborhoods. Note that each point is considered to be part of its own kNN neighborhood. The range for the shared nearest neighbors is [0,k].
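
The definition above can be illustrated with a brute-force sketch in base R. This is not the package's implementation, and its counts may differ from those returned by sNN. shared_nn_count() is a hypothetical helper introduced only for this illustration. It assumes that a neighborhood holds k members (the point itself plus its k - 1 closest other points), which is one way to reconcile the self-inclusion note with the stated range [0,k]. The last lines mimic the role of kt by keeping only pairs that share kt or more neighbors.

```r
# Illustrative brute-force sketch in base R; not the dbscan implementation.
# shared_nn_count() is a hypothetical helper introduced for this example.
shared_nn_count <- function(x, k) {
  d <- as.matrix(dist(x))  # pairwise Euclidean distances
  n <- nrow(d)

  # Assumption: a neighborhood holds k members, namely the point itself
  # plus its k - 1 closest other points.
  nbhd <- lapply(seq_len(n), function(i) {
    others <- setdiff(order(d[i, ]), i)
    c(i, others[seq_len(k - 1)])
  })

  # shared[i, j] is the size of the intersection of the two neighborhoods
  shared <- matrix(0L, n, n)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      shared[i, j] <- length(intersect(nbhd[[i]], nbhd[[j]]))
    }
  }
  shared
}

# Four points on a line: two tight pairs that are far apart
x <- matrix(c(0, 1, 10, 11), ncol = 1)
s <- shared_nn_count(x, k = 2)
s

# Keep only pairs of distinct points sharing kt or more neighbors
kt <- 2
edges <- s >= kt & row(s) != col(s)
which(edges, arr.ind = TRUE)
```

With this toy data, points within a pair share their whole neighborhood, while points from different pairs share nothing, so only the within-pair edges survive the threshold.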

See Also

NN and kNN for k nearest neighbors.

Examples

# NOT RUN {
library(dbscan)
data(iris)
x <- iris[, -5]

# find the kNN and add the number of shared nearest neighbors
k <- 5
nn <- sNN(x, k = k)
nn

# shared nearest neighbor distribution
table(as.vector(nn$shared))

# explore neighborhood of point 10
i <- 10
nn$shared[i,]

plot(nn, x)

# apply a threshold to create an sNN graph with edges
# only where 3 or more neighbors are shared.
plot(sNN(nn, kt = 3), x)
# }