Shared Nearest Neighbors
Calculates the number of shared nearest neighbors.
sNN(x, k, kt = NULL, sort = TRUE, search = "kdtree", bucketSize = 10, splitRule = "suggest", approx = 0)
- x: a data matrix, a dist object or a kNN object.
- k: number of neighbors to consider to calculate the shared nearest neighbors.
- kt: minimum threshold on the number of shared nearest neighbors needed to build the shared nearest neighbor graph. Edges are only preserved if kt or more neighbors are shared.
- search: nearest neighbor search strategy (one of "kdtree", "linear" or "dist").
- sort: sort the neighbors by distance? Note that this is expensive and sort = FALSE is much faster. kNN objects can be sorted using sort().
- bucketSize: maximum size of the kd-tree leaves.
- splitRule: rule to split the kd-tree. One of "STD", "MIDPT", "FAIR", "SL_MIDPT", "SL_FAIR" or "SUGGEST" (SL stands for sliding). "SUGGEST" uses ANN's best guess.
- approx: use approximate nearest neighbors. All NN up to a distance of a factor of (1 + approx) * eps may be used. Some actual NN may be omitted, leading to spurious clusters and noise points. However, the algorithm will enjoy a significant speedup.
The number of shared nearest neighbors of two points is the size of the intersection of their kNN neighborhoods. Note that each point is considered to be part of its own kNN neighborhood. The number of shared nearest neighbors therefore lies in the range [0, k].
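As a minimal sketch of the idea above (not the package's C implementation), the shared nearest neighbor count can be computed as the size of the intersection of two kNN neighborhoods, with each point included in its own neighborhood. The function name snn_count and the toy id matrix are illustrative assumptions, not part of the package API:

# A hypothetical helper: id is a kNN id matrix where row p lists the
# indices of p's k nearest neighbors. Each point is added to its own
# neighborhood before intersecting, as described above.
snn_count <- function(id, i, j) {
  length(intersect(c(i, id[i, ]), c(j, id[j, ])))
}

# Toy id matrix for 3 points with k = 2
id <- rbind(c(2, 3), c(1, 3), c(1, 2))
snn_count(id, 1, 2)

For real data, the id matrix would come from kNN(x, k)$id; the sNN() function computes these counts for every point/neighbor pair and stores them in the shared component of the result.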
An object of class sNN containing a list with the following components:
data(iris)
x <- iris[, -5]

# find kNN and add the number of shared nearest neighbors
k <- 5
nn <- sNN(x, k = k)
nn

# shared nearest neighbor distribution
table(as.vector(nn$shared))

# explore the neighborhood of point 10
i <- 10
nn$shared[i, ]

plot(nn, x)

# apply a threshold to create a sNN graph with edges
# only if more than 3 neighbors are shared
plot(sNN(nn, kt = 3), x)