# NAN: Natural Neighbor (NAN) algorithm to return the self-adaptive neighborhood

## Description

Function to identify natural neighbors and a suitable k-parameter for kNN graphs, as suggested by Zhu, Q., Feng, Ji. & Huang, J. (2016).

## Usage

NAN(dataset, NaN_Edges = FALSE)

## Arguments

dataset

The dataset for which natural neighbors are identified along with a k-parameter

NaN_Edges

Logical. Whether to compute the matrix of natural neighbor edges. Defaults to FALSE because the computation is expensive

## Value

NaN_Num

The number of in-degrees for observations given r

r

Natural neighbor eigenvalue. Useful as k-parameter

NaN_Edges

Matrix of edges for natural neighbors

n_NaN

The number of natural neighbors

## Details

NAN computes the natural neighbor eigenvalue and identifies natural neighbors in a dataset. The natural neighbor eigenvalue is a useful choice of k-parameter when computing a k-nearest neighborhood, for example in outlier detection, clustering or predictive modelling. Natural neighbors are defined as two observations being mutual k-nearest neighbors.
A kd-tree is used for kNN computation, using the kNN() function from the 'dbscan' package.
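
The search described above can be illustrated with a small base R sketch. This is not the package's implementation: `nan_sketch` is a hypothetical helper, it uses a brute-force distance matrix instead of a kd-tree, and the stopping rule (stop once every observation has an in-edge, or once the number of observations without an in-edge no longer changes between rounds) is an assumption based on a reading of Zhu et al. (2016).

```r
# Hypothetical helper, not part of the package: brute-force natural neighbor search.
nan_sketch <- function(x) {
  x <- as.matrix(x)
  n <- nrow(x)
  d <- as.matrix(dist(x))          # full pairwise Euclidean distances
  diag(d) <- Inf                   # an observation is never its own neighbor
  # Row i lists the other observations ordered from nearest to farthest
  nn_order <- t(apply(d, 1, order))[, seq_len(n - 1), drop = FALSE]

  in_degree <- integer(n)          # how often each observation is chosen as a neighbor
  prev_orphans <- -1L
  r <- 0L
  while (r < n - 1L) {
    r <- r + 1L
    # Each observation adds one in-edge to its r-th nearest neighbor
    in_degree <- in_degree + tabulate(nn_order[, r], nbins = n)
    orphans <- sum(in_degree == 0L)
    # Assumed stopping rule: no orphans left, or orphan count unchanged
    if (orphans == 0L || orphans == prev_orphans) break
    prev_orphans <- orphans
  }

  # Natural neighbors: pairs that are in each other's r-nearest neighborhoods
  knn <- nn_order[, seq_len(r), drop = FALSE]
  adj <- matrix(FALSE, n, n)
  adj[cbind(rep(seq_len(n), times = r), as.vector(knn))] <- TRUE
  mutual <- adj & t(adj)
  edges <- which(mutual & upper.tri(mutual), arr.ind = TRUE)

  list(r = r, in_degree = in_degree, edges = unname(edges))
}

res <- nan_sketch(iris[, 1:4])
res$r            # candidate k-parameter
nrow(res$edges)  # number of mutual neighbor pairs found by the sketch
```

The names `r`, `in_degree` and `edges` in the returned list are chosen for illustration and only loosely mirror the values documented above. For anything beyond small datasets, the full distance matrix becomes impractical and a kd-tree based kNN search is the better choice.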

## References

Zhu, Q., Feng, Ji. & Huang, J. (2016). Natural neighbor: A self-adaptive neighborhood method without parameter K. Pattern Recognition Letters. pp. 30-36. DOI: 10.1016/j.patrec.2016.05.007

## Examples

# NOT RUN {
# Select dataset
X <- iris[,1:4]
# Identify the right k-parameter
K <- NAN(X, NaN_Edges=FALSE)$r
# Use the k-setting in an arbitrary outlier detection algorithm
outlier_score <- LOF(dataset=X, k=K)
# Sort and find index for most outlying observations
names(outlier_score) <- 1:nrow(X)
sort(outlier_score, decreasing = TRUE)
# Inspect the distribution of outlier scores
hist(outlier_score)
# }