get.NN: Function to find the nearest neighbours

Description

This function locates, for each point in the test set, its nearest neighbours in the training set. Both sets must have the same number of columns and are passed as rows of the same matrix P.

The user can request either a fixed number of neighbours or a number given as a fraction of the size of the training set.
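As a rough illustration of the two ways to specify the neighbourhood size, the sketch below converts a fraction of the training set into a neighbour count. The helper name frac_to_k and the use of ceiling() for rounding are assumptions made for illustration; the rounding rule that get.NN applies is not stated on this page.

```r
# Hypothetical helper: convert a fraction p of the training set into a neighbour count.
# Rounding up with ceiling() is an assumption, not documented behaviour of get.NN.
frac_to_k <- function(p, train) {
  stopifnot(p > 0, p <= 1)
  max(1L, as.integer(ceiling(p * length(train))))
}

train <- 5:20                        # 16 training rows
k_from_p <- frac_to_k(0.3, train)    # 0.3 * 16 = 4.8, rounded up
```

Under this assumption, a call with p = 0.3 on a 16-row training set would look for the same number of neighbours as a call with k set to k_from_p.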

Usage

get.NN(P, k = 2, p = !k, test, train, dist.type = c("euclidean",
"absolute", "mahal"), nn.type = c("which", "dist", "max"))

Arguments

P

The matrix of data. Contains both the training and test sets.

k

The number of nearest neighbours sought.

p

The number of nearest neighbours sought, specified as a fraction of the training set.

test

The rows of the matrix P that contain the test data.

train

The rows of the matrix P that contain the training data.

dist.type

The type of distance to use when determining neighbours.

nn.type

What should be returned: the row locations of the neighbours in P (which), the actual distances to them (dist), or the k-th maximum distances (max).
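To make the dist.type options more concrete, here is a minimal sketch of how the three distances might be computed from one point to every row of a matrix. The helper point_dist is hypothetical, and the reading of each option is an assumption: "absolute" is taken to be the sum of absolute coordinate differences, and "mahal" is taken to be the Mahalanobis distance using the sample covariance of the rows supplied. get.NN itself may define them differently.

```r
# Hypothetical helper: distance from one point x to every row of the matrix X.
# The interpretation of each dist.type value is an assumption.
point_dist <- function(x, X, dist.type = c("euclidean", "absolute", "mahal")) {
  dist.type <- match.arg(dist.type)
  diffs <- sweep(X, 2, x)            # subtract x from every row of X
  switch(dist.type,
    euclidean = sqrt(rowSums(diffs^2)),
    absolute  = rowSums(abs(diffs)),
    # stats::mahalanobis() returns squared distances, hence the sqrt()
    mahal     = sqrt(mahalanobis(X, center = x, cov = cov(X)))
  )
}

X <- rbind(c(0, 0), c(3, 4), c(1, 1))
d_euc <- point_dist(c(0, 0), X, "euclidean")
d_abs <- point_dist(c(0, 0), X, "absolute")
```

Note that the choice of distance can change which rows count as nearest, since the Mahalanobis version rescales the coordinates by their covariance.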

Value

Returns a matrix of dimensions (Number of Nearest Neighbours) x (Rows in Test Set). Each column contains the nearest neighbours, found in the training set, of the corresponding row in the test set.
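The shape of the return value can be illustrated with a brute-force sketch in base R. The function brute_nn below is hypothetical and is not the package's implementation: it handles Euclidean distance only, and it assumes that nn.type = "max" means the distance to the k-th nearest neighbour of each test point.

```r
# Hypothetical brute-force sketch (Euclidean only); not the package's implementation.
brute_nn <- function(P, k, test, train, nn.type = c("which", "dist", "max")) {
  nn.type <- match.arg(nn.type)
  out <- sapply(test, function(i) {
    # Distances from test row i to every training row
    d <- sqrt(rowSums(sweep(P[train, , drop = FALSE], 2, P[i, ])^2))
    ord <- order(d)[seq_len(k)]          # positions of the k closest training rows
    switch(nn.type,
      which = train[ord],                # row numbers in P
      dist  = d[ord],                    # distances, nearest first
      max   = d[ord[k]])                 # assumed meaning: distance to the k-th neighbour
  })
  # Force a matrix so that every result has one column per test row
  matrix(out, ncol = length(test))
}

P <- rbind(c(0, 0), c(10, 10), c(1, 0), c(0, 2), c(9, 10), c(10, 12))
nn_which <- brute_nn(P, k = 2, test = 1:2, train = 3:6, nn.type = "which")
nn_dist  <- brute_nn(P, k = 2, test = 1:2, train = 3:6, nn.type = "dist")
```

Here nn_which and nn_dist each have k rows and one column per test row, so column j describes the neighbours of the j-th test point.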

Details

This function is used internally to compute the nearest neighbours; users need not call it directly.

Examples


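library(MASS)  # provides mvrnorm()
# Simulate 20 bivariate normal observations
mu <- c(3, 4)
Sigma <- rbind(c(1, 0.2), c(0.2, 1))
Y <- mvrnorm(20, mu = mu, Sigma = Sigma)
# Rows 1-4 form the test set; rows 5-20 form the training set
test <- 1:4
train <- 5:20
# Three nearest neighbours: row locations, distances, and k-th maximum distances
nn1a <- get.NN(Y, k = 3, test = test, train = train,
               dist.type = 'euclidean', nn.type = 'which')
nn1b <- get.NN(Y, k = 3, test = test, train = train,
               dist.type = 'euclidean', nn.type = 'dist')
nn1c <- get.NN(Y, k = 3, test = test, train = train,
               dist.type = 'euclidean', nn.type = 'max')
# Neighbours specified as a fraction (0.3) of the training set
nn2 <- get.NN(Y, p = 0.3, test = test, train = train,
              dist.type = 'euclidean', nn.type = 'which')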
