kKNN: Adaptive k-Nearest Neighbor Classifier

Project Overview

The LCCkNN R package implements an adaptive k-nearest neighbor (k-NN) classifier based on local curvature estimation. Unlike the traditional k-NN algorithm, which uses a fixed number of neighbors ($k$) for all data points, the kK-NN algorithm dynamically adjusts the neighborhood size for each sample.

The core idea is that points with low curvature can use larger neighborhoods, since the tangent space approximates the underlying data shape well there. Conversely, points with high curvature should use smaller neighborhoods, where the tangent space is only a loose approximation. This adaptive strategy helps avoid both underfitting and overfitting and improves classification performance, especially when the number of training samples is limited. The algorithm's key components include:

  • Curvature Estimation: The local Gaussian curvature is estimated by approximating the local shape operator in terms of the local covariance matrix and the local Hessian matrix. This approximation for the local shape operator of the data manifold improves the robustness to noise and outliers.
  • Adaptive Neighborhood Sizing: The curvatures are quantized into ten different scores. Based on these scores, the adaptive neighborhood adjustment is performed by pruning the edges of the k-NN graph, reducing the neighborhood size for high-curvature points and retaining a larger neighborhood for low-curvature points.
  • Improved Performance: Results on many real-world datasets indicate that the kK-NN algorithm yields superior balanced accuracy compared to the established k-NN method and another competing adaptive k-NN algorithm.
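As a rough illustration of the pruning step described above, the following base-R sketch maps quantized curvature scores to pruned neighborhood sizes. `adaptive_k` is a hypothetical helper written for illustration, not the package's actual API:

```r
# Hedged sketch: given curvature scores in 0..9 (higher = more curved)
# and a base neighborhood size k, prune the neighborhood of each point.
# One simple rule: drop one neighbor per curvature score, keeping at least 1.
adaptive_k <- function(scores, k) {
  pmax(1, k - scores)
}

scores <- c(0, 3, 9)          # low, medium, and high curvature
adaptive_k(scores, k = 7)     # returns 7, 4, 1
```

A flat point (score 0) keeps its full neighborhood of 7, while the most curved point (score 9) is pruned down to a single neighbor, matching the intuition that the tangent-space approximation degrades with curvature.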

Key Feature: Dual Quantization Support

This package offers two distinct quantization methods for adaptive neighborhood sizing:

Paper Method: The default approach, which quantizes curvatures into ten different scores (from 0 to 9), as described in the original paper.

Log2n Method: A slightly modified approach, also developed by the same author, that quantizes curvatures into k scores, where k = floor(log2(n)) is the number of neighbors and n is the number of training samples.

This dual-method support allows users to test and compare both quantization strategies within a single package.
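To make the two strategies concrete, the sketch below quantizes a vector of curvature values into integer scores using equal-width bins. `quantize_curvatures` is a hypothetical helper written for illustration and may differ from the package's `quantize` function:

```r
# Hedged sketch: map curvature values to integer scores 0..(levels - 1)
# using equal-width bins over the observed range (an assumption; the
# package may bin differently).
quantize_curvatures <- function(curv, levels) {
  r <- range(curv)
  idx <- floor((curv - r[1]) / (r[2] - r[1] + .Machine$double.eps) * levels)
  pmin(idx, levels - 1)
}

curv <- c(0.02, 0.15, 0.40, 0.88, 0.99)
n <- 100  # hypothetical training-set size

# Paper method: always ten scores, 0..9.
paper_scores <- quantize_curvatures(curv, levels = 10)

# Log2n method: floor(log2(100)) = 6 scores, 0..5.
log2n_scores <- quantize_curvatures(curv, levels = floor(log2(n)))
```

The paper method gives a fixed resolution of ten curvature levels regardless of dataset size, while the log2n method ties the resolution to the same k used for the neighborhood graph.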


Install

install.packages('LCCkNN')

Monthly Downloads

111

Version

0.1.0

License

MIT + file LICENSE

Maintainer

Gabriel Pereira

Last Published

August 27th, 2025

Functions in LCCkNN (0.1.0)

  • sigmoid: A helper sigmoid function.
  • curvature_estimation: Computes the curvatures of all samples in the training set.
  • balanced_accuracy_score: Computes balanced accuracy.
  • f1_score: Computes the F1-score.
  • quantize: Quantizes real values to integer levels.
  • point_curvature_estimation: Computes the curvature of a single test sample's neighborhood.
  • testa_KNN: Standard k-NN classifier.
  • kKNN: Adaptive k-nearest neighbor classifier.