fastDist: "dist" object computation

Description

Efficiently computes a "dist" object from a numeric matrix using various distance metrics.

Usage

fastDist(X, method = "euclidean", diag = FALSE, upper = FALSE, p = 2L)

Value

An object of class "dist" containing the pairwise distances between the rows of X.

Arguments

X: A numeric matrix.
method: A character string specifying the distance metric to use. Supported methods include "euclidean", "manhattan", "maximum", "minkowski", "cosine", and "canberra".
diag: A logical value indicating whether the diagonal entries should be displayed when the result is printed.
upper: A logical value indicating whether the upper-triangular entries should be displayed when the result is printed.
p: A positive integer giving the power of the Minkowski distance; required when method = "minkowski". The default, p = 2, is equivalent to Euclidean distance.
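
The roles of p, diag and upper can be illustrated with base R's stats::dist, which takes the same arguments. This is only a sketch under the assumption that fastDist treats them the same way; it does not call fastDist itself.

```r
# Illustration using stats::dist (base R); fastDist is assumed to behave alike.
set.seed(1)
x <- matrix(rnorm(20), nrow = 5)

# Minkowski distance with p = 2 coincides with Euclidean distance.
d_mink <- stats::dist(x, method = "minkowski", p = 2)
d_eucl <- stats::dist(x, method = "euclidean")
same_as_euclidean <- isTRUE(all.equal(as.vector(d_mink), as.vector(d_eucl)))

# diag and upper only change how the object is printed, not the stored values.
d_full <- stats::dist(x, diag = TRUE, upper = TRUE)
same_values <- identical(as.vector(d_full), as.vector(d_eucl))
```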

Author

Minh Long Nguyen edelweiss611428@gmail.com

Details

Calculates pairwise distances between rows of a numeric matrix and returns the result as a compact "dist" object, which stores the lower-triangular entries of a complete distance matrix. Supports multiple distance measures, including "euclidean", "manhattan", "maximum", "minkowski", "cosine", and "canberra". This implementation is optimised for speed, especially on large matrices.
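
To make the compact storage concrete, the sketch below builds a small "dist" object with base R's stats::dist (used here only because it is available everywhere; an object returned by fastDist is assumed to have the same layout) and inspects what is stored.

```r
# Three points in the plane; rows are observations.
x <- matrix(c(0, 0,
              3, 4,
              6, 8), ncol = 2, byrow = TRUE)

d <- stats::dist(x)        # Euclidean by default

# Only the n * (n - 1) / 2 lower-triangular entries are stored,
# in column order: d(2,1), d(3,1), d(3,2).
stored <- as.vector(d)
n_obs  <- attr(d, "Size")

# The full symmetric matrix can be recovered when needed.
full <- as.matrix(d)
```

For n rows the object holds n * (n - 1) / 2 values instead of n^2, which is why converting to a full matrix with as.matrix is best left until it is actually needed.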

Row names of X are retained as labels. If X has no row names (rownames(X) is NULL), as.character(1:nrow(X)) is used instead.
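
The labelling rule can be written out in a few lines of R. The helper label_rows below is hypothetical and exists only for illustration; it is not part of the package and is not claimed to match its internals.

```r
# Hypothetical helper restating the rule described above.
label_rows <- function(X) {
  if (is.null(rownames(X))) {
    as.character(seq_len(nrow(X)))   # fall back to "1", "2", ..., "n"
  } else {
    rownames(X)                      # keep existing row names
  }
}

unnamed <- matrix(1:6, nrow = 3)
named   <- matrix(1:6, nrow = 3, dimnames = list(c("a", "b", "c"), NULL))

labels_unnamed <- label_rows(unnamed)
labels_named   <- label_rows(named)
```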

Examples

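# Assumes the package providing fastDist is already attached.
library("microbenchmark")

# 50 observations in 4 dimensions
x <- matrix(rnorm(200), nrow = 50)

# Compare timings against stats::dist
microbenchmark(stats::dist(x, "minkowski", p = 5),
               fastDist(x, "minkowski", p = 5))

# Check that both functions return the same distances
v1 <- as.vector(stats::dist(x, "minkowski", p = 5))
v2 <- as.vector(fastDist(x, "minkowski", p = 5))
all.equal(v1, v2)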