expandDist: Expanding a distance matrix given new data

Description

Efficiently appends new "rows" to an existing "dist" object without explicitly recomputing a full pairwise distance matrix.

Usage

expandDist(distA, A, B, method = "euclidean", diag = FALSE, upper = FALSE, p = 2)

Value

A distance matrix of class "dist" for rbind(A,B).

Arguments

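distA: A "dist" object holding the pairwise distances between the observations (rows) of matrix A. It should have been computed with the same distance metric as the one passed to this function; the function does not verify this, so the user must check it manually.
A: A numeric matrix whose rows are the observations that distA was computed from.
B: A numeric matrix of new observations, with the same number of columns as A.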
method: A character string specifying the distance metric to use. Supported methods include "euclidean", "manhattan", "maximum", "minkowski", "cosine", and "canberra".
diag: A logical value indicating whether the diagonal entries should be displayed.
upper: A logical value indicating whether the upper triangular entries should be displayed.
p: A positive integer giving the power of the Minkowski distance, used when method = "minkowski". The default, p = 2, is equivalent to the Euclidean distance.
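
The relationship between p and the Euclidean default can be checked with base R's stats::dist, which also takes method and p arguments. This is a small illustration in base R only, not the package's own implementation; the matrix X is made up for the example.

```r
# Illustrative data (hypothetical): 5 observations in 3 dimensions.
set.seed(1)
X <- matrix(rnorm(15), nrow = 5)

# Minkowski distance with p = 2 coincides with the Euclidean distance.
d_mink2 <- dist(X, method = "minkowski", p = 2)
d_eucl  <- dist(X, method = "euclidean")
same_as_euclidean <- isTRUE(all.equal(as.vector(d_mink2), as.vector(d_eucl)))

# With p = 1 it coincides with the Manhattan distance instead.
d_mink1 <- dist(X, method = "minkowski", p = 1)
d_manh  <- dist(X, method = "manhattan")
same_as_manhattan <- isTRUE(all.equal(as.vector(d_mink1), as.vector(d_manh)))
```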

Author

Minh Long Nguyen edelweiss611428@gmail.com

Details

Expands an existing distance matrix of class "dist" for matrix A, given new data B, without explicitly computing the distance matrix of rbind(A,B). This supports multiple commonly used distance measures and is optimised for speed.
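
The idea of expanding rather than recomputing can be sketched in base R for the Euclidean case. This is an assumption-laden illustration, not the package's optimised implementation: the function name expand_dist_sketch is hypothetical, only the Euclidean metric is handled, and the result is assembled through a full square matrix for readability.

```r
# Hypothetical helper (not part of the package): Euclidean distances only.
expand_dist_sketch <- function(distA, A, B) {
  nA <- nrow(A)
  nB <- nrow(B)
  idxA <- seq_len(nA)
  idxB <- nA + seq_len(nB)
  full <- matrix(0, nA + nB, nA + nB)

  # A-A block: reused from the existing "dist" object, not recomputed.
  full[idxA, idxA] <- as.matrix(distA)

  # B-A block: distance from each new row of B to every row of A.
  tA <- t(A)
  cross <- matrix(0, nB, nA)
  for (i in seq_len(nB)) {
    # B[i, ] is recycled down each column of tA, i.e. subtracted from each row of A.
    cross[i, ] <- sqrt(colSums((tA - B[i, ])^2))
  }
  full[idxB, idxA] <- cross
  full[idxA, idxB] <- t(cross)

  # B-B block: pairwise distances among the new rows.
  full[idxB, idxB] <- as.matrix(dist(B))

  as.dist(full)
}

# Compare against computing the distances of rbind(A, B) from scratch.
set.seed(1)
A <- matrix(rnorm(100), nrow = 20)
B <- matrix(rnorm(250), nrow = 50)
expanded <- as.vector(expand_dist_sketch(dist(A), A, B))
direct   <- as.vector(dist(rbind(A, B)))
sketch_matches <- isTRUE(all.equal(expanded, direct))
```

Only the B-A and B-B blocks are new work here; the A-A block is copied from distA, which is why distA must have been computed with the same metric.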

Row names are retained. If either rownames(A) or rownames(B) is NULL, as.character(1:(nrow(A)+nrow(B))) is used as the row names instead.
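
The row-name rule can be written out in a few lines of base R. This is a sketch of the behaviour described above, not the package's code; the helper name expanded_labels is hypothetical, and concatenating rownames(A) and rownames(B) when both are present is an assumption about what "retained" means.

```r
# Hypothetical helper (not part of the package).
expanded_labels <- function(A, B) {
  if (is.null(rownames(A)) || is.null(rownames(B))) {
    # Fallback described in the text: plain indices as character labels.
    as.character(1:(nrow(A) + nrow(B)))
  } else {
    # Assumption: existing names are kept in rbind(A, B) order.
    c(rownames(A), rownames(B))
  }
}

A <- matrix(0, nrow = 2, ncol = 2, dimnames = list(c("a1", "a2"), NULL))
B <- matrix(0, nrow = 3, ncol = 2)
labels_fallback <- expanded_labels(A, B)  # B has no row names yet
rownames(B) <- c("b1", "b2", "b3")
labels_kept <- expanded_labels(A, B)      # both matrices are named
```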

Examples

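# A is 20 x 5 and B is 50 x 5; both must have the same number of columns.
set.seed(1)
A = matrix(rnorm(100), nrow = 20)
B = matrix(rnorm(250), nrow = 50)
AB = rbind(A,B)
# Distance matrix of A alone, then expanded with the rows of B.
distA = fastDist(A)
v1 = as.vector(expandDist(distA, A, B))
# Reference: distance matrix computed directly on the combined data.
v2 = as.vector(fastDist(AB))
# Should be TRUE if the two approaches agree.
all.equal(v1, v2)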
