dbrobust (version 1.0.0)

make_euclidean: Force a Pairwise Squared Distance Matrix to Euclidean Form

Description

Given a pairwise squared distance matrix \(D\) (where \(D[i,j] = d(i,j)^2\)), this function ensures that \(D\) corresponds to a valid Euclidean squared distance matrix. The correction is based on the weighted Gram matrix \(G_w = -\frac{1}{2} J_w D J_w^\top\), where \(J_w = I_n - \mathbf{1} w^\top\) is the centering matrix defined by the weight vector \(w\).
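As a rough illustration of the Gram matrix defined above, the following base R sketch builds \(J_w\) and \(G_w\) by hand and inspects the eigenvalues. The helper name weighted_gram is hypothetical and is not part of dbrobust; the package's internal implementation may differ.

```r
# Hypothetical helper (not part of dbrobust): weighted Gram matrix of a
# squared distance matrix D under weights w.
weighted_gram <- function(D, w) {
  n <- nrow(D)
  w <- w / sum(w)                        # normalize weights to sum to 1
  J <- diag(n) - outer(rep(1, n), w)     # J_w = I_n - 1 w^T
  G <- -0.5 * J %*% D %*% t(J)           # G_w = -1/2 J_w D J_w^T
  (G + t(G)) / 2                         # symmetrize against rounding error
}

# Squared Euclidean distances between four points in the plane
X <- matrix(c(0, 0,
              1, 0,
              0, 1,
              2, 2), ncol = 2, byrow = TRUE)
D <- as.matrix(dist(X))^2

G <- weighted_gram(D, rep(1, nrow(X)))
lambda <- eigen(G, symmetric = TRUE, only.values = TRUE)$values

# For genuinely Euclidean input, no eigenvalue is meaningfully negative
min(lambda) > -1e-10
```

Because D here comes from actual point coordinates, the smallest eigenvalue is zero up to rounding error, which is the situation in which no correction would be needed.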

Usage

make_euclidean(D, w, tol = 1e-10)

Value

A list with components:

D_euc

Corrected pairwise squared Euclidean distance matrix (n x n).

eigvals_before

Eigenvalues of the weighted Gram matrix before correction.

eigvals_after

Eigenvalues of the weighted Gram matrix after correction.

transformed

Logical, TRUE if correction was applied, FALSE otherwise.

Arguments

D

Numeric square matrix (n x n) of pairwise squared distances. Must be symmetric with zeros on the diagonal.

w

Numeric vector of weights (length n). Internally normalized to sum to 1.

tol

Numeric tolerance for detecting negative eigenvalues (default: 1e-10).

Details

If the smallest eigenvalue \(\lambda_{\min}\) of \(G_w\) is below \(-\text{tol}\), the function corrects \(D\) by adding a constant shift to its off-diagonal entries so that the Gram matrix becomes positive semi-definite, following the approach of Lingoes (1971) and Mardia (1978):

$$ D_{\text{new}} = D + 2 c \mathbf{1} \mathbf{1}^\top - 2 c I_n, $$

where \(c = |\lambda_{\min}|\). The shift adds \(2c\) to every off-diagonal squared distance and leaves the diagonal at zero. If \(\lambda_{\min} \ge -\text{tol}\), no correction is applied and transformed is FALSE.
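The constant-shift step can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the dbrobust source: weighted_gram, min_eig and shift_correct are hypothetical names, and the example uses uniform weights, for which a shift of \(|\lambda_{\min}|\) moves the smallest eigenvalue of the Gram matrix to zero (up to rounding error).

```r
# Hypothetical helpers (not part of dbrobust)
weighted_gram <- function(D, w) {
  n <- nrow(D)
  w <- w / sum(w)                        # normalize weights to sum to 1
  J <- diag(n) - outer(rep(1, n), w)     # J_w = I_n - 1 w^T
  G <- -0.5 * J %*% D %*% t(J)
  (G + t(G)) / 2                         # symmetrize against rounding error
}

min_eig <- function(G) {
  min(eigen(G, symmetric = TRUE, only.values = TRUE)$values)
}

# Sketch of the shift: add 2c to every off-diagonal entry when needed
shift_correct <- function(D, w, tol = 1e-10) {
  lam_min <- min_eig(weighted_gram(D, w))
  if (lam_min >= -tol) {
    return(list(D_euc = D, transformed = FALSE))
  }
  c_shift <- abs(lam_min)
  n <- nrow(D)
  D_new <- D + 2 * c_shift * (matrix(1, n, n) - diag(n))
  list(D_euc = D_new, transformed = TRUE)
}

# Squared distances for d12 = 1, d23 = 1, d13 = 3: the triangle inequality
# fails (1 + 1 < 3), so no Euclidean configuration exists
D <- matrix(c(0, 1, 9,
              1, 0, 1,
              9, 1, 0), nrow = 3, byrow = TRUE)
w <- rep(1, 3)

res <- shift_correct(D, w)
res$transformed
min_eig(weighted_gram(D, w))           # negative before the shift
min_eig(weighted_gram(res$D_euc, w))   # approximately zero after the shift
```

Writing the shift as 2c times (ones matrix minus identity) makes it explicit that diagonal entries stay at zero while every off-diagonal squared distance grows by the same amount.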

References

Lingoes (1971)

Mardia (1978)

Examples

# Load the package (provides make_euclidean) and the example dataset
library(dbrobust)
data("Data_HC_contamination")

# Reduce dataset to first 50 rows
Data_small <- Data_HC_contamination[1:50, ]

# Select only continuous variables
cont_vars <- names(Data_small)[1:4]
Data_cont <- Data_small[, cont_vars]

# Compute squared Euclidean distance matrix
dist_mat <- as.matrix(dist(Data_cont))^2

# Introduce a small non-Euclidean distortion
dist_mat[1, 2] <- dist_mat[1, 2] * 0.5
dist_mat[2, 1] <- dist_mat[1, 2]

# Uniform weights
weights <- rep(1, nrow(Data_cont))

# Apply Euclidean correction
res <- make_euclidean(dist_mat, weights)

# Check results (minimum eigenvalues before/after)
res$transformed
min(res$eigvals_before)
min(res$eigvals_after)

# First 5x5 block of corrected matrix
round(res$D_euc[1:5, 1:5], 4)