Learn R Programming

FuzzySpec (version 1.0.0)

make.adjacency: A General Framework for Adjacency Matrix Construction

Description

Builds an \(n \times n\) adjacency matrix from data using Euclidean or variable-weighted distances, with optional locally-adaptive scalings and optional variable weighting by shared-nearest-neighbours (SNN), similarity ranks (SIM), or both. The options reproduce a family of adjacency matrices including the variants described by Ghashti, Hare and Thompson (2025).

Usage

make.adjacency(data,
                method = "vw",
                isLocWeighted = FALSE,
                isModWeighted = FALSE,
                isSparse = FALSE,
                ModMethod = NULL,
                scale = FALSE,
                sig = 1,
                radius = NULL,
                cv.method = "cv.ls")

Value

An \(n \times n\) numeric adjacency matrix \(W\) with ones on the diagonal.

Arguments

data

Numeric matrix or data frame of size \(n \times p\). Required.

method

Distance construction "eu" for Euclidean; "vw" for variable-weighted scaling (see Details). Defaults to "v".

isLocWeighted

Logical. If TRUE, use locally-adaptive scalings \(\sigma_i\) (Zelnik–Perona) to self-tune the kernel; otherwise use a global scale sig. Defaults to FALSE.

isModWeighted

Logical. If TRUE, apply a weighting matrix \(M\) based on SNN and/or SIM (see Details). Defaults to FALSE

isSparse

Logical. If TRUE and isModWeighted = TRUE, then SNN (ModMethod = "snn") and/or SIM (ModMethod = "sim") arguments creates a sparse adjacency matrix (see Details). Defaults to FALSE.

ModMethod

One of "snn", "sim", or "both" when isModWeighted=TRUE.

scale

Logical; standardize columns of data before distance construction.

sig

Positive numeric value for global kernel width, used only when isLocWeighted=FALSE.

radius

Integer \(r\). If NULL, \(r\) is estimated with find.radius on the constructed distance matrix.

cv.method

Bandwidth selector for method="vw" passed to np::npudensbw; one of "cv.ml" or "cv.ls".

Mapping to named adjacency matrices

Relating to the paper by Ghashti, Hare, and Thompson (2025), let "vw" denote the variable-weighted distance method="vw" and "eu" for the traditional squared Euclidean distance; "la" denotes locally-adaptive scaling when isLocWeighted=TRUE (Zelnik-Manor and Perona, 2004); "id" denotes identity for isModWeighted = FALSE, and "sim", "snn" and "simsnn" denote weightings for \(M\) described above.

To reproduce results from the 2025 paper, below are a few examples of adjacency construction:

  • vw-id: \(\exp(-D^2/\sigma^2)\) with method="vw", isLocWeighted=FALSE, isModWeighted=FALSE.

  • vwla-id: \(\exp(-D^2/(\sigma_i\sigma_j))\) with method="vw", isLocWeighted=TRUE, isModWeighted=FALSE.

  • vw-sim: \(0.5\,\exp(-D^2/\sigma^2)\,(1+\text{SIM})\) with method="vw", isLocWeighted=FALSE, isModWeighted=TRUE, ModMethod="sim", isSparse=FALSE.

  • vw-snns: \(0.5\,\exp(-D^2/\sigma^2)\,(1+\text{SNN})\) with method="vw", isLocWeighted=FALSE, isModWeighted=TRUE, ModMethod="snn", isSparse=TRUE.

  • vw-simsnns: \(0.25\,\exp(-D^2/\sigma^2)\,(1+\text{SIM})(1+\text{SNN})\) with method="vw", isLocWeighted=FALSE, isModWeighted=TRUE, ModMethod="both", isSparse=TRUE.

Also note that:

  • If radius is NULL, \(r\) is chosen adaptively via find.radius on the constructed distance matrix.

  • method="vw" requires npudensbw for variable weighted bandwidths, with default np::npudensbw(data, bwmethod = cv.method, nmulti = 3) (Hayfield and Racine, 2008).

Notes

  • When r is determined by find.radius, we implement a modified version of the Natural Neighbors algorithm from Zhu et al. (2016).

  • SNN is a modified version of the Shared Nearest Neighbors algorithm from Jarvis and Patrick (1973).

  • More information on locally-adaptive scalings are seen in Zelnik-Manor and Perona (2004).

Details

Step 1: Distance.

  • method="eu": \(D_{ij}=\|x_i-x_j\|_2\).

  • method="vw": compute product-kernel bandwidths h via np::npudensbw, set feature-weights \(w_j=1/h_j^2\), rescale data as \(\tilde{x}_{ij}=\sqrt{w_j}\,x_{ij}\), then \(D_{ij}=\|\tilde{x}_i-\tilde{x}_j\|_2\). (Variable-weighted metric.)

Step 2:Similarity kernel.

  • Locally-adaptive scaling from Zelnik-Manor: if isLocWeighted=TRUE, compute \(\sigma_i\) as the distance to the \(r\)-th neighbour with compute.sigma and set $$S_{ij}=\exp\!\Big(-D_{ij}^2 / (\sigma_i \sigma_j)\Big),\quad S_{ii}=1.$$

  • Global scale: if isLocWeighted=FALSE, $$S_{ij}=\exp\!\Big(-D_{ij}^2 / \sigma^2\Big),\quad S_{ii}=1.$$

Step 3: Weighting Matrix (optional).

Let \(\text{SNN}_{ij}\) be the shared-\(r\)-NN overlap fraction (see compute.SNN) for observations \(i\) and \(j\), and \(\rho_i\) be the \((r+1)\)-largest entry of observation \(i\) in matrix \(S\). Define \(\text{SIM}_{ij}=\sqrt{\rho_i \rho_j}\). For isModWeighted=TRUE we have the following options

  • ModMethod="snn": \(M_{ij} = \begin{cases} 0.5\,(1+\text{SNN}_{ij}), & \text{if } \texttt{isSparse=FALSE},\\ \text{SNN}_{ij}, & \text{if } \texttt{isSparse=TRUE}. \end{cases}\)

  • ModMethod="sim": \(M_{ij} = \begin{cases} 0.5\,(1+\text{SIM}_{ij}), & \text{if } \texttt{isSparse=FALSE},\\ \text{SIM}_{ij}, & \text{if } \texttt{isSparse=TRUE}. \end{cases}\)

  • ModMethod="both": \(M_{ij} = \begin{cases} 0.25\,(1+\text{SIM}_{ij})\,(1+\text{SNN}_{ij}), & \text{if } \texttt{isSparse=FALSE},\\ \text{SIM}_{ij}\cdot \text{SNN}_{ij}, & \text{if } \texttt{isSparse=TRUE}. \end{cases}\)

The returned adjacency is \(W = S \circ M\) when isModWeighted = TRUE, otherwise \(W=S\). These choices align with the table of named adjacency matrices (see Mapping below).

References

Ghashti, J. S., Hare, W., and J. R. J. Thompson (2025). Variable-weighted adjacency constructions for fuzzy spectral clustering. Submitted.

Hayfield, T., and J. S. Racine (2008). Nonparametric Econometrics: The np Package. Journal of Statistical Software 27(5).

Jarvis, R. A., and A. E. Patrick (1973). Clustering using a similarity measure based on shared near neighbors. IEEE Transactions on Computers, 22(11), 1025-1034.

Zelnik-Manor, L., and P. Perona (2004). Self-tuning spectral clustering. Advances in Neural Information Processing Systems, 17.

Zhu, Q., Feng, J., and J. Huang (2016). Natural neighbor: A self-adaptive neighborhood method without parameter K. Pattern Recognition Letters, 80, 30-36.

See Also

gen.fuzzy, plot.fuzzy, rNN.dist, find.radius, compute.sigma, compute.SNN, fuzzy.spectral.clustering, npudensbw

Examples

Run this code
set.seed(1)
X <- scale(matrix(rnorm(200), 100, 2))


W1 <- make.adjacency(X,
                    method = "eu",
                    isLocWeighted = TRUE) # "eula-id" named adjacency

W2 <- make.adjacency(X,
                    method = "vw",
                    isLocWeighted = TRUE) # "vwla-id" named adjacency

# compare W(xi,xj) i,j = 1,...,5 for eu/vw pair W1 and W2
W1[1:5,1:5]
W2[1:5,1:5]

W3 <- make.adjacency(X,
                    method = "eu",
                    isLocWeighted = TRUE,
                    isModWeighted = TRUE,
                    ModMethod = "snn",
                    isSparse = FALSE) # "eula-snn" named adjacency

W4 <- make.adjacency(X,
                    method = "vw",
                    isLocWeighted = TRUE,
                    isModWeighted = TRUE,
                    ModMethod = "snn",
                    isSparse = FALSE) # "vwla-snn" named adjacency

# compare W(xi,xj) i,j = 1,...,5 for eu/vw pair W3 and W4
W3[1:5,1:5]
W4[1:5,1:5]

Run the code above in your browser using DataLab