fuzzy.spectral.clustering: Fuzzy Spectral Clustering with Normalized Eigenvectors

Description

Implementation of the FVIBES algorithm by Ghashti, Hare, and Thompson (2025). Performs spectral clustering on a similarity (adjacency) matrix and returns either fuzzy c-means memberships or Gaussian mixture posterior probabilities computed on the leading normalized eigenvectors.

Usage

fuzzy.spectral.clustering(W = NULL,
                          k = NULL,
                          m = NULL,
                          method = "CM",
                          nstart = 10,
                          max.iter = 1000)

Value

A list with components:

cluster: An integer vector of length \(n\) containing hard cluster labels.
u: An \(n \times k\) matrix of fuzzy cluster memberships: for "CM", fuzzy c-means memberships \(U\); for "GMM", posterior probabilities \(Z\).
evecs: The \(n \times k\) matrix \(Y\) of row-normalized leading eigenvectors, i.e., the spectral embedding.
centers: Cluster centers for the embedding matrix \(Y\).

Arguments

W: A nonnegative \(n \times n\) similarity (adjacency) matrix. Diagonal entries are set to 0 internally. Required.
k: Integer number of clusters. Required.
m: Fuzzifier (fuzziness parameter) for fuzzy c-means; only used when method = "CM". If not provided, m is set to 2.
method: Clustering method applied to the spectral embedding: "CM" for fuzzy c-means via fclust, or "GMM" for Gaussian mixtures via mclust. Default is "CM".
nstart: Number of random starts for fclust::FKM when method = "CM".
max.iter: Maximum number of iterations for fclust::FKM when method = "CM".
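
As an illustration of the requirements on W, the following base R sketch builds a Gaussian similarity matrix from toy data and checks it. The helper check_similarity is hypothetical and not part of the package, and the Gaussian kernel with bandwidth 1 is only an assumed example of a similarity; the package's own make.adjacency constructions may differ.

```r
# Hypothetical helper (not a package function): verify that W is a
# nonnegative square numeric matrix and zero its diagonal, mirroring the
# documented internal behavior of fuzzy.spectral.clustering.
check_similarity <- function(W) {
  stopifnot(is.matrix(W), is.numeric(W),
            nrow(W) == ncol(W), all(W >= 0))
  diag(W) <- 0
  W
}

set.seed(1)
X <- matrix(rnorm(20), ncol = 2)        # 10 toy points in 2 dimensions
W <- exp(-as.matrix(dist(X))^2 / 2)     # Gaussian similarity, bandwidth 1 (assumed)
W <- check_similarity(W)
```

A matrix that passes this check has the shape and sign constraints described for W above.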

Details

Let \(D\) be the diagonal degree matrix with \(D_{ii} = \sum_j W_{ij}\). The routine forms the symmetrically normalized similarity matrix \(L = D^{-1/2} W D^{-1/2}\) (Ng, Jordan, and Weiss, 2001), computes its \(k\) leading eigenvectors (those with the largest eigenvalues), stacks them as the columns of \(X \in \mathbb{R}^{n \times k}\), and row-normalizes \(X\) to obtain \(Y\), where \(Y_{i\cdot} = X_{i\cdot} / \|X_{i\cdot}\|_2\). Clustering is then performed on the rows of \(Y\).
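
The embedding step can be sketched in base R as follows. This is a minimal illustration of the formulas above, not the package's source code: the function name spectral_embedding is hypothetical, and the sketch assumes W is symmetric and that every row has positive degree.

```r
# Hypothetical sketch of the spectral embedding (not the package's code).
spectral_embedding <- function(W, k) {
  diag(W) <- 0                                  # no self-similarity
  d <- rowSums(W)                               # degrees D_ii
  stopifnot(all(d > 0))                         # assumed: no isolated points
  d_inv_sqrt <- 1 / sqrt(d)
  L <- W * outer(d_inv_sqrt, d_inv_sqrt)        # D^{-1/2} W D^{-1/2}
  eig <- eigen(L, symmetric = TRUE)             # eigenvalues in decreasing order
  X <- eig$vectors[, seq_len(k), drop = FALSE]  # k leading eigenvectors
  X / sqrt(rowSums(X^2))                        # row-normalize to unit length
}

# Toy similarity with two well-separated blocks of three points each
W <- matrix(0.01, 6, 6)
W[1:3, 1:3] <- 1
W[4:6, 4:6] <- 1
Y <- spectral_embedding(W, k = 2)
rowSums(Y^2)                                    # squared row norms of Y
```

In this toy case, points in the same block map to the same row of Y, which is what makes the subsequent clustering step easy.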

When method = "CM", the rows of \(Y\) are clustered by fuzzy c-means (Bezdek, 1981) using fclust::FKM with fuzzy parameter m, nstart random starts, and at most max.iter iterations. When method = "GMM", they are clustered by a Gaussian mixture model (see McLachlan and Krishnan, 2008) using mclust::Mclust with G = k.
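
To make the role of m concrete, here is a minimal base R sketch of the standard fuzzy c-means iteration (alternating center and membership updates). It is written for illustration only: fcm_sketch is a hypothetical name, it uses a single random start, and it is not the algorithm as implemented in fclust::FKM, which offers multiple starts and other options. Taking hard labels as the column of largest membership is also an assumption of the sketch.

```r
# Hypothetical single-start fuzzy c-means sketch (not fclust::FKM).
fcm_sketch <- function(Y, k, m = 2, max.iter = 100, tol = 1e-8) {
  n <- nrow(Y)
  U <- matrix(runif(n * k), n, k)
  U <- U / rowSums(U)                            # memberships sum to 1 per row
  for (iter in seq_len(max.iter)) {
    Um <- U^m
    centers <- crossprod(Um, Y) / colSums(Um)    # weighted means, k x p
    # squared Euclidean distance from every row of Y to every center (n x k)
    D2 <- sapply(seq_len(k), function(j) {
      rowSums(sweep(Y, 2, centers[j, ])^2)
    })
    D2 <- pmax(D2, 1e-12)                        # guard against division by zero
    Unew <- D2^(-1 / (m - 1))                    # u_ij proportional to d_ij^(-2/(m-1))
    Unew <- Unew / rowSums(Unew)
    converged <- max(abs(Unew - U)) < tol
    U <- Unew
    if (converged) break
  }
  # centers are those computed at the start of the final iteration
  list(u = U, centers = centers,
       cluster = max.col(U, ties.method = "first"))
}

set.seed(1)
Y <- rbind(matrix(rnorm(20, mean = 0, sd = 0.1), ncol = 2),
           matrix(rnorm(20, mean = 3, sd = 0.1), ncol = 2))
res <- fcm_sketch(Y, k = 2, m = 1.5)
round(head(res$u), 3)                            # memberships of the first rows
```

Values of m closer to 1 push the memberships toward 0 or 1, while larger values make them more diffuse.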

References

Bezdek, J. C. (1981). Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press, New York.

Ferraro, M.B., Giordani, P., and A. Serafini (2019). fclust: An R Package for Fuzzy Clustering. The R Journal, 11.

Ghashti, J. S., Hare, W., and J. R. J. Thompson (2025). Variable-weighted adjacency constructions for fuzzy spectral clustering. Submitted.

McLachlan, G. and T. Krishnan (2008). The EM algorithm and extensions, Second Edition. John Wiley & Sons.

Ng, A., Jordan, M., and Y. Weiss (2001). On spectral clustering: Analysis and an algorithm. Advances in Neural Information Processing Systems, 14.

Scrucca, L., Fraley, C., Murphy, T.B., and A. E. Raftery (2023). Model-Based Clustering, Classification, and Density Estimation Using mclust in R. Chapman & Hall.

Examples

set.seed(1)
d <- gen.fuzzy(n = 300,
               dataset = "spirals",
               noise = 0.18)

plot.fuzzy(d) # visualize data generating process

adj <- make.adjacency(data = d$X,
                      method = "vw",
                      isLocWeighted = TRUE,
                      isModWeighted = FALSE,
                      isSparse = FALSE,
                      ModMethod = NULL,
                      scale = FALSE,
                      sig = 1,
                      radius = NULL,
                      cv.method = "cv.ls") # vwla-id from paper

spectRes <- fuzzy.spectral.clustering(W = adj,
                                      k = 3,
                                      m = 1.5,
                                      method = "CM",
                                      nstart = 50,
                                      max.iter = 1000)


head(spectRes$u) # first 6 rows of U

plotDf <- list(
  X = d$X,
  y = factor(spectRes$cluster),
  U = spectRes$u,
  k = 3
)

plot.fuzzy(plotDf) # visualize results

clustering.accuracy(d$y, spectRes$cluster) # compare results
