Learn R Programming

fdars (version 0.3.3)

cluster.fcm: Fuzzy C-Means Clustering for Functional Data

Description

Performs fuzzy c-means clustering on functional data, where each curve has a membership degree to each cluster rather than a hard assignment.

Usage

cluster.fcm(fdataobj, ncl, m = 2, max.iter = 100, tol = 1e-06, seed = NULL)

Value

A list of class 'fuzzycmeans.fd' with components:

membership

Matrix of membership degrees (n x ncl). Each row sums to 1.

cluster

Hard cluster assignments (argmax of membership).

centers

An fdata object containing the cluster centers.

objective

Final value of the objective function.

fdataobj

The input functional data object.

Arguments

fdataobj

An object of class 'fdata'.

ncl

Number of clusters.

m

Fuzziness parameter (default 2). Must be > 1. Higher values give softer cluster boundaries.

max.iter

Maximum number of iterations (default 100).

tol

Convergence tolerance (default 1e-6).

seed

Optional random seed for reproducibility.

Details

Fuzzy c-means minimizes the objective function: $$J = \sum_{i=1}^n \sum_{c=1}^k u_{ic}^m ||X_i - v_c||^2$$ where u_ic is the membership of curve i in cluster c, v_c is the cluster center, and m is the fuzziness parameter.

The membership degrees are updated as: $$u_{ic} = 1 / \sum_{j=1}^k (d_{ic}/d_{ij})^{2/(m-1)}$$

When m approaches 1, FCM becomes equivalent to hard k-means. As m increases, the clusters become softer (more overlap). m = 2 is the most common choice.

See Also

cluster.kmeans for hard clustering

Examples

Run this code
# Create functional data with THREE groups - one genuinely overlapping
set.seed(42)
t <- seq(0, 1, length.out = 50)
n <- 45
X <- matrix(0, n, 50)

# Group 1: Sine waves centered at 0
for (i in 1:15) X[i, ] <- sin(2*pi*t) + rnorm(50, sd = 0.2)
# Group 2: Sine waves centered at 1.5 (clearly separated from group 1)
for (i in 16:30) X[i, ] <- sin(2*pi*t) + 1.5 + rnorm(50, sd = 0.2)
# Group 3: Between groups 1 and 2 (true overlap - ambiguous membership)
for (i in 31:45) X[i, ] <- sin(2*pi*t) + 0.75 + rnorm(50, sd = 0.3)

fd <- fdata(X, argvals = t)

# Fuzzy clustering reveals the overlap
fcm <- cluster.fcm(fd, ncl = 3, seed = 123)

# Curves in group 3 (31-45) have split membership - this is the key benefit!
cat("Membership for curves 31-35 (overlap region):\n")
print(round(fcm$membership[31:35, ], 2))

# Compare to hard clustering which forces a decision
km <- cluster.kmeans(fd, ncl = 3, seed = 123)
cat("\nHard vs Fuzzy assignment for curve 35:\n")
cat("K-means cluster:", km$cluster[35], "\n")
cat("FCM memberships:", round(fcm$membership[35, ], 2), "\n")

Run the code above in your browser using DataLab