
bfcluster (version 1.0.0)

bf_R2: R² for Cluster Solutions after Buttler & Fickel (1995)

Description

Computes the proportion of explained distance variation (R²) for a given clustering solution, using a distance matrix based on the Buttler-Fickel distance. The statistic measures how much of the total pairwise distance is accounted for by the partition into clusters.

Usage

bf_R2(D, cluster)

Value

A single numeric value between 0 and 1 giving the proportion of explained distance variation. Higher values indicate a better cluster fit.

Arguments

D

A distance object of class dist, usually computed via buttler_fickel_dist().

cluster

An integer or factor vector of cluster assignments, typically obtained from cutree() or another clustering method.

Details

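The R² is defined as: $$R^2 = 1 - \frac{D_{\text{within}}}{D_{\text{total}}}$$ where \(D_{\text{total}}\) is the sum of all pairwise distances and \(D_{\text{within}}\) is the sum of the distances between pairs of observations that belong to the same cluster. By this definition, a single cluster containing all observations gives \(R^2 = 0\), and a solution in which every observation forms its own cluster gives \(R^2 = 1\).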
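The formula can be illustrated with a minimal base-R sketch. The helper name `r2_sketch` is hypothetical and is not part of bfcluster; it assumes that each unordered pair of observations is counted once and that `cluster` is in the same order as the observations in `D`. The package's own implementation may differ in these details.

```r
# Hypothetical helper illustrating R^2 = 1 - D_within / D_total.
# Not the bfcluster implementation; a sketch under the assumptions above.
r2_sketch <- function(D, cluster) {
  m <- as.matrix(D)                      # full symmetric distance matrix
  cl <- as.vector(cluster)               # works for integer and factor input
  stopifnot(length(cl) == nrow(m))

  pairs <- upper.tri(m)                  # each unordered pair exactly once
  same  <- outer(cl, cl, "==")           # TRUE where both points share a cluster

  d_total  <- sum(m[pairs])              # sum of all pairwise distances
  d_within <- sum(m[pairs & same])       # sum of within-cluster distances
  1 - d_within / d_total
}

# Toy data: two tight groups on a line, using Euclidean distance from dist()
pts <- c(0, 1, 10, 11)
D_toy <- dist(pts)
cl_toy <- c(1, 1, 2, 2)

r2_toy <- r2_sketch(D_toy, cl_toy)
print(r2_toy)
```

In this toy case the six pairwise distances sum to 42 and the two within-cluster distances sum to 2, so the sketch returns 1 - 2/42, or about 0.952.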

Examples

df <- data.frame(
  sex    = factor(c("m","f","m","f")),
  height = c(180, 165, 170, 159),
  age    = c(25, 32, 29, 28)
)

types <- c("nominal", "metric", "metric")

D <- buttler_fickel_dist(df, types)
hc <- hclust(D)
cl <- cutree(hc, k = 2)

bf_R2(D, cl)
