Tools for Cluster Analysis

T4cluster is an R package that serves as a computational toolkit for cluster analysis, with broad coverage of the field's main topics. It contains several classes of algorithms for

  • Clustering with Vector-Valued Data
  • Clustering with Functional Data
  • Clustering with Empirical Distributions
  • Clustering on the Unit Hypersphere
  • Subspace Clustering
  • Measures : Compare Two Clusterings
  • Measures : Quality of a Clustering
  • Learning with Multiple Clusterings

and other utility functions for further use. If you would like additional functionality or have suggestions, please contact the maintainer.

Installation

You can install the released version of T4cluster from CRAN with:

install.packages("T4cluster")

And the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("kisungyou/T4cluster")

Minimal Example : Clustering SMILEY Data

T4cluster offers a variety of clustering algorithms through a common interface. In this example, we show a basic pipeline with the SMILEY dataset, which can be generated as follows:

# load the library
library(T4cluster)

# generate the data
smiley = T4cluster::genSMILEY(n=200)
data   = smiley$data
label  = smiley$label

# visualize
plot(data, pch=19, col=label, xlab="", ylab="", main="SMILEY Data")

Each component of the face is treated as one cluster, so the data has 4 clusters. Here, we compare 4 different methods: (1) k-means (kmeans), (2) k-means++ (kmeanspp), (3) Gaussian mixture model (gmm), and (4) spectral clustering by Ng, Jordan, and Weiss (scNJW).
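Of the four methods, the spectral one is probably the least familiar, so here is a minimal base-R sketch of spectral clustering in the style of Ng, Jordan, and Weiss (2002), the reference the function list gives for scNJW. It is not the package's implementation: the function name `njw_sketch`, the Gaussian affinity `exp(-d^2 / (2 * sigma^2))`, and the use of `stats::kmeans` for the final step are assumptions made for illustration, and the package may scale `sigma` differently.

```r
# Hypothetical helper (not part of T4cluster): NJW-style spectral clustering.
njw_sketch <- function(X, k, sigma) {
  # Gaussian affinity from squared Euclidean distances, zero on the diagonal
  D2 <- as.matrix(dist(X))^2
  A  <- exp(-D2 / (2 * sigma^2))
  diag(A) <- 0

  # Symmetrically normalized affinity: D^{-1/2} A D^{-1/2}
  dinv <- 1 / sqrt(rowSums(A))
  L    <- A * outer(dinv, dinv)

  # Top-k eigenvectors (eigen() returns eigenvalues in decreasing order)
  U <- eigen(L, symmetric = TRUE)$vectors[, 1:k, drop = FALSE]

  # Normalize each row to unit length, then cluster the rows with k-means
  U <- U / sqrt(rowSums(U^2))
  stats::kmeans(U, centers = k, nstart = 10)$cluster
}

# Toy check on two well-separated blobs (synthetic data, for illustration only)
set.seed(1)
toy   <- rbind(matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
               matrix(rnorm(40, mean = 5, sd = 0.3), ncol = 2))
truth <- rep(1:2, each = 20)
cl    <- njw_sketch(toy, k = 2, sigma = 1)
print(table(cl, truth))
```

The choice of `sigma` controls how quickly affinity decays with distance, which is why the package call below passes it explicitly.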

# run algorithms
run1 = T4cluster::kmeans(data, k=4)
run2 = T4cluster::kmeanspp(data, k=4)
run3 = T4cluster::gmm(data, k=4)
run4 = T4cluster::scNJW(data, k=4, sigma = 0.1)

# visualize
par(mfrow=c(2,2))
plot(data, pch=19, xlab="", ylab="", col=run1$cluster, main="k-means")
plot(data, pch=19, xlab="", ylab="", col=run2$cluster, main="k-means++")
plot(data, pch=19, xlab="", ylab="", col=run3$cluster, main="gmm")
plot(data, pch=19, xlab="", ylab="", col=run4$cluster, main="scNJW")
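The plots give a visual comparison; since the generator also returns true labels, the four runs can be scored numerically as well. The package lists compare.adjrand for this purpose, but its exact signature is not shown here, so the following is a self-contained base-R sketch of the adjusted Rand index instead. The function name `adj_rand_sketch` is hypothetical.

```r
# Hypothetical helper (not part of T4cluster): adjusted Rand index
# between two label vectors of equal length.
adj_rand_sketch <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) >= 2)

  tab <- table(x, y)   # contingency table of the two labelings
  n   <- sum(tab)

  # Numbers of point pairs that agree within cells, rows, and columns
  sum_ij <- sum(choose(tab, 2))
  sum_a  <- sum(choose(rowSums(tab), 2))
  sum_b  <- sum(choose(colSums(tab), 2))

  expected  <- sum_a * sum_b / choose(n, 2)  # value expected by chance
  max_index <- (sum_a + sum_b) / 2           # largest attainable value

  # Degenerate case (e.g. both labelings put everything in one cluster)
  if (max_index == expected) return(1)
  (sum_ij - expected) / (max_index - expected)
}

# The index ignores how clusters are numbered
ari_same  <- adj_rand_sketch(c(1, 1, 2, 2), c(2, 2, 1, 1))
ari_cross <- adj_rand_sketch(c(1, 1, 2, 2), c(1, 2, 1, 2))
print(c(ari_same, ari_cross))

# With the objects from the example above, one could then compare, e.g.:
# adj_rand_sketch(label, run1$cluster)
```

A value of 1 means the two partitions agree exactly up to relabeling, and values near 0 mean agreement no better than chance.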

  • Monthly Downloads: 1,786
  • Version: 0.1.4
  • License: MIT + file LICENSE
  • Maintainer: Kisung You
  • Last Published: September 22nd, 2025

Functions in T4cluster (0.1.4)

  • gen3S: Generate from Three 5-dimensional Subspaces in 200-dimensional space.
  • ephclust: Hierarchical Agglomerative Clustering for Empirical Distributions
  • dpmeans: DP-Means Clustering
  • genDONUTS: Generate Nested Donuts
  • genLP: Generate Line and Plane Example with Fixed Number of Components
  • funhclust: Functional Hierarchical Clustering
  • genSMILEY: Generate SMILEY Data
  • funkmeans03A: Functional K-Means Clustering by Abraham et al. (2003)
  • gbphate: Generalized Bayesian Clustering with PHATE Geometry
  • gmm: Finite Gaussian Mixture Model
  • gmm03F: Ensemble of Gaussian Mixtures with Random Projection
  • gmm16G: Weighted GMM by Gebru et al. (2016)
  • pcm: Compute Pairwise Co-occurrence Matrix
  • kmeans18B: K-Means Clustering with Lightweight Coreset
  • household: Load 'household' data
  • predict.MSM: S3 method to predict class label of new data with 'MSM' object
  • kmeanspp: K-Means++ Clustering
  • kmeans: K-Means Clustering
  • psm: Compute Posterior Similarity Matrix
  • sc10Z: Spectral Clustering by Zhang et al. (2010)
  • sc09G: Spectral Clustering by Gu and Wang (2009)
  • sc11Y: Spectral Clustering by Yang et al. (2011)
  • quality.sil: (+) Silhouette Index
  • gskmeans: Geodesic Spherical K-Means
  • sc05Z: Spectral Clustering by Zelnik-Manor and Perona (2005)
  • quality.CH: (+) CH index
  • gmm11R: Regularized GMM by Ruan et al. (2011)
  • scNJW: Spectral Clustering by Ng, Jordan, and Weiss (2002)
  • scSM: Spectral Clustering by Shi and Malik (2000)
  • scUL: Spectral Clustering with Unnormalized Laplacian
  • sc12L: Spectral Clustering by Li and Guo (2012)
  • spkmeans: Spherical K-Means Clustering
  • SSQP: Subspace Segmentation via Quadratic Programming
  • compare.adjrand: (+) Adjusted Rand Index
  • ccphate: Consensus Clustering with PHATE Geometry
  • EKSS: Ensembles of K-Subspaces
  • LRSC: Low-Rank Subspace Clustering
  • compare.rand: (+) Rand Index
  • MSM: Bayesian Mixture of Subspaces of Different Dimensions
  • SSC: Sparse Subspace Clustering
  • LSR: Least Squares Regression
  • LRR: Low-Rank Representation