Note: a newer version (0.1.4) of this package is available.

Tools for Cluster Analysis

T4cluster is an R package that serves as a computational toolkit covering a broad range of topics in cluster analysis. It contains several classes of algorithms for

  • Clustering with Vector-Valued Data
  • Clustering with Functional Data
  • Subspace Clustering
  • Measures : Compare Two Clusterings
  • Learning with Multiple Clusterings

and other utility functions for further use. To request additional functionality or make suggestions, please contact the maintainer.

Installation

You can install the released version of T4cluster from CRAN with:

install.packages("T4cluster")

And the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("kyoustat/T4cluster")

Minimal Example : Clustering SMILEY Data

T4cluster offers a variety of clustering algorithms through a common interface. This example shows a basic pipeline using the smiley dataset, which can be generated as follows:

# load the library
library(T4cluster)

# generate the data
smiley = T4cluster::gensmiley(n=200)
data   = smiley$data
label  = smiley$label

# visualize
plot(data, pch=19, col=label, xlab="", ylab="", main="SMILEY Data")

Each component of the face is treated as one cluster, so the data has 4 clusters. Here we compare 4 methods: (1) k-means (kmeans), (2) k-means++ (kmeanspp), (3) Gaussian mixture model (gmm), and (4) spectral clustering by Ng, Jordan, and Weiss (scNJW).

# run algorithms
run1 = T4cluster::kmeans(data, k=4)
run2 = T4cluster::kmeanspp(data, k=4)
run3 = T4cluster::gmm(data, k=4)
run4 = T4cluster::scNJW(data, k=4, sigma = 0.1)

# visualize the four results in a 2x2 grid
opar = par(mfrow=c(2,2))
plot(data, pch=19, xlab="", ylab="", col=run1$cluster, main="k-means")
plot(data, pch=19, xlab="", ylab="", col=run2$cluster, main="k-means++")
plot(data, pch=19, xlab="", ylab="", col=run3$cluster, main="gmm")
plot(data, pch=19, xlab="", ylab="", col=run4$cluster, main="scNJW")
par(opar)  # restore the previous graphics settings
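The plots compare the four results visually. A numerical comparison against the generating labels is also possible, and the package lists compare.rand and compare.adjrand for that purpose. The sketch below is not the package's implementation: it computes the plain Rand index by hand in base R, only to illustrate what such a measure does. The helper name rand_index and the toy label vectors are assumptions introduced for this illustration.

```r
# Rand index in base R: the fraction of point pairs on which two
# labelings agree (both put the pair together, or both put it apart).
# 'rand_index' is a hypothetical helper, not a T4cluster function.
rand_index <- function(x, y) {
  stopifnot(length(x) == length(y))
  n <- length(x)
  # TRUE where points i and j share a cluster in each labeling
  same_x <- outer(x, x, "==")
  same_y <- outer(y, y, "==")
  # consider each unordered pair (i < j) exactly once
  upper <- upper.tri(same_x)
  agree <- sum(same_x[upper] == same_y[upper])
  agree / choose(n, 2)
}

# toy labels: 'b' renames the clusters of 'a', 'c3' moves one point
a  <- c(1, 1, 1, 2, 2, 2)
b  <- c(2, 2, 2, 1, 1, 1)
c3 <- c(1, 1, 2, 2, 2, 2)

rand_index(a, b)   # 1: same partition, only the label names differ
rand_index(a, c3)  # 10 of the 15 pairs agree
```

With the objects from the example above, a call such as rand_index(label, run1$cluster) would score the k-means result against the generating labels. Because the index depends only on pair co-membership, it is unaffected by how each algorithm happens to number its clusters.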

Monthly Downloads: 434
Version: 0.1.1
License: MIT + file LICENSE
Maintainer: Kisung You
Last Published: February 11th, 2021

Functions in T4cluster (0.1.1)

  • SSQP: Subspace Segmentation via Quadratic Programming
  • compare.rand: (+) Rand Index
  • LRR: Low-Rank Representation
  • LRSC: Low-Rank Subspace Clustering
  • LSR: Least Squares Regression
  • MSM: Bayesian Mixture of Subspaces of Different Dimensions
  • EKSS: Ensembles of K-Subspaces
  • gen3S: Generate from Three 5-dimensional Subspaces in 200-dimensional space
  • dpmeans: DP-Means Clustering
  • SSC: Sparse Subspace Clustering
  • genLP: Generate Line and Plane Example with Fixed Number of Components
  • funkmeans03A: Functional K-Means Clustering by Abraham et al. (2003)
  • funhclust: Functional Hierarchical Clustering
  • kmeans: K-Means Clustering
  • sc05Z: Spectral Clustering by Zelnik-Manor and Perona (2005)
  • kmeans18B: K-Means Clustering with Lightweight Coreset
  • gensmiley: Generate SMILEY Data
  • gmm: Finite Gaussian Mixture Model
  • sc12L: Spectral Clustering by Li and Guo (2012)
  • scNJW: Spectral Clustering by Ng, Jordan, and Weiss (2002)
  • kmeanspp: K-Means++ Clustering
  • pcm: Compute Pairwise Co-occurrence Matrix
  • gmm11R: Regularized GMM by Ruan et al. (2011)
  • compare.adjrand: (+) Adjusted Rand Index
  • psm: Compute Posterior Similarity Matrix
  • predict.MSM: S3 method to predict class label of new data with 'MSM' object
  • gmm16G: Weighted GMM by Gebru et al. (2016)
  • sc10Z: Spectral Clustering by Zhang et al. (2010)
  • sc11Y: Spectral Clustering by Yang et al. (2011)
  • scUL: Spectral Clustering with Unnormalized Laplacian
  • scSM: Spectral Clustering by Shi and Malik (2000)
  • sc09G: Spectral Clustering by Gu and Wang (2009)