Note: a newer version (1.5.8) of this package is available.

seriation - Infrastructure for Ordering Objects Using Seriation - R package

This package provides the infrastructure for ordering objects, with implementations of several seriation/sequencing/ordination techniques to reorder matrices, dissimilarity matrices, and dendrograms (see below for a full list). It also provides (optimally) reordered heatmaps, color images, and clustering visualizations such as dissimilarity plots and visual assessment of cluster tendency plots (VAT and iVAT).

Installation

Stable CRAN version: install from within R with

install.packages("seriation")

Current development version: Download package from AppVeyor or install from GitHub (needs devtools).

library("devtools")
install_github("mhahsler/seriation")

Usage

Load the library, read the data, and calculate distances. Then apply the default seriation method.

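library(seriation)
data("iris")
x <- as.matrix(iris[-5])      # keep the numeric columns only (drop Species)
x <- x[sample(nrow(x)), ]     # shuffle the rows into a random order

d <- dist(x)                  # Euclidean distances between rows
order <- seriate(d)           # seriate with the default method
order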
object of class ‘ser_permutation’, ‘list’
contains permutation vectors for 1-mode data

  vector length seriation method
1           150             ARSA

Compare the quality of the random order and the seriated order across the available criteria.

rbind(
 random = criterion(d),
 reordered = criterion(d, order)
)
          AR_events AR_deviations       RGAR Gradient_raw Gradient_weighted Path_length
random       550620    948833.712 0.49938328          741         -1759.954   392.77766
reordered     54846      9426.094 0.04974243       992214       1772123.418    83.95758
            Inertia Least_squares       ME Moore_stress Neumann_stress     2SUM      LS
random    214602194      78852819 291618.0    927570.00     461133.357 29954845 5669489
reordered 356945979      76487641 402332.1     13593.32       5274.093 17810802 4486900
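The permutation object can also be used directly. The following is a minimal sketch, assuming the `get_order`, `permute`, and `criterion` functions listed for this package; the seed and the variable names (`o`, `idx`, `d_reordered`, `pl_random`, `pl_seriated`) are illustrative choices, not part of the original example.

```r
library(seriation)

set.seed(1234)                      # assumption: fixed seed for a reproducible shuffle
x <- as.matrix(iris[-5])            # numeric columns only
x <- x[sample(nrow(x)), ]           # random row order
d <- dist(x)

o <- seriate(d)                     # default seriation method

idx <- get_order(o)                 # integer permutation vector of the objects
d_reordered <- permute(d, o)        # dist object rearranged into the seriated order

# Evaluate a single criterion instead of the full table
pl_random   <- criterion(d, method = "Path_length")
pl_seriated <- criterion(d, o, method = "Path_length")
```

Here `idx` holds the positions of the original objects in their new order, so `x[idx, ]` rearranges the data matrix the same way `permute(d, o)` rearranges the distances.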

Available Seriation Methods

The following methods are available for dissimilarity data:

  • ARSA - Simulated annealing (linear seriation)
  • Branch-and-bound to minimize the unweighted/weighted column gradient
  • DendSer - Dendrogram seriation heuristic to optimize various criteria
  • GA - Genetic algorithm with warm start to optimize various criteria
  • GW - Hierarchical clustering reordered by Gruvaeus and Wainer heuristic
  • HC - Hierarchical clustering (single link, avg. link, complete link)
  • Identity permutation
  • MDS - Multidimensional scaling (metric, non-metric, angle)
  • OLO - Hierarchical clustering with optimal leaf ordering
  • OPTICS - Ordering points to identify the clustering structure
  • QAP - Quadratic assignment problem heuristic (2-SUM, linear seriation, inertia, banded anti-Robinson form)
  • R2E - Rank-two ellipse seriation
  • Random permutation
  • Spectral seriation (unnormalized, normalized)
  • SPIN - Sorting points into neighborhoods (neighborhood algorithm, side-to-side algorithm)
  • TSP - Traveling salesperson problem solver to minimize the Hamiltonian path length
  • TSNE - Order of the 1D t-distributed stochastic neighbor embedding (t-SNE)
  • UMAP - Order of the 1D embedding produced by uniform manifold approximation and projection
  • VAT - Order from the visual assessment of cluster tendency

A detailed comparison of the methods is available in the paper "An experimental comparison of seriation methods for one-mode two-way data" (a preprint is available).
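A specific method from the list above can be requested through the `method` argument of `seriate`. The sketch below is an illustration under assumptions: the method names `"OLO"`, `"HC"`, `"MDS"`, and `"Spectral"` are taken from the list, the seed is arbitrary, and `list_seriation_methods("dist")` is assumed to report the names registered in the installed version.

```r
library(seriation)

set.seed(42)                                   # assumption: arbitrary seed
x <- as.matrix(iris[-5])
d <- dist(x[sample(nrow(x)), ])                # distances between shuffled rows

# Names of the methods registered for dissimilarity data
available <- list_seriation_methods("dist")

# Seriate with a few of the listed methods
methods <- c("OLO", "HC", "MDS", "Spectral")
orders <- lapply(methods, function(m) seriate(d, method = m))
names(orders) <- methods

# Compare the Hamiltonian path length of each resulting order (lower is better)
path_lengths <- vapply(
  orders,
  function(o) unname(criterion(d, o, method = "Path_length")),
  numeric(1)
)
path_lengths
```

Which method gives the best value depends on the data and on the criterion chosen, so it is worth comparing several criteria rather than relying on one.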

The following methods are available for matrices:

  • BEA - Bond Energy Algorithm to maximize the measure of effectiveness (ME)
  • Identity permutation
  • PCA - First principal component or angle on the projection on the first two principal components
  • Random permutation
  • TSP - Traveling salesperson problem solver to maximize ME
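For a data matrix, seriation yields one permutation per dimension (rows and columns). The following is a minimal sketch, assuming the `"PCA"` method named in the list above and the `get_order` and `permute` functions of this package; the variable names are illustrative.

```r
library(seriation)

x <- as.matrix(iris[-5])            # 150 x 4 numeric matrix

o <- seriate(x, method = "PCA")     # 2-mode seriation: rows and columns

row_idx <- get_order(o, 1)          # permutation of the rows
col_idx <- get_order(o, 2)          # permutation of the columns

x_reordered <- permute(x, o)        # apply both permutations at once
```

The reordered matrix can then be passed to a plotting function such as `pimage` to inspect the structure visually.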

Version

1.3.4

License

GPL-3

Maintainer

Michael Hahsler

Last Published

March 17th, 2022

Functions in seriation (1.3.4)

  • Townships - Bertin's Characteristics of Townships
  • Munsingen - Hodson's Munsingen Data Set
  • Wood - Gene Expression Data for Wood Formation in Poplar Trees
  • Zoo - Zoo Data Set
  • create_lines_data - Create Simulated Data for Seriation Evaluation
  • bertinplot - Plot a Bertin Matrix
  • get_order - Extracting Order Information from a Permutation Object
  • hmap - Plot Heat Map Reordered Using Seriation
  • is.robinson - Create and Recognize Robinson and Pre-Robinson Matrices
  • pimage - Permutation Image Plot
  • palette - Different Useful Color Palettes
  • register_DendSer - Register Seriation Methods from Package DendSer
  • register_optics - Register Seriation Based on OPTICS
  • register_GA - Register a Genetic Algorithm Seriation Method
  • register_tsne - Register Seriation Based on 1D t-SNE
  • criterion - Criterion for a Loss/Merit Function for Data Given a Permutation
  • permute - Permute the Order in Various Objects
  • permutation_vector2matrix - Conversion Between Permutation Vector and Permutation Matrix
  • ser_permutation - Class ser_permutation -- A Collection of Permutation Vectors for Seriation
  • reorder.hclust - Reorder Dendrograms using Optimal Leaf Ordering
  • ser_permutation_vector - Class ser_permutation_vector -- A Single Permutation Vector for Seriation
  • ser_dist - Dissimilarities and Correlations Between Seriation Orders
  • dissplot - Dissimilarity Plot
  • register_umap - Register Seriation Based on 1D UMAP
  • uniscale - Unidimensional Scaling from Seriation Results
  • seriation-package - seriation: Infrastructure for Ordering Objects Using Seriation
  • seriate - Seriate Dissimilarity Matrices, Matrices or Arrays
  • registry_criterion - Registry for Criterion Methods
  • registry_seriate - Registry for Seriation Methods
  • SupremeCourt - Voting Patterns in the Second Rehnquist U.S. Supreme Court
  • Chameleon - 2D Data Sets used for the CHAMELEON Clustering Algorithm
  • Irish - Irish Referendum Data Set
  • Psych24 - Results of 24 Psychological Test for 8th Grade Students
  • LS - Neighborhood functions for Seriation Method SA
  • VAT - Visual Analysis for Cluster Tendency Assessment (VAT/iVAT)