seriation v1.2-2

by Michael Hahsler

Infrastructure for Ordering Objects Using Seriation

Infrastructure for seriation with an implementation of several seriation/sequencing techniques to reorder matrices, dissimilarity matrices, and dendrograms. It also provides (optimally) reordered heatmaps, color images, and clustering visualizations such as dissimilarity plots and visual assessment of cluster tendency plots (VAT and iVAT).

Readme

seriation - Infrastructure for Ordering Objects Using Seriation - R package


This package provides the infrastructure for seriation with an implementation of several seriation/sequencing techniques to reorder matrices, dissimilarity matrices, and dendrograms (see below for a full list). It also provides (optimally) reordered heatmaps, color images, and clustering visualizations such as dissimilarity plots and visual assessment of cluster tendency plots (VAT and iVAT).

Installation

Stable CRAN version: install from within R with

install.packages("seriation")

Current development version: download the package from AppVeyor or install it from GitHub (requires the devtools package).

devtools::install_github("mhahsler/seriation")

Usage

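Load the library, read the data, and calculate distances. Then apply the default seriation method.

library(seriation)
data("iris")
# keep the four numeric columns (drop Species)
x <- as.matrix(iris[-5])
# shuffle the rows so the original order gives no head start
x <- x[sample(seq_len(nrow(x))), ]

# Euclidean distances between flowers, then seriate with the default method
d <- dist(x)
order <- seriate(d)
order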
object of class ‘ser_permutation’, ‘list’
contains permutation vectors for 1-mode data

  vector length seriation method
1           150             ARSA
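The permutation object printed above can be turned into a plain index vector and applied to the data. The following is a minimal sketch, assuming the `get_order()` and `permute()` functions listed in the function index further down; the variable names (`o`, `idx`, `d_reordered`, `x_reordered`) are illustrative only, and `o` is used instead of `order` to avoid masking base R's `order()`.

```r
library(seriation)

set.seed(1)                               # the shuffle and some heuristics are random
x <- as.matrix(iris[-5])
x <- x[sample(seq_len(nrow(x))), ]
d <- dist(x)
o <- seriate(d)                           # default method for dissimilarities

idx <- get_order(o)                       # integer permutation of 1..150
head(idx)

d_reordered <- permute(d, o)              # dist object with objects rearranged
x_reordered <- x[idx, ]                   # same order applied to the data rows
```

Indexing with `idx` is equivalent to `permute()` for the rows of `x`; `permute()` is convenient because it also handles `dist` objects directly.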

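Compare the quality of the random and the seriated order. For loss measures (for example AR_events, Path_length, Moore_stress) smaller values are better; for merit measures (Gradient_raw, Gradient_weighted, Inertia, ME) larger values are better.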

rbind(
 random = criterion(d),
 reordered = criterion(d, order)
)
          AR_events AR_deviations       RGAR Gradient_raw Gradient_weighted Path_length
random       550620    948833.712 0.49938328          741         -1759.954   392.77766
reordered     54846      9426.094 0.04974243       992214       1772123.418    83.95758
            Inertia Least_squares       ME Moore_stress Neumann_stress     2SUM      LS
random    214602194      78852819 291618.0    927570.00     461133.357 29954845 5669489
reordered 356945979      76487641 402332.1     13593.32       5274.093 17810802 4486900
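When only one measure matters, `criterion()` can be restricted to it through its `method` argument. The sketch below assumes the method name `"Path_length"` as it appears in the table header above and uses the deterministic `"OLO"` method so that repeated runs on the same shuffled data agree; the variable names are illustrative.

```r
library(seriation)

set.seed(1)                               # only the row shuffle is random here
x <- as.matrix(iris[-5])
x <- x[sample(seq_len(nrow(x))), ]
d <- dist(x)

o <- seriate(d, method = "OLO")           # hierarchical clustering + optimal leaf ordering

# Hamiltonian path length before and after reordering (a loss: smaller is better)
pl_random    <- criterion(d, method = "Path_length")
pl_reordered <- criterion(d, o, method = "Path_length")

c(random = unname(pl_random), reordered = unname(pl_reordered))
```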

Available Seriation Methods

For dissimilarity data:

  • Branch-and-bound to minimize the unweighted/weighted column gradient
  • DendSer - Dendrogram seriation heuristic to optimize various criteria
  • GA - Genetic algorithm with warm start to optimize various criteria
  • HC - Hierarchical clustering (single link, avg. link, complete link)
  • GW - Hierarchical clustering reordered by Gruvaeus and Wainer heuristic
  • OLO - Hierarchical clustering with optimal leaf ordering
  • Identity permutation
  • MDS - Multidimensional scaling (metric, non-metric, angle)
  • ARSA - Simulated annealing (linear seriation)
  • TSP - Traveling salesperson solver to minimize Hamiltonian path length
  • R2E - Rank-two ellipse seriation
  • Random permutation
  • Spectral seriation (unnormalized, normalized)
  • SPIN - Sorting points into neighborhoods (neighborhood algorithm, side-to-side algorithm)
  • VAT - Visual assessment of clustering tendency ordering
  • QAP - Quadratic assignment problem heuristic (2-SUM, linear seriation, inertia, banded anti-Robinson form)
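A method from the list above is selected by name through the `method` argument of `seriate()`. The following sketch compares a handful of them on one criterion; it assumes the registry helper `list_seriation_methods()` and the method names `"Identity"`, `"Random"`, `"HC"`, `"OLO"`, `"MDS"`, and `"TSP"`, which may differ between package versions.

```r
library(seriation)

set.seed(42)                              # shuffle, Random and TSP heuristics are random
x <- as.matrix(iris[-5])
x <- x[sample(seq_len(nrow(x))), ]
d <- dist(x)

# names of the methods registered for dissimilarity data
list_seriation_methods("dist")

methods <- c("Identity", "Random", "HC", "OLO", "MDS", "TSP")

# one permutation per method
orders <- lapply(methods, function(m) seriate(d, method = m))
names(orders) <- methods

# path length achieved by each method (smaller is better)
path_len <- vapply(
  orders,
  function(o) as.numeric(criterion(d, o, method = "Path_length")),
  numeric(1)
)
sort(path_len)
```

Ranking by a different criterion can change which method looks best, so the criterion should match the structure one hopes to reveal.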

For matrices:

  • BEA - Bond Energy Algorithm to maximize the measure of effectiveness (ME)
  • Identity permutation
  • PCA - First principal component or angle on the projection on the first two principal components
  • Random permutation
  • TSP - Traveling salesperson solver to maximize ME
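For a data matrix, `seriate()` returns one permutation for the rows and one for the columns. The sketch below uses the `"PCA"` method from the list above and assumes that `get_order()` accepts a `dim` argument to select the mode (1 for rows, 2 for columns); the variable names are illustrative.

```r
library(seriation)

x <- as.matrix(iris[-5])                  # 150 x 4 numeric matrix

o <- seriate(x, method = "PCA")           # 2-mode permutation (rows and columns)

row_idx <- get_order(o, dim = 1)          # row order
col_idx <- get_order(o, dim = 2)          # column order

x_reordered <- permute(x, o)              # applies both permutations at once
dim(x_reordered)
```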


Functions in seriation

Name                Description
SupremeCourt        Voting Patterns in the Second Rehnquist U.S. Supreme Court
Townships           Bertin's Characteristics of Townships
Zoo                 Zoo Data Set
bertinplot          Plot a Bertin Matrix
Chameleon           2D Data Sets used for the CHAMELEON Clustering Algorithm
Irish               Irish Referendum Data Set
VAT                 Visual Analysis for Cluster Tendency Assessment (VAT/iVAT)
Wood                Gene Expression Data for Wood Formation in Poplar Trees
Munsingen           Hodson's Munsingen Data Set
Psych24             Results of 24 Psychological Test for 8th Grade Students
dissimilarity       Dissimilarities and Correlations Between Seriation Orders
dissplot            Dissimilarity Plot
color_palettes      Different Useful Color Palettes
criterion           Criterion for a Loss/Merit Function for Data Given a Permutation
permutation         Class ser_permutation -- A Collection of
permutation_matrix  Conversion Between Permutation Vector and Permutation Matrix
Robinson            Create and Recognize Robinson and Pre-Robinson Matrices
pimage              Permutation Image Plot
register_DendSer    Register Seriation Methods from Package DendSer
register_GA         Register a Genetic Algorithm Seriation Method
reorder.hclust      Reorder Dendrograms using Optimal Leaf Ordering
seriate             Seriate Dissimilarity Matrices, Matrices or Arrays
get_order           Extracting Order Information from a Permutation Object
hmap                Plot Heat Map Reordered Using Seriation
seriation_methods   Registry for Seriation Methods
uniscale            Unidimensional Scaling from Seriation Results
seriation_data      Create Simulated Data for Seriation Evaluation
permutation_vector  Class ser_permutation_vector --
permute             Permute the Order in Various Objects
criterion_methods   Registry for Criterion Methods

Details

Type: Package
Date: 2017-05-08
Classification/ACM: G.1.6, G.2.1, G.4
URL: http://lyle.smu.edu/IDA/seriation
BugReports: https://github.com/mhahsler/seriation
License: GPL-3
Copyright: The code in src/bea.f is Copyright (C) 1991 F. Murtagh; src/bbwrcg.f, src/arsa.f and src/bburcg.f are Copyright (C) 2005 M. Brusco, H.F. Koehn, and S. Stahl. All other code is Copyright (C) Michael Hahsler, Christian Buchta, and Kurt Hornik.
NeedsCompilation: yes
Packaged: 2017-05-08 15:21:40 UTC; hahsler
Repository: CRAN
Date/Publication: 2017-05-09 06:56:01 UTC
