seriation v1.2-9
Infrastructure for Ordering Objects Using Seriation
Infrastructure for ordering objects with an implementation of several
seriation/sequencing/ordination techniques to reorder matrices, dissimilarity
matrices, and dendrograms. Also provides (optimally) reordered heatmaps,
color images and clustering visualizations like dissimilarity plots, and
visual assessment of cluster tendency plots (VAT and iVAT). Hahsler et al. (2008) <doi:10.18637/jss.v025.i03>.
Readme
seriation - Infrastructure for Ordering Objects Using Seriation - R package
This package provides the infrastructure for ordering objects with an implementation of several seriation/sequencing/ordination techniques to reorder matrices, dissimilarity matrices, and dendrograms (see below for a full list). It also provides (optimally) reordered heatmaps, color images, and clustering visualizations like dissimilarity plots and visual assessment of cluster tendency plots (VAT and iVAT).
Installation
Stable CRAN version: install from within R with
install.packages("seriation")
Current development version: Download package from AppVeyor or install from GitHub (needs devtools).
library("devtools")
install_github("mhahsler/seriation")
Usage
Load the library, read the data, and calculate distances. Then apply the default seriation method.
library(seriation)
data("iris")
x <- as.matrix(iris[-5])
x <- x[sample(1:nrow(x)),]
d <- dist(x)
order <- seriate(d)
order
object of class ‘ser_permutation’, ‘list’
contains permutation vectors for 1-mode data
vector length seriation method
1 150 ARSA
Compare the quality of the random order and the seriated order.
rbind(
random = criterion(d),
reordered = criterion(d, order)
)
AR_events AR_deviations RGAR Gradient_raw Gradient_weighted Path_length
random 550620 948833.712 0.49938328 741 -1759.954 392.77766
reordered 54846 9426.094 0.04974243 992214 1772123.418 83.95758
Inertia Least_squares ME Moore_stress Neumann_stress 2SUM LS
random 214602194 78852819 291618.0 927570.00 461133.357 29954845 5669489
reordered 356945979 76487641 402332.1 13593.32 5274.093 17810802 4486900
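Once an order has been found, it can be extracted and applied to the data. The following is a minimal sketch using `get_order()`, `permute()`, and `pimage()` from the function index below; the seed value and the variable names (`o`, `idx`, `d_reordered`, `x_reordered`) are illustrative assumptions and not part of the original example.

```r
library(seriation)

set.seed(1234)                      # arbitrary seed, only for reproducibility
x <- as.matrix(iris[-5])
x <- x[sample(nrow(x)), ]           # shuffle the rows
d <- dist(x)

o <- seriate(d)                     # default method for dist objects
idx <- get_order(o)                 # permutation as an integer vector

d_reordered <- permute(d, o)        # dist object with objects in seriated order
x_reordered <- x[idx, ]             # data rows arranged in the same order

# Image plots of the dissimilarity matrix before and after reordering
pimage(d, main = "Random order")
pimage(d, o, main = "Seriated order")
```

In the second plot, low dissimilarities should concentrate near the diagonal if the seriation found a useful order.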
Available Seriation Methods
The following methods are available for dissimilarity data:
- Branch-and-bound to minimize the unweighted/weighted column gradient
- DendSer - Dendrogram seriation heuristic to optimize various criteria
- GA - Genetic algorithm with warm start to optimize various criteria
- HC - Hierarchical clustering (single link, avg. link, complete link)
- GW - Hierarchical clustering reordered by Gruvaeus and Wainer heuristic
- OLO - Hierarchical clustering with optimal leaf ordering
- Identity permutation
- MDS - Multidimensional scaling (metric, non-metric, angle)
- ARSA - Simulated annealing (linear seriation)
- TSP - Traveling salesperson solver to minimize Hamiltonian path length
- R2E - Rank-two ellipse seriation
- Random permutation
- Spectral seriation (unnormalized, normalized)
- SPIN - Sorting points into neighborhoods (neighborhood algorithm, side-to-side algorithm)
- VAT - Visual assessment of clustering tendency ordering
- QAP - Quadratic assignment problem heuristic (2-SUM, linear seriation, inertia, banded anti-Robinson form)
A detailed comparison of the methods is available in the paper "An experimental comparison of seriation methods for one-mode two-way data" (read preprint).
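A method from the list above is selected by passing its registered name to `seriate()`. The sketch below assumes the names `"OLO"`, `"Spectral"`, and `"MDS"` are registered in the installed version (`list_seriation_methods("dist")` shows what is actually available) and compares the resulting orders on two of the criteria shown earlier. The seed and variable names are illustrative.

```r
library(seriation)

set.seed(1234)                      # arbitrary seed, only for reproducibility
x <- as.matrix(iris[-5])
d <- dist(x[sample(nrow(x)), ])

# Names of the methods registered for dissimilarity data
list_seriation_methods("dist")

# Run a few methods by name
methods <- c("OLO", "Spectral", "MDS")
orders <- lapply(methods, function(m) seriate(d, method = m))
names(orders) <- methods

# One column per method, one row per criterion
res <- sapply(orders, function(o)
  criterion(d, o, method = c("Path_length", "AR_events")))
res
```

Both criteria used here are loss measures, so smaller values indicate a better order.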
The following methods are available for matrices:
- BEA - Bond Energy Algorithm to maximize the measure of effectiveness (ME)
- Identity permutation
- PCA - First principal component or angle on the projection on the first two principal components
- Random permutation
- TSP - Traveling salesperson solver to maximize ME
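For matrices (two-mode data), `seriate()` returns one permutation for the rows and one for the columns. A minimal sketch, assuming the method name `"PCA"` is registered for matrices in the installed version; the variable names are illustrative.

```r
library(seriation)

x <- as.matrix(iris[-5])            # 150 x 4 matrix with non-negative entries

o <- seriate(x, method = "PCA")     # reorders rows and columns
o

row_order <- get_order(o, 1)        # permutation of the rows
col_order <- get_order(o, 2)        # permutation of the columns

# Measure of effectiveness (ME) before and after reordering
rbind(
  original  = criterion(x, method = "ME"),
  reordered = criterion(x, o, method = "ME")
)

# Image plot of the reordered matrix
pimage(x, o)
```

ME is a merit measure, so larger values are better.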
References
- Reference manual for package seriation
- Michael Hahsler, Kurt Hornik and Christian Buchta, Getting Things in Order: An Introduction to the R Package seriation, Journal of Statistical Software, 25(3), 2008.
- Michael Hahsler. An experimental comparison of seriation methods for one-mode two-way data. European Journal of Operational Research, 257:133-143, 2017. (read preprint)
- Seriation package vignette with complete examples.
Functions in seriation
| Name | Description |
| Zoo | Zoo Data Set |
| Wood | Gene Expression Data for Wood Formation in Poplar Trees |
| Irish | Irish Referendum Data Set |
| SupremeCourt | Voting Patterns in the Second Rehnquist U.S. Supreme Court |
| Townships | Bertin's Characteristics of Townships |
| seriation_data | Create Simulated Data for Seriation Evaluation |
| criterion_methods | Registry for Criterion Methods |
| permutation | Class ser_permutation -- A Collection of Permutation Vectors for Seriation |
| bertinplot | Plot a Bertin Matrix |
| permutation_matrix | Conversion Between Permutation Vector and Permutation Matrix |
| permutation_vector | Class ser_permutation_vector -- A Single Permutation Vector for Seriation |
| Robinson | Create and Recognize Robinson and Pre-Robinson Matrices |
| seriate | Seriate Dissimilarity Matrices, Matrices or Arrays |
| permute | Permute the Order in Various Objects |
| uniscale | Unidimensional Scaling from Seriation Results |
| seriation_methods | Registry for Seriation Methods |
| color_palettes | Different Useful Color Palettes |
| criterion | Criterion for a Loss/Merit Function for Data Given a Permutation |
| hmap | Plot Heat Map Reordered Using Seriation |
| get_order | Extracting Order Information from a Permutation Object |
| pimage | Permutation Image Plot |
| register_DendSer | Register Seriation Methods from Package DendSer |
| register_GA | Register a Genetic Algorithm Seriation Method |
| reorder.hclust | Reorder Dendrograms using Optimal Leaf Ordering |
| Chameleon | 2D Data Sets used for the CHAMELEON Clustering Algorithm |
| Munsingen | Hodson's Munsingen Data Set |
| Psych24 | Results of 24 Psychological Test for 8th Grade Students |
| dissimilarity | Dissimilarities and Correlations Between Seriation Orders |
| dissplot | Dissimilarity Plot |
| VAT | Visual Analysis for Cluster Tendency Assessment (VAT/iVAT) |
Vignettes of seriation
| Name |
| classes.odg |
| classes.pdf |
| seriation.Rnw |
| seriation.bib |
Details
| Type | Package |
| Date | 2020-09-29 |
| Classification/ACM | G.1.6, G.2.1, G.4 |
| URL | https://github.com/mhahsler/seriation |
| BugReports | https://github.com/mhahsler/seriation/issues |
| License | GPL-3 |
| Copyright | The code in src/bea.f is Copyright (C) 1991 F. Murtagh; src/bbwrcg.f, src/arsa.f and src/bburcg.f are Copyright (C) 2005 M. Brusco, H.F. Koehn, and S. Stahl. All other code is Copyright (C) Michael Hahsler, Christian Buchta, and Kurt Hornik. |
| NeedsCompilation | yes |
| Packaged | 2020-09-30 15:42:50 UTC; hahsler |
| Repository | CRAN |
| Date/Publication | 2020-10-01 08:10:06 UTC |
| Suggests | biclust, DendSer, GA, testthat |
| Imports | cluster, colorspace, dendextend, gclus, gplots, grDevices, grid, MASS, qap, registry, stats, TSP |
| Depends | R (>= 2.14.0) |
| Contributors | Christian Buchta, Kurt Hornik, Fionn Murtagh, Michael Brusco, Stephanie Stahl, Hans-Friedrich Koehn |