seriation: Matrix seriation

Description

refine performs a partial bootstrap correspondence analysis seriation refinement.

seriate computes a permutation order for rows and/or columns.

permute rearranges a data matrix according to a permutation order.

Usage

refine(object, ...)
seriate(object, ...)
permute(object, order, ...)
# S4 method for CountMatrix
refine(object, cutoff, n = 1000, axes = c(1, 2), ...)
# S4 method for CountMatrix
seriate(object, method = c("correspondance",
  "reciprocal"), EPPM = FALSE, margin = c(1, 2), stop = 100, ...)
# S4 method for IncidenceMatrix
seriate(object, method = c("correspondance",
  "reciprocal"), margin = c(1, 2), stop = 100, ...)
# S4 method for CountMatrix,PermutationOrder
permute(object, order)
# S4 method for IncidenceMatrix,PermutationOrder
permute(object, order)

Arguments

object

An \(m \times p\) data matrix.

...

Further arguments passed to other methods.

order

An object giving the permutation order for rows and columns.

cutoff

A function that takes a numeric vector as argument and returns a single numeric value (see details).

n

A non-negative integer giving the number of partial bootstrap replications (see details).

axes

A numeric vector giving the subscripts of the CA axes to use (see details).

method

A character string specifying the method to be used. This must be one of "reciprocal" or "correspondance" (see details). Any unambiguous substring can be given.

EPPM

A logical scalar: should the seriation be computed on EPPM (positive deviation from the average percentage, see Desachy 2004) instead of raw data?

margin

A numeric vector giving the subscripts which the rearrangement will be applied over. E.g., for a matrix 1 indicates rows, 2 indicates columns, c(1, 2) indicates rows then columns, c(2, 1) indicates columns then rows.

stop

A length-one numeric vector giving the stopping rule (i.e. the maximum number of iterations) to avoid an infinite loop.
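The EPPM transformation referred to by the EPPM argument can be illustrated with a short base R sketch. It assumes that EPPM is the positive part of the difference between each row percentage and the share of the corresponding column in the whole table; the function name `eppm` and the toy matrix are hypothetical and are not part of the package.

```r
# Toy count matrix: rows are assemblages, columns are types (illustrative data)
x <- matrix(
  c(10, 5, 0,  0,
     6, 8, 2,  0,
     2, 8, 6,  1,
     0, 3, 9,  5,
     0, 0, 4, 10),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("S", 1:5), paste0("T", 1:4))
)

# Hypothetical helper: positive deviation from the average percentage.
# Assumption: the "average percentage" of a type is its share of the grand total.
eppm <- function(m) {
  # Counts expected if rows and columns were independent
  expected <- outer(rowSums(m), colSums(m)) / sum(m)
  # Deviation from expectation, expressed as a proportion of each row total
  d <- (m - expected) / rowSums(m)
  # Keep positive deviations only
  d[d < 0] <- 0
  d
}

e <- eppm(x)
round(e, 2)
```

Cells where a type is over-represented in an assemblage keep a positive value; all other cells are set to zero.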

Value

refine returns a BootCA object.

seriate returns a PermutationOrder object.

permute returns either a CountMatrix, FrequencyMatrix or IncidenceMatrix (the same as object).

Details

The matrix seriation problem in archaeology is based on three conditions and two assumptions, which Dunnell (1970) summarizes as follows.

The homogeneity conditions state that all the groups included in a seriation must:

Be of comparable duration.
Belong to the same cultural tradition.
Come from the same local area.

The mathematical assumptions state that the distribution of any historical or temporal class:

Is continuous through time.
Exhibits the form of a unimodal curve.

These assumptions create a distributional model, and ordering is accomplished by arranging the matrix so that the class distributions approximate the required pattern. The resulting order is inferred to be chronological.

The following seriation methods are available:

correspondance: Correspondence analysis-based seriation. Correspondence analysis (CA) is an effective method for the seriation of archaeological assemblages. The order of the rows and columns is given by the coordinates along one dimension of the CA space, assumed to account for temporal variation. The direction of temporal change within the correspondence analysis space is arbitrary: additional information is needed to determine the actual order in time.
reciprocal: Reciprocal ranking (incidence data) or averaging (frequency data) seriation. These procedures iteratively rearrange rows and/or columns according to their weighted rank in the data matrix until convergence. Note that this procedure could enter into an infinite loop. If no convergence is reached before the maximum number of iterations, it stops with a warning.
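Both methods can be sketched in base R on a toy matrix. This is a minimal illustration under stated assumptions, not the package's implementation: `ca_order` and `reciprocal_order` are hypothetical helpers, the CA sketch uses the first axis only, and the reciprocal averaging sketch scores rows and columns by their count-weighted mean position.

```r
# Toy count matrix with a clear gradient (true order: S1..S5, T1..T4)
x <- matrix(
  c(10, 5, 0,  0,
     6, 8, 2,  0,
     2, 8, 6,  1,
     0, 3, 9,  5,
     0, 0, 4, 10),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("S", 1:5), paste0("T", 1:4))
)
# Shuffle rows and columns to hide the gradient
shuffled <- x[c(3, 1, 5, 2, 4), c(2, 4, 1, 3)]

# Hypothetical helper: order rows and columns along the first CA axis
ca_order <- function(m) {
  P <- m / sum(m)
  rs <- rowSums(P)
  cs <- colSums(P)
  # Standardized residuals, then singular value decomposition
  S <- diag(1 / sqrt(rs)) %*% (P - rs %o% cs) %*% diag(1 / sqrt(cs))
  dec <- svd(S)
  list(rows = order(dec$u[, 1] / sqrt(rs)),
       columns = order(dec$v[, 1] / sqrt(cs)))
}

# Hypothetical helper: reciprocal averaging with a maximum number of iterations
reciprocal_order <- function(m, stop = 100) {
  rows <- seq_len(nrow(m))
  cols <- seq_len(ncol(m))
  for (i in seq_len(stop)) {
    # Reorder rows by their count-weighted mean column position
    cur <- m[rows, cols, drop = FALSE]
    row_score <- as.vector(cur %*% seq_len(ncol(cur))) / rowSums(cur)
    new_rows <- rows[order(row_score)]
    # Reorder columns by their count-weighted mean row position
    cur <- m[new_rows, cols, drop = FALSE]
    col_score <- as.vector(seq_len(nrow(cur)) %*% cur) / colSums(cur)
    new_cols <- cols[order(col_score)]
    # Converged when neither order changes
    if (identical(new_rows, rows) && identical(new_cols, cols)) {
      return(list(rows = rows, columns = cols))
    }
    rows <- new_rows
    cols <- new_cols
  }
  warning("No convergence before the maximum number of iterations.")
  list(rows = rows, columns = cols)
}

res_ca <- ca_order(shuffled)
res_ra <- reciprocal_order(shuffled)
shuffled[res_ca$rows, res_ca$columns]
shuffled[res_ra$rows, res_ra$columns]
```

Either result may come out reversed relative to the true sequence, since the direction of the ordering is arbitrary.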

refine identifies samples that are subject to sampling error, or that have underlying structural relationships, and might be influencing the ordering along the CA space. It relies on a partial bootstrap approach to CA-based seriation in which each sample is replicated n times. Samples are then removed when the maximum dimension length of the convex hull around their replicated point cloud exceeds a given cutoff value.

According to Peeples and Schachner (2012), "[this] point removal procedure [results in] a reduced dataset where the position of individuals within the CA are highly stable and which produces an ordering consistent with the assumptions of frequency seriation."

refine returns the subscript of samples to be kept (i.e. samples with maximum dimension length of the convex hull smaller than the cutoff value).
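The partial bootstrap idea can be sketched in base R as follows. Several choices here are assumptions made for illustration: replicates are drawn from a multinomial distribution with the observed size and proportions of each sample, they are projected as supplementary rows onto the CA of the original table, and the "maximum dimension length" is taken as the largest distance between convex hull vertices. The helper `hull_length` is hypothetical.

```r
set.seed(42)

# Toy count matrix: rows are samples, columns are types (illustrative data)
x <- matrix(
  c(10, 5, 0,  0,
     6, 8, 2,  0,
     2, 8, 6,  1,
     0, 3, 9,  5,
     0, 0, 4, 10),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("S", 1:5), paste0("T", 1:4))
)

# CA of the original table
P <- x / sum(x)
rs <- rowSums(P)
cs <- colSums(P)
S <- diag(1 / sqrt(rs)) %*% (P - rs %o% cs) %*% diag(1 / sqrt(cs))
dec <- svd(S)
axes <- c(1, 2)
# Standard column coordinates on the chosen axes
col_std <- (dec$v / sqrt(cs))[, axes]

# Hypothetical helper: size of the bootstrap point cloud of one sample
hull_length <- function(counts, n = 200) {
  # Resample the sample n times (one replicate per row)
  replicates <- t(rmultinom(n, size = sum(counts), prob = counts))
  profiles <- replicates / rowSums(replicates)
  # Project the replicates as supplementary rows
  coords <- profiles %*% col_std
  hull <- coords[chull(coords), , drop = FALSE]
  # Assumption: largest distance between two hull vertices
  max(dist(hull))
}

hull_len <- apply(x, 1, hull_length)

# Cutoff: one standard deviation above the mean
cutoff <- function(v) mean(v) + sd(v)
keep <- which(hull_len < cutoff(hull_len))
keep
```

Samples with few counts tend to produce wider point clouds, which is why they are the first candidates for removal under this kind of cutoff.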

References

Desachy, B. (2004). Le sériographe EPPM: un outil informatisé de sériation graphique pour tableaux de comptages. Revue archéologique de Picardie, 3(1), 39-56. DOI: 10.3406/pica.2004.2396.

Dunnell, R. C. (1970). Seriation Method and Its Evaluation. American Antiquity, 35(3), 305-319. DOI: 10.2307/278341.

Ihm, P. (2005). A Contribution to the History of Seriation in Archaeology. In C. Weihs & W. Gaul (Eds.), Classification: The Ubiquitous Challenge (pp. 307-316). Berlin Heidelberg: Springer. DOI: 10.1007/3-540-28084-7_34.

Peeples, M. A., & Schachner, G. (2012). Refining correspondence analysis-based ceramic seriation of regional data sets. Journal of Archaeological Science, 39(8), 2818-2827. DOI: 10.1016/j.jas.2012.04.040.

Examples

# NOT RUN {
# Refine matrix seriation (this is a long running example)
# Reproduces Peeples and Schachner 2012 results
count <- as(zuni, "CountMatrix")

## Samples with convex hull maximum dimension length greater than the cutoff
## value will be marked for removal.
## Define cutoff as one standard deviation above the mean
fun <- function(x) { mean(x) + sd(x) }

## Get indices of samples to be kept
## Warning: this may take a few seconds!
refined <- refine(count, cutoff = fun)
refined[["keep"]]

# Matrix seriation
# Reproduces Desachy 2004 results
## Coerce dataset to abundance matrix
count <- as(compiegne, "CountMatrix")

## Plot original matrix
plotBar(count, EPPM = TRUE)

## Get seriation order for columns on EPPM using the reciprocal averaging method
## Expected column order: N, A, C, K, P, L, B, E, I, M, D, G, O, J, F, H
indices <- seriate(count, method = "reciprocal", EPPM = TRUE, margin = 2)

## Permute columns
new <- permute(count, indices)

## Plot new matrix
plotBar(new, EPPM = TRUE)
# }