commsim: Create an Object for Null Model Algorithms

Description

The commsim function can be used to feed null model algorithms into nullmodel analysis. The make.commsim function returns various predefined algorithm types (see Details). These functions are a low-level interface to the community null model infrastructure in vegan; they are designed for extensibility, with less emphasis on direct use by users.

Usage

commsim(method, fun, binary, isSeq, mode)
make.commsim(method)
## S3 method for class 'commsim':
print(x, ...)

Arguments

method

Character, name of the algorithm.

fun

A function. For possible formal arguments of this function see Details.

binary

Logical, whether the algorithm applies to presence-absence (TRUE) or count (FALSE) matrices.

isSeq

Logical, whether the algorithm is sequential (needs burn-in) or not.

mode

Character, storage mode of the community matrix, either "integer" or "double".

x

An object of class commsim.

...

Additional arguments.

Value

An object of class commsim with elements corresponding to the arguments (method, binary, isSeq, mode, fun).
If the input of make.commsim is a commsim object, it is returned without further evaluation. Otherwise, the character method argument is matched against the predefined algorithm names, and an error is issued if no match is found. If the method argument is missing, the function returns the names of all currently available null model algorithms as a character vector.

Binary null models

All binary null models retain fill: the number of presences or, conversely, the number of absences. The classic models may also preserve column (species) frequencies (c0) or row frequencies, that is, the species richness of each site (r0), and may take into account the commonness and rarity of species (r1, r2). Algorithms swap, tswap, quasiswap and backtracking preserve both row and column frequencies. The first two of these are sequential, but the latter two are non-sequential and produce independent matrices. Basic algorithms are reviewed by Wright et al. (1998).

"r00": non-sequential algorithm for binary matrices that only maintains the number of presences (fill).

"r0", "r0_old": non-sequential algorithm for binary matrices that maintains the site (row) frequencies. Methods "r0" and "r0_old" implement the same method, but use different random number sequences; use "r0_old" if you want to reproduce results from vegan 2.0-0 or older using commsimulator (now deprecated).

"r1": non-sequential algorithm for binary matrices that maintains the site (row) frequencies, but uses column marginal frequencies as probabilities of selecting species.

"r2": non-sequential algorithm for binary matrices that maintains the site (row) frequencies, and uses squared column sums as probabilities of selecting species.

"c0": non-sequential algorithm for binary matrices that maintains species frequencies (Jonsson 2001).

"swap": sequential algorithm for binary matrices that changes the matrix structure, but does not influence marginal sums (Gotelli & Entsminger 2003). It inspects random $2 \times 2$ submatrices until a swap can be done.

"tswap": sequential algorithm for binary matrices. Same as the "swap" algorithm, but it tries a fixed number of times and performs zero to many swaps in one step (according to the thin argument in a later call). This approach was suggested by Miklós & Podani (2004) because they found that ordinary swap may lead to biased sequences, since some columns or rows may be more easily swapped.

"quasiswap": non-sequential algorithm for binary matrices that implements a method where the matrix is first filled honouring row and column totals, but with integers that may be larger than one. Then the method inspects random $2 \times 2$ submatrices and performs a quasiswap on them. Quasiswap is similar to ordinary swap, but it can reduce numbers above one to ones while maintaining marginal totals (Miklós & Podani 2004). This is the recommended algorithm if you want to retain both species (column) and site (row) frequencies.

"backtracking": non-sequential algorithm for binary matrices that implements a filling method with constraints both for row and column frequencies (Gotelli & Entsminger 2001). The matrix is first filled randomly using row and column frequencies as probabilities. Typically row and column sums are reached before all incidences are filled in. After that begins "backtracking", where some of the points are removed and filling is started again; this backtracking is repeated as many times as needed to fill all incidences into the matrix. The function may be very slow for some matrices.

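To make the swap idea concrete, the following is a minimal base R sketch of one attempted swap on a presence-absence matrix. It is an illustration only, not vegan's compiled implementation, and the helper name swap_step is hypothetical. Each call makes a single attempt, which is closer in spirit to "tswap"; "swap" would repeat the attempt until a swap succeeds.

```r
# One attempted swap on a binary matrix (hypothetical helper, base R only).
swap_step <- function(m) {
    i <- sample.int(nrow(m), 2)   # two distinct rows
    j <- sample.int(ncol(m), 2)   # two distinct columns
    sm <- m[i, j]
    # A swap is possible only for the two "checkerboard" patterns
    #   1 0      0 1
    #   0 1  or  1 0
    if (sm[1, 1] == sm[2, 2] && sm[1, 2] == sm[2, 1] && sm[1, 1] != sm[1, 2])
        m[i, j] <- 1L - sm        # flip the checkerboard
    m
}

set.seed(42)
x <- matrix(rbinom(60, 1, 0.5), 6, 10)
y <- x
for (step in 1:1000) y <- swap_step(y)

# Row and column frequencies are unchanged by construction
all(rowSums(y) == rowSums(x)) && all(colSums(y) == colSums(x))
```

Flipping a checkerboard moves one presence within each of the two rows and each of the two columns, which is why both sets of marginal frequencies survive every step.
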
Quantitative Models for Counts with Fixed Marginal Sums

These models shuffle individuals among cells while keeping the marginal sums fixed; marginal frequencies are not preserved. Algorithm r2dtable uses the standard R function r2dtable, which is also used for simulated $P$-values in chisq.test. Algorithm quasiswap_count uses the same function, but retains the original fill. Typically this means increasing the number of zero cells, so the result is zero-inflated with respect to r2dtable.

"r2dtable": non-sequential algorithm for count matrices. This algorithm keeps matrix sum and row/column sums constant. Based on r2dtable.

"quasiswap_count": non-sequential algorithm for count matrices. This algorithm is similar to Carsten Dormann's swap.web function in the package bipartite. First, a random matrix is generated by the r2dtable function retaining row and column sums. Then the original matrix fill is reconstructed by sequential steps that increase or decrease matrix fill in the random matrix. These steps are based on swapping $2 \times 2$ submatrices (see the "swap_count" algorithm for details) to maintain row and column totals.

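The behaviour described for the r2dtable algorithm can be checked directly with the base R function of the same name. This is a small sketch on made-up data; the object names are arbitrary and vegan is not needed.

```r
set.seed(1)
# A made-up count matrix with some empty cells
x <- matrix(rbinom(20, 1, 0.6) * rpois(20, 3), 4, 5)

# r2dtable() returns a list of random tables with the given margins
sims <- r2dtable(3, rowSums(x), colSums(x))

margins_kept <- vapply(sims, function(y)
    all(rowSums(y) == rowSums(x)) && all(colSums(y) == colSums(x)),
    logical(1))
fill_sim <- vapply(sims, function(y) sum(y > 0), integer(1))

margins_kept                                     # marginal sums are retained
c(observed = sum(x > 0), simulated = fill_sim)   # fill is free to differ
```
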
Quantitative Swap Models

Quantitative swap models are similar to binary swap, but they swap the largest permissible value. The models in this section all maintain the fill and perform a quantitative swap only if this can be done without changing the fill. A single swap step often changes the matrix very little. In particular, if cell counts are variable, high values change very slowly. Checking the chain stability and independence is even more crucial than in binary swap, and very strong thinning is often needed. These models should never be used without inspecting their properties for the current data.

"swap_count": sequential algorithm for count matrices. This algorithm finds $2 \times 2$ submatrices that can be swapped leaving column and row totals and fill unchanged. The algorithm finds the largest value in the submatrix that can be swapped ($d$). Swap means that the values in diagonal or antidiagonal positions are decreased by $d$, while the remaining cells are increased by $d$. A swap is made only if fill does not change.

"abuswap_r": sequential algorithm for count or nonnegative real valued matrices with fixed row frequencies (see also permatswap). The algorithm is similar to swap_count, but uses a different swap value for each row of the $2 \times 2$ submatrix. Each step changes the corresponding column sums, but honours matrix fill, row sums, and row/column frequencies (Hardy 2008; randomization scheme 2x).

"abuswap_c": sequential algorithm for count or nonnegative real valued matrices with fixed column frequencies (see also permatswap). The algorithm is similar to the previous one, but operates on the columns of the $2 \times 2$ submatrices. Each step changes the corresponding row sums, but honours matrix fill, column sums, and row/column frequencies (Hardy 2008; randomization scheme 3x).

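The quantitative swap described for "swap_count" can be sketched in base R as follows. This is an assumption-laden illustration of the idea (largest swappable value $d$, accepted only if fill is unchanged), not vegan's compiled code; the helper name swap_count_step is hypothetical, and trying the two swap directions in random order is a choice made for this sketch.

```r
# One attempted quantitative swap (hypothetical helper, base R only).
swap_count_step <- function(m) {
    i <- sample.int(nrow(m), 2)
    j <- sample.int(ncol(m), 2)
    sm <- m[i, j]
    # Largest value that can be moved off the diagonal / antidiagonal
    d <- c(min(sm[1, 1], sm[2, 2]), min(sm[1, 2], sm[2, 1]))
    # +1/-1 pattern (column-major): diagonal decreased, antidiagonal increased
    pattern <- matrix(c(-1L, 1L, 1L, -1L), 2, 2)
    candidates <- list(sm + d[1] * pattern,   # decrease the diagonal by d
                       sm - d[2] * pattern)   # decrease the antidiagonal by d
    for (k in sample(2)) {
        # accept only a real change that leaves the fill untouched
        if (d[k] > 0 && sum(candidates[[k]] > 0) == sum(sm > 0)) {
            m[i, j] <- candidates[[k]]
            break
        }
    }
    m
}

set.seed(3)
x <- matrix(rpois(30, 2), 5, 6)
y <- x
for (step in 1:500) y <- swap_count_step(y)

c(rows = all(rowSums(y) == rowSums(x)),
  cols = all(colSums(y) == colSums(x)),
  fill = sum(y > 0) == sum(x > 0))
```

Because many candidate submatrices are rejected by the fill check, the chain moves slowly, which is consistent with the advice above to use strong thinning and to inspect the chain.
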
Quantitative Swap and Shuffle Models

Quantitative Swap and Shuffle methods (swsh methods) preserve fill and column and row frequencies; the r and c variants also preserve row or column sums, respectively. The methods first perform a binary quasiswap and then shuffle the original quantitative data over the non-zero cells. The _samp methods shuffle the original non-zero cell values and can also be used with non-integer data. The _both methods redistribute individuals randomly among non-zero cells and can only be used with integer data. The shuffling is either free over the whole matrix, or within rows (r methods) or within columns (c methods). Shuffling within a row preserves row sums, and shuffling within a column preserves column sums.

"swsh_samp": non-sequential algorithm for quantitative data (either integer counts or non-integer values). Original non-zero values are shuffled.

"swsh_both": non-sequential algorithm for count data. Individuals are shuffled freely over non-zero cells.

"swsh_samp_r": non-sequential algorithm for quantitative data. Non-zero values (samples) are shuffled separately for each row.

"swsh_samp_c": non-sequential algorithm for quantitative data. Non-zero values (samples) are shuffled separately for each column.

"swsh_both_r": non-sequential algorithm for count matrices. Individuals are shuffled freely for non-zero values within each row.

"swsh_both_c": non-sequential algorithm for count matrices. Individuals are shuffled freely for non-zero values within each column.

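The difference between shuffling samples and shuffling individuals can be sketched in base R. The example covers only the shuffle step over the whole matrix: the binary matrix b stands in for the result of the binary quasiswap, and here it simply reuses the observed occupancy pattern. The helper names shuffle_samp and shuffle_both are hypothetical and this is not vegan's implementation.

```r
# Shuffle step only (hypothetical helpers); occupied cells are given by 'b'.
shuffle_samp <- function(x, b) {
    vals <- x[x > 0]
    out <- array(0, dim(x))
    # permute the original non-zero values over the occupied cells
    out[b > 0] <- vals[sample.int(length(vals))]
    out
}
shuffle_both <- function(x, b) {
    fill <- sum(b > 0)
    out <- array(0L, dim(x))
    # every occupied cell keeps one individual, the rest are spread at random
    out[b > 0] <- 1L + drop(rmultinom(1, sum(x) - fill, rep(1, fill)))
    out
}

set.seed(7)
x <- matrix(rbinom(30, 1, 0.5) * rpois(30, 3), 5, 6)
b <- (x > 0) * 1L   # stand-in for a quasiswapped binary matrix
y1 <- shuffle_samp(x, b)
y2 <- shuffle_both(x, b)

c(sum(x), sum(y1), sum(y2))              # the grand sum is retained
all((y1 > 0) == (b > 0)) && all((y2 > 0) == (b > 0))   # so is the fill
```

shuffle_samp only permutes existing values, so it works for non-integer data too, whereas shuffle_both draws new counts and therefore needs integer input.
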
Quantitative Shuffle Methods

Quantitative shuffle methods are generalizations of binary models r00, r0 and c0. The _ind methods shuffle individuals so that the grand sum, row sums or column sums are the same as in the observed matrix. These methods are similar to r2dtable but with still slacker constraints on marginal sums. The _samp and _both methods first perform the corresponding binary model with similar restrictions on marginal frequencies, and then distribute quantitative values over non-zero cells. The _samp models shuffle original cell values and can therefore also handle non-count real values. The _both models shuffle individuals among non-zero values. In all cases, the shuffling is over the whole matrix in r00_, within rows in r0_ and within columns in c0_.

"r00_ind": non-sequential algorithm for count matrices. This algorithm keeps total sum constant, individuals are shuffled among cells of the matrix.

"r0_ind": non-sequential algorithm for count matrices. This algorithm keeps row sums constant, individuals are shuffled among cells of each row of the matrix.

"c0_ind": non-sequential algorithm for count matrices. This algorithm keeps column sums constant, individuals are shuffled among cells of each column of the matrix.

"r00_samp": non-sequential algorithm for count or nonnegative real valued (mode = "double") matrices. This algorithm keeps total sum constant, cells of the matrix are shuffled.

"r0_samp": non-sequential algorithm for count or nonnegative real valued (mode = "double") matrices. This algorithm keeps row sums constant, cells within each row are shuffled.

"c0_samp": non-sequential algorithm for count or nonnegative real valued (mode = "double") matrices. This algorithm keeps column sums constant, cells within each column are shuffled.

"r00_both": non-sequential algorithm for count matrices. This algorithm keeps total sum constant, cells and individuals among cells of the matrix are shuffled.

"r0_both": non-sequential algorithm for count matrices. This algorithm keeps row sums (and hence the total sum) constant, cells and individuals among cells of each row are shuffled.

"c0_both": non-sequential algorithm for count matrices. This algorithm keeps column sums (and hence the total sum) constant, cells and individuals among cells of each column are shuffled.

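As an illustration of the _ind idea, the sketch below redistributes individuals with a uniform multinomial draw, either over the whole matrix or within each row. The function names are hypothetical and the code is a base R approximation of the description above, not a copy of vegan's internals.

```r
# Hypothetical base R analogues of the "_ind" shuffles.
# Whole matrix: only the grand sum is fixed.
r00_ind_sketch <- function(x)
    array(rmultinom(1, sum(x), rep(1, length(x))), dim(x))

# Within rows: each row sum is fixed.
r0_ind_sketch <- function(x)
    t(apply(x, 1, function(row)
        rmultinom(1, sum(row), rep(1, length(row)))))

set.seed(11)
x <- matrix(rpois(24, 2), 4, 6)
y00 <- r00_ind_sketch(x)
y0 <- r0_ind_sketch(x)

sum(y00) == sum(x)                  # grand sum retained
all(rowSums(y0) == rowSums(x))      # row sums retained
```

A column-wise version (the c0_ analogue) would follow the same pattern with apply(x, 2, ...) and no transpose.
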
Details

The function fun must return an array with dimensions c(nr, nc, n), and must take some of the following arguments:

x: input matrix,

n: number of permuted matrices in output,

nr: number of rows,

nc: number of columns,

rs: vector of row sums,

cs: vector of column sums,

rf: vector of row frequencies (non-zero cells),

cf: vector of column frequencies (non-zero cells),

s: total sum of x,

fill: matrix fill (non-zero cells),

thin: thinning value for sequential algorithms,

...: additional arguments.
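A function suitable for fun can pick the arguments it needs from the list above and absorb the rest with `...`. The sketch below is a hypothetical r0-style generator (the name r0_fun and the test data are made up for illustration); it is called directly here with named arguments, so it runs without vegan.

```r
# Hypothetical 'fun': keeps row frequencies, places presences at random.
# Uses x, n, nr, nc and rf; all other supplied arguments fall into '...'.
r0_fun <- function(x, n, nr, nc, rf, ...) {
    out <- array(0L, c(nr, nc, n))
    for (k in seq_len(n))
        for (i in seq_len(nr))
            out[i, sample.int(nc, rf[i]), k] <- 1L
    out
}

set.seed(5)
x <- matrix(rbinom(40, 1, 0.4), 5, 8)
sims <- r0_fun(x = x, n = 4L, nr = nrow(x), nc = ncol(x),
               rf = rowSums(x > 0), cs = colSums(x), thin = 1L)

dim(sims)                                  # nr, nc, n
all(apply(sims, 3, rowSums) == rowSums(x)) # row frequencies kept in every matrix
```

Such a function could then be supplied as the fun argument of commsim, together with binary = TRUE, isSeq = FALSE and mode = "integer", in the same way as the r00 function in the Examples.
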

References

Gotelli, N.J. & Entsminger, N.J. (2001). Swap and fill algorithms in null model analysis: rethinking the knight's tour. Oecologia 129, 281–291.

Gotelli, N.J. & Entsminger, N.J. (2003). Swap algorithms in null model analysis. Ecology 84, 532–535.

Hardy, O.J. (2008). Testing the spatial phylogenetic structure of local communities: statistical performances of different null models and test statistics on a locally neutral community. Journal of Ecology 96, 914–926.

Jonsson, B.G. (2001). A null model for randomization tests of nestedness in species assemblages. Oecologia 127, 309–313.

Miklós, I. & Podani, J. (2004). Randomization of presence-absence matrices: comments and new algorithms. Ecology 85, 86–92.

Patefield, W.M. (1981). Algorithm AS159. An efficient method of generating r x c tables with given row and column totals. Applied Statistics 30, 91–97.

Wright, D.H., Patterson, B.D., Mikkelson, G.M., Cutler, A. & Atmar, W. (1998). A comparative analysis of nested subset patterns of species composition. Oecologia 113, 1–20.

Examples


## load vegan, which provides commsim, make.commsim and nullmodel
library(vegan)

## write the r00 algorithm
f <- function(x, n, ...) 
    array(replicate(n, sample(x)), c(dim(x), n))
(cs <- commsim("r00", fun=f, binary=TRUE, 
    isSeq=FALSE, mode="integer"))

## retrieving the sequential swap algorithm
(cs <- make.commsim("swap"))

## feeding a commsim object as argument
make.commsim(cs)

## structural constraints
diagfun <- function(x, y) {
    c(sum = sum(y) == sum(x),
        fill = sum(y > 0) == sum(x > 0),
        rowSums = all(rowSums(y) == rowSums(x)),
        colSums = all(colSums(y) == colSums(x)),
        rowFreq = all(rowSums(y > 0) == rowSums(x > 0)),
        colFreq = all(colSums(y > 0) == colSums(x > 0)))
}
evalfun <- function(meth, x, n) {
    m <- nullmodel(x, meth)
    y <- simulate(m, nsim=n)
    out <- rowMeans(sapply(1:dim(y)[3], 
        function(i) diagfun(attr(y, "data"), y[,,i])))
    z <- as.numeric(c(attr(y, "binary"), attr(y, "isSeq"),
        attr(y, "mode") == "double"))
    names(z) <- c("binary", "isSeq", "double")
    c(z, out)
}
x <- matrix(rbinom(10*12, 1, 0.5)*rpois(10*12, 3), 12, 10)
algos <- make.commsim()
a <- t(sapply(algos, evalfun, x=x, n=10))
print(as.table(ifelse(a==1,1,0)), zero.print = ".")
