mrpp: Multi Response Permutation Procedure of Within- versus Among-Group Dissimilarities

Description

Multiple Response Permutation Procedure (MRPP) provides a test of whether there is a significant difference between two or more groups of sampling units.

Usage

mrpp(dat, grouping, permutations = 999, distance = "euclidean",
     weight.type = 1, strata)
meandist(dist, grouping, ...)
## S3 method for class 'meandist':
summary(object, ...)
## S3 method for class 'meandist':
plot(x, cluster = "average", ...)

Arguments

dat

data matrix or data frame in which rows are samples and columns are response variable(s), or a dissimilarity object or a symmetric square matrix of dissimilarities.

grouping

Factor or numeric index for grouping observations.

permutations

Number of permutations to assess the significance of the MRPP statistic, $\delta$.

distance

Choice of distance metric that measures the dissimilarity between two observations. See vegdist for options. This is used only if dat is not a dissimilarity structure or a symmetric square matrix.

weight.type

choice of group weights. See Details below for options.

strata

An integer vector or factor specifying the strata for permutation. If supplied, observations are permuted only within the specified strata.

dist

A dist object of dissimilarities, such as produced by functions dist, vegdist or

object, x

A meandist result object.

cluster

A clustering method for the hclust function. Any hclust method can be used, but perhaps only "average" and "single" make sense.

...

Further arguments passed to functions.

Value

The function returns a list of class mrpp with the following items:

call

Function call.

delta

The overall weighted mean of group mean distances.

E.delta

Expected delta, under the null hypothesis of no group structure. This is the mean of original dissimilarities.

CS

Classification strength (Van Sickle 1997) with weight.type = 3 and NA with other weights.

n

Number of observations in each class.

classdelta

Mean dissimilarities within classes. The overall $\delta$ is the weighted average of these values with the given weight.type.

Pvalue

Significance of the test.

A

A chance-corrected estimate of the proportion of the distances explained by group identity; a value analogous to a coefficient of determination in a linear model.

distance

Choice of distance metric used; the "method" entry of the dist object.

weight.type

The choice of group weights used.

boot.deltas

The vector of "permuted deltas," the deltas calculated from each of the permuted datasets.

permutations

The number of permutations used.

Details

Multiple Response Permutation Procedure (MRPP) provides a test of whether there is a significant difference between two or more groups of sampling units. This difference may be one of location (differences in mean) or one of spread (differences in within-group distance). Function mrpp operates on a data frame or matrix in which rows are observations and columns are response variables. The response(s) may be uni- or multivariate. The method is philosophically and mathematically allied with analysis of variance, in that it compares dissimilarities within and among groups. If two groups of sampling units are really different (e.g. in their species composition), then the average of the within-group compositional dissimilarities ought to be less than the average of the dissimilarities between two random collections of sampling units drawn from the entire population.

The mrpp statistic $\delta$ is simply the overall weighted mean of within-group means of the pairwise dissimilarities among sampling units. The correct choice of group weights is currently not clear. The mrpp function offers three choices: (1) group size ($n$), (2) a degrees-of-freedom analogue ($n-1$), and (3) a weight that is the number of unique distances calculated among $n$ sampling units ($n(n-1)/2$).
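As a rough illustration of how the three weighting schemes enter $\delta$, the following base R sketch computes the statistic by hand. The helper mrpp_delta is hypothetical and is not part of vegan; the sketch assumes Euclidean distances from dist and uses an unbalanced subset of the built-in iris data purely for demonstration.

```r
# Hypothetical helper (not vegan code): weighted mean of the within-group
# mean dissimilarities under the three weighting schemes described above.
mrpp_delta <- function(d, grouping, weight.type = 1) {
  dmat <- as.matrix(d)
  grouping <- factor(grouping)
  n <- as.vector(table(grouping))            # group sizes, in level order
  # mean pairwise dissimilarity within each group
  classdelta <- sapply(levels(grouping), function(g) {
    idx <- which(grouping == g)
    sub <- dmat[idx, idx, drop = FALSE]
    mean(sub[lower.tri(sub)])
  })
  w <- switch(weight.type,
              n,                  # (1) group size
              n - 1,              # (2) degrees-of-freedom analogue
              n * (n - 1) / 2)    # (3) number of unique distances
  sum(w * classdelta) / sum(w)
}

# Unbalanced demonstration data: 20, 50 and 30 observations per species
dat <- iris[c(1:20, 51:100, 101:130), ]
d <- dist(dat[, 1:4])
deltas <- sapply(1:3, function(wt) mrpp_delta(d, dat$Species, wt))
deltas
```

With balanced groups all three weights give the same $\delta$, which is why the sketch drops some rows first. With weights of type 3, $\delta$ reduces to the plain mean of all within-group distances.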

The mrpp algorithm first calculates all pairwise distances in the entire dataset, then calculates $\delta$. It then permutes the sampling units and their associated pairwise distances, and recalculates a $\delta$ based on the permuted data. It repeats the permutation step permutations times. The significance test is simply the fraction of permuted deltas that are less than the observed delta, with a small sample correction. The function also calculates the chance-corrected within-group agreement $A = 1 -\delta/E(\delta)$, where $E(\delta)$ is the expected $\delta$ assessed as the average of permutations.
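The permutation scheme can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not vegan's implementation: delta_for is a hypothetical helper, group sizes are used as weights, group labels are shuffled (which is equivalent to permuting the sampling units), and the small sample correction is assumed to take the common form (1 + count) / (permutations + 1).

```r
set.seed(1)                                  # reproducible permutations
d <- as.matrix(dist(iris[, 1:4]))            # all pairwise distances, computed once
grp <- iris$Species

# Hypothetical helper: delta for a given assignment of units to groups,
# weighting each within-group mean by group size.
delta_for <- function(g) {
  cd <- sapply(levels(g), function(k) {
    sub <- d[g == k, g == k, drop = FALSE]
    mean(sub[lower.tri(sub)])
  })
  w <- as.vector(table(g))
  sum(w * cd) / sum(w)
}

permutations <- 199
delta <- delta_for(grp)                      # observed statistic
perm_deltas <- replicate(permutations, delta_for(sample(grp)))

# Assumed form of the small sample correction: add one to both counts
p_value <- (1 + sum(perm_deltas <= delta)) / (permutations + 1)

E_delta <- mean(perm_deltas)                 # expected delta from the permutations
A <- 1 - delta / E_delta                     # chance-corrected within-group agreement
c(delta = delta, E.delta = E_delta, A = A, Pvalue = p_value)
```

Because the distances are computed only once, each permutation costs no more than re-indexing the distance matrix.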

With weight.type = 3, the function also calculates classification strength (Van Sickle 1997), which is defined as the difference between the average between-group dissimilarity and the average within-group dissimilarity. With weight.type = 3 the classification strength is a simple transformation of $\delta$, and has the same permutation significance.
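The relation between classification strength and $\delta$ can be checked by hand. The base R sketch below is an illustration only: it assumes that both averages are pooled over all pairs of sampling units, and the names W, B, D and CS are local variables chosen for this example.

```r
d <- as.matrix(dist(iris[, 1:4]))
grp <- as.character(iris$Species)

same  <- outer(grp, grp, "==")     # TRUE where both units share a group
lower <- lower.tri(d)              # count each pair once

W <- mean(d[lower & same])         # mean within-group dissimilarity
B <- mean(d[lower & !same])        # mean between-group dissimilarity
D <- mean(d[lower])                # overall mean dissimilarity
CS <- B - W                        # classification strength

# D is unchanged by permutation, so CS is a linear function of W alone
nB <- sum(lower & !same)           # number of between-group pairs
N  <- sum(lower)                   # total number of pairs
CS_from_W <- (D - W) * N / nB
c(CS = CS, CS_from_W = CS_from_W)
```

Since W here equals $\delta$ under weights of type 3 and D does not change when units are permuted, ranking permutations by CS or by $\delta$ gives the same significance.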

If the first argument dat can be interpreted as dissimilarities, they will be used directly. In other cases the function treats dat as observations, and uses vegdist to find the dissimilarities. The default distance is Euclidean as in the traditional use of the method, but other dissimilarities in vegdist also are available.

Function meandist calculates a matrix of mean within-cluster dissimilarities (diagonal) and between-cluster dissimilarities (off-diagonal elements), and an attribute n of grouping counts. Function summary finds the within-class, between-class and overall means of these dissimilarities, and the MRPP statistics with all weight.type options and the classification strength. The function does not provide significance tests for these statistics; for those, use mrpp with the appropriate weight.type. Function plot draws a dendrogram of the result matrix with the given cluster method (see hclust). The terminal segments hang to the within-cluster dissimilarity. If any cluster is more heterogeneous than the combined class, its leaf segment is reversed.
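To make the structure of the mean dissimilarity matrix concrete, here is a minimal base R sketch. The helper mean_dist_matrix is hypothetical and only mimics the layout described above (within-group means on the diagonal, between-group means elsewhere, group counts in attribute n); it is not vegan's implementation.

```r
# Hypothetical helper: matrix of mean within- and between-group dissimilarities
mean_dist_matrix <- function(d, grouping) {
  dmat <- as.matrix(d)
  grouping <- factor(grouping)
  lev <- levels(grouping)
  out <- matrix(NA_real_, length(lev), length(lev), dimnames = list(lev, lev))
  for (i in seq_along(lev)) {
    for (j in seq_len(i)) {
      a <- which(grouping == lev[i])
      b <- which(grouping == lev[j])
      block <- dmat[a, b, drop = FALSE]
      # within a group, use each pair once and skip the zero diagonal
      val <- if (i == j) mean(block[lower.tri(block)]) else mean(block)
      out[i, j] <- val
      out[j, i] <- val
    }
  }
  attr(out, "n") <- table(grouping)          # group counts
  out
}

md <- mean_dist_matrix(dist(iris[, 1:4]), iris$Species)
md
```

A dendrogram like the one drawn by the plot method could then be built by passing the off-diagonal part of such a matrix to hclust, with the diagonal setting how far each leaf hangs.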

References

B. McCune and J. B. Grace. 2002. Analysis of Ecological Communities. MjM Software Design, Gleneden Beach, Oregon, USA.

P. W. Mielke and K. J. Berry. 2001. Permutation Methods: A Distance Function Approach. Springer Series in Statistics. Springer.

J. Van Sickle. 1997. Using mean similarity dendrograms to evaluate classifications. Journal of Agricultural, Biological, and Environmental Statistics 2:370-388.

Examples

library(vegan)  # mrpp, meandist, vegdist, metaMDS, ordihull and the dune data

data(dune)
data(dune.env)
dune.mrpp <- mrpp(dune, dune.env$Management)
dune.mrpp

# Save and change plotting parameters
def.par <- par(no.readonly = TRUE)
layout(matrix(1:2, nrow = 1))

# Left panel: NMDS ordination of sites with hulls around management classes
plot(dune.ord <- metaMDS(dune), type = "text", display = "sites")
ordihull(dune.ord, dune.env$Management)

# Right panel: permuted deltas with the observed delta marked by a line
with(dune.mrpp, {
  fig.dist <- hist(boot.deltas, xlim = range(c(delta, boot.deltas)),
                   main = "Test of Differences Among Groups")
  abline(v = delta)
  text(delta, 2 * mean(fig.dist$counts), adj = -0.5,
       expression(bold(delta)), cex = 1.5)
})

# Restore plotting parameters
par(def.par)

## meandist
dune.md <- meandist(vegdist(dune), dune.env$Management)
dune.md
summary(dune.md)
plot(dune.md)
