prabclust: Clustering of species ranges from presence-absence matrices (mixture method)

Description

Clusters a presence-absence matrix object by computing an MDS configuration from the distances and applying maximum likelihood Gaussian mixture clustering with "noise" (package mclust) to the MDS points. The solution is plotted. A standard execution (using the default distance of prabinit) will be

prabmatrix <- prabinit(file="path/prabmatrixfile",
    neighborhood="path/neighborhoodfile")

clust <- prabclust(prabmatrix)
print(clust)

Note: Data formats are described on the prabinit help page. You may also consider the example datasets kykladspecreg.dat and nb.dat. Take care of the parameter rows.are.species of prabinit.

Note: prabclust calls the function mclustBIC in package mclust. Its use is protected by a special license, see http://www.stat.washington.edu/mclust/license.txt, particularly point 6. An alternative is the use of hprabclust.
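The first stage of the procedure described above (distances between species ranges, then MDS points) can be sketched in base R. This is only an illustration under stated assumptions: the toy matrix is made up, the Jaccard distance from dist(method = "binary") is a stand-in and not necessarily the distance prabinit computes, and the mixture step with noise (mclust) is not reproduced here.

```r
# Toy presence-absence matrix: 6 hypothetical species (rows) x 5 regions (columns).
toy <- matrix(c(1, 1, 1, 0, 0,
                1, 1, 0, 0, 0,
                0, 1, 1, 0, 0,
                0, 0, 0, 1, 1,
                0, 0, 1, 1, 1,
                0, 0, 0, 0, 1),
              nrow = 6, byrow = TRUE,
              dimnames = list(paste0("sp", 1:6), paste0("reg", 1:5)))

# Jaccard distance between species ranges (assumption: used here only as a
# stand-in for whatever distance the prab object carries).
d <- dist(toy, method = "binary")

# Classical (metric) MDS via cmdscale; k plays the role of mdsdim.
# k = 2 is chosen for the toy data, the prabclust default is mdsdim = 4.
mds_points <- cmdscale(d, k = 2)

# One row of coordinates per species; these points would be passed on
# to the mixture clustering step.
print(mds_points)
```

In prabclust itself both stages are handled internally, so a sketch like this is useful mainly for seeing what the points component of the result represents.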

Usage

prabclust(prabobj, mdsmethod = "classical", mdsdim = 4,
    nnk = ceiling(prabobj$n.species/40), nclus = 0:9,
    modelid = "all", permutations = 0)
## S3 method for class 'prabclust':
print(x, bic=FALSE, ...)

Arguments

prabobj

object of class prab as generated by prabinit. Presence-absence data to be analyzed.

mdsmethod

"classical", "kruskal", or "sammon". The MDS method to transform the distances to data points. "classical" indicates metric MDS by function cmdscale, "kruskal" is

mdsdim

integer. Dimension of the MDS points.

nnk

integer. Number of nearest neighbors to determine the initial noise estimation by NNclean. nnk=0 fits the model without a noise component.

nclus

vector of integers. Numbers of clusters to perform the mixture estimation.

modelid

string. Model name for mclustBIC (see the corresponding help page; all models or combinations of models mentioned there are possible). modelid="all" compares all possible models. Additionally, "noVVV" is

permutations

integer. It has been found occasionally that depending on the order of observations the algorithms isoMDS and mclustBIC converge to different solutions. This is because these methods require an ordering of the distanc

x

object of class prabclust. Output of prabclust.

bic

logical. If TRUE, information about the BIC criterion to choose the model is displayed.

...

necessary for summary method.
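As a small worked illustration of the default nnk = ceiling(prabobj$n.species/40) given under Usage, the following evaluates that expression for a few species counts. The counts are hypothetical and are not taken from any dataset.

```r
# Hypothetical numbers of species, for illustration only.
n_species <- c(30, 40, 41, 80, 125)

# Default number of nearest neighbors, following the expression in Usage.
nnk_default <- ceiling(n_species / 40)

print(data.frame(n_species = n_species, nnk = nnk_default))
```

So under the default, the number of nearest neighbors grows by one for every additional 40 species (or part thereof).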

Value

print.prabclust does not produce output. prabclust generates an object of class prabclust. This is a list with components
clustering

vector of integers indicating the cluster memberships of the species. Noise can be recognized by output component symbols.

clustsummary

output object of summary.mclustBIC. A list giving the optimal (according to BIC) parameters, conditional probabilities "z", and loglikelihood, together with the associated classification and its uncertainty. Note that the numbering of clusters may differ from clustering, see csreorder.

bicsummary

output object of mclustBIC. Bayesian Information Criterion for the specified mixture models and numbers of clusters.

points

numerical matrix. MDS configuration.

nnk

see above.

mdsdim

see above.

mdsmethod

see above.

symbols

vector of characters, similar to clustering, but indicating estimated noise and points belonging to one-point-components (which should be interpreted as some kind of noise as well) by "N".

permchange

logical. If TRUE, permutations>0 has been used and the best solution is different from the one obtained by the standard ordering. (This is just for information and has no further operational consequences.)

csreorder

integer vector. This gives the numbering of the components in clustsummary relative to clustering. Usually, clustering and symbols will be used, but in order to use the information in clustsummary (parameter values, posterior assignment probabilities etc.), it has to be taken into account that cluster no. 1 in clustering corresponds to cluster no. csreorder[1] in clustsummary and so on. Noise, if present, is numbered 0 in clustering as well as clustsummary.
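The renumbering described for csreorder can be sketched in base R. The list below is a hypothetical stand-in with made-up values, not real prabclust output; only the component names clustering, symbols and csreorder are taken from this page.

```r
# Hypothetical stand-in for part of a prabclust result (values are made up).
clust <- list(
  clustering = c(1, 2, 0, 1, 2, 2),
  symbols    = c("1", "2", "N", "1", "2", "2"),
  csreorder  = c(2, 1)
)

# Cluster k in clustering corresponds to component csreorder[k] in
# clustsummary; noise keeps the number 0 in both numberings.
# pmax() only protects the index lookup for the noise entries.
component <- ifelse(clust$clustering == 0,
                    0,
                    clust$csreorder[pmax(clust$clustering, 1)])

print(data.frame(clustering = clust$clustering,
                 symbol = clust$symbols,
                 clustsummary_component = component))
```

With a lookup like this, per-component quantities stored in clustsummary can be matched to the species labels in clustering.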

References

Fraley, C. and Raftery, A. E. (1998) How many clusters? Which clustering method? - Answers via Model-Based Cluster Analysis. Computer Journal 41, 578-588.

Hennig, C. and Hausdorf, B. (2004) Distance-based parametric bootstrap tests for clustering of species ranges. Computational Statistics and Data Analysis 45, 875-896. http://stat.ethz.ch/Research-Reports/110.html.

Examples

data(kykladspecreg)
data(nb)
set.seed(1234)
x <- prabinit(prabmatrix=kykladspecreg, neighborhood=nb)
# If you want to use your own ASCII data files, use
# x <- prabinit(file="path/prabmatrixfile",
# neighborhood="path/neighborhoodfile")
print(prabclust(x))
