merge_clusters is built on the premise that a good clustering solution (i.e. a classification) should provide information about the composition and abundance of the multivariate data it is classifying. A natural way to formalize this is with a predictive model, where group membership (clusters) is the predictor and the multivariate data (site by variables matrix) is the response. merge_clusters fits linear models to each pairwise combination of a given set of clusters and calculates the delta sum-of-AIC for each pair (that is, relative to the corresponding null model). The pair with the smallest delta AIC is taken to be the most similar, so it is merged, and the process is repeated. Lyons et al. (2016) provide background, a detailed description of the methodology, and applications of delta AIC to both real and simulated ecological multivariate abundance data.
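The pairwise comparison described above can be sketched in base R. This is a minimal illustration, not the package's implementation: it fits one stats::lm per variable instead of using the package's model-fitting routines, the data are simulated, the helper pair_delta_aic is hypothetical, and the sign convention (null AIC minus cluster-model AIC, summed over variables) is an assumption chosen so that a smaller value means a more similar pair.

```r
# Illustrative stand-in for the pairwise delta sum-of-AIC idea (base R only).
set.seed(1)

# Hypothetical data: 30 sites x 4 variables in three clusters.
# Clusters 1 and 2 share the same means; cluster 3 is clearly different.
clusters <- rep(1:3, each = 10)
means <- rbind(c(0, 0, 0, 0), c(0, 0, 0, 0), c(5, 5, 5, 5))
dat <- means[clusters, ] + matrix(rnorm(30 * 4), nrow = 30)

# Hypothetical helper: for one pair of clusters, sum over variables of
# AIC(null model) - AIC(model with cluster as predictor).
pair_delta_aic <- function(dat, clusters, pair) {
  keep <- clusters %in% pair
  grp <- factor(clusters[keep])
  sum(apply(dat[keep, , drop = FALSE], 2, function(y) {
    AIC(lm(y ~ 1)) - AIC(lm(y ~ grp))
  }))
}

# Evaluate every pairwise combination of clusters.
pairs <- combn(sort(unique(clusters)), 2)
delta <- apply(pairs, 2, function(p) pair_delta_aic(dat, clusters, p))
names(delta) <- apply(pairs, 2, paste, collapse = "-")

# The pair with the smallest delta sum-of-AIC is the merge candidate.
to_merge <- pairs[, which.min(delta)]
print(delta)
print(to_merge)
```

In this simulated example clusters 1 and 2 are drawn from the same distribution, so the cluster predictor adds little over the null model for that pair and it is the one selected for merging.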
At present, merge_clusters supports the following error distributions for model fitting:
Gaussian (LM)
Negative Binomial (GLM with log link)
Poisson (GLM with log link)
Binomial (GLM with cloglog link for binary data, logit link otherwise)
Ordinal (Proportional odds model with logit link)
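To give a feel for the link functions named in the list above, the following sketch uses stats::make.link from base R to map a linear predictor of zero back to the response scale. It only illustrates what the links do and says nothing about how merge_clusters sets them internally.

```r
# Inverse links evaluated at a linear predictor of 0 (base R only).
eta <- 0

mu_log     <- make.link("log")$linkinv(eta)      # log link: exp(0)
p_logit    <- make.link("logit")$linkinv(eta)    # logit link: 1 / (1 + exp(0))
p_cloglog  <- make.link("cloglog")$linkinv(eta)  # cloglog link: 1 - exp(-exp(0))

print(c(log = mu_log, logit = p_logit, cloglog = p_cloglog))
```

The logit link is symmetric about a probability of 0.5, whereas the cloglog link is asymmetric, which is one common reason it is used for presence/absence data.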
Gaussian LMs should be used for 'normal' data. Negative Binomial and Poisson GLMs should be used for count data. Binomial GLMs should be used for binary and presence/absence data (when K=1), or for trials data such as frequency scores (when K>1). If Binomial regression is being used with K>1, then data should be numerical values between 0 and 1, interpreted as the proportion of successful cases, where the total number of cases is given by K (see Details in family). Ordinal regression should be used for ordinal data, for example, cover-abundance scores. For ordinal regression, data should be supplied as either 1) factors, with the appropriate ordinal level order specified (see levels), or 2) numeric values, which will be coerced into a factor with levels in numerical order (e.g. cover-abundance/numeric response scores). LMs are fit via manylm, GLMs via manyglm, and the proportional odds model via clm.
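The input formats described above can be prepared in base R along the following lines. This is a sketch under stated assumptions: the values, the number of trials K = 5, and the cover-abundance labels are all made up for illustration, the use of ordered = TRUE is an assumption rather than a documented requirement, and no call to merge_clusters is shown because its argument names beyond data and K are not given here.

```r
# Binomial with K > 1: frequency scores (successes out of K trials) are
# converted to proportions between 0 and 1.
K <- 5                                            # hypothetical number of trials
freq <- matrix(c(0, 2, 5, 1, 3, 4), nrow = 3)     # hypothetical 3 sites x 2 variables
prop <- freq / K
stopifnot(all(prop >= 0 & prop <= 1))

# Ordinal data as a factor: the level order is stated explicitly, because
# the default (alphabetical) order would not match the ordinal scale.
scores <- c("r", "+", "1", "2", "+", "3")         # hypothetical cover-abundance classes
cover <- factor(scores, levels = c("r", "+", "1", "2", "3"), ordered = TRUE)

# Ordinal data as numeric scores: coercing to a factor places the levels
# in numerical order (so 10 comes after 3, not after 1).
num_scores <- c(3, 1, 10, 2)
num_cover <- factor(num_scores, ordered = TRUE)

print(prop)
print(levels(cover))
print(levels(num_cover))
```

Checking levels() before fitting is a cheap way to confirm that the ordinal scale runs in the intended direction.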