# dbFD

##### Distance-Based Functional Diversity Indices

`dbFD` implements a flexible distance-based framework to compute multidimensional functional diversity (FD) indices. `dbFD` returns the three FD indices of Villéger et al. (2008): functional richness (FRic), functional evenness (FEve), and functional divergence (FDiv), as well as functional dispersion (FDis; Laliberté and Legendre 2010), Rao's quadratic entropy (Q) (Botta-Dukát 2005), a posteriori functional group richness (FGR) (Petchey and Gaston 2006), and the community-level weighted means of trait values (CWM; e.g. Lavorel et al. 2008). Some of these FD indices consider species abundances. `dbFD` includes several options for flexibility.

- Keywords
- multivariate

##### Usage

`dbFD(x, a, w, w.abun = TRUE, stand.x = TRUE, ord = c("podani", "metric"), asym.bin = NULL, corr = c("sqrt", "cailliez", "lingoes", "none"), calc.FRic = TRUE, m = "max", stand.FRic = FALSE, scale.RaoQ = FALSE, calc.FGR = FALSE, clust.type = "ward", km.inf.gr = 2, km.sup.gr = nrow(x) - 1, km.iter = 100, km.crit = c("calinski", "ssi"), calc.CWM = TRUE, CWM.type = c("dom", "all"), calc.FDiv = TRUE, dist.bin = 2, print.pco = FALSE, messages = TRUE)`

##### Arguments

- x
- matrix or data frame of functional traits. Traits can be `numeric`, `ordered`, or `factor`. Binary traits should be `numeric` and only contain 0 and 1. `character` traits will be converted to `factor`. `NA`s are tolerated. `x` can also be a species-by-species distance matrix of class `dist`, in which case `NA`s are not allowed. When there is only one trait, `x` can also be a `numeric` vector, an `ordered` factor, or an unordered `factor`. In all cases, species labels are required.

- a
- matrix containing the abundances of the species in `x` (or presence-absence, i.e. 0 or 1). Rows are sites and columns are species. Can be missing, in which case `dbFD` assumes that there is only one community with equal abundances of all species. `NA`s will be replaced by 0. The number of species (columns) in `a` must match the number of species (rows) in `x`. In addition, the species labels in `a` and `x` must be identical and in the same order.
- w
- vector listing the weights for the traits in `x`. Can be missing, in which case all traits have equal weights.
- w.abun
- logical; should FDis, Rao's Q, FEve, FDiv, and CWM be weighted by the relative abundances of the species?
- stand.x
- logical; if all traits are `numeric`, should they be standardized to mean 0 and unit variance? If not all traits are `numeric`, Gower's (1971) standardization by the range is automatically used; see `gowdis` for more details.
- ord
- character string specifying the method to be used for ordinal traits (i.e. `ordered`). `"podani"` refers to Eqs. 2a-b of Podani (1999), while `"metric"` refers to his Eq. 3. Can be abbreviated. See `gowdis` for more details.
- asym.bin
- vector listing the asymmetric binary variables in `x`. See `gowdis` for more details.
- corr
- character string specifying the correction method to use when the species-by-species distance matrix cannot be represented in a Euclidean space. Options are `"sqrt"`, `"cailliez"`, `"lingoes"`, or `"none"`. Can be abbreviated. Default is `"sqrt"`. See ‘details’ section.
- calc.FRic
- logical; should FRic be computed?
- m
- the number of PCoA axes to keep as ‘traits’ for calculating FRic (when FRic is measured as the convex hull volume) and FDiv. Options are: any integer $>1$, `"min"` (maximum number of traits that allows the $s \geq 2^t$ condition to be met, where $s$ is the number of species and $t$ the number of traits), or `"max"` (maximum number of axes that allows the $s > t$ condition to be met). See ‘details’ section.
- stand.FRic
- logical; should FRic be standardized by the ‘global’ FRic that includes all species, so that FRic is constrained between 0 and 1?
- scale.RaoQ
- logical; should Rao's Q be scaled by its maximal value over all frequency distributions? See `divc`.
- calc.FGR
- logical; should FGR be computed?
- clust.type
- character string specifying the clustering method to be used to create the dendrogram of species for FGR. Options are `"ward"`, `"single"`, `"complete"`, `"average"`, `"mcquitty"`, `"median"`, `"centroid"`, and `"kmeans"`. For `"kmeans"`, other arguments also apply (`km.inf.gr`, `km.sup.gr`, `km.iter`, and `km.crit`). See `hclust` and `cascadeKM` for more details.
- km.inf.gr
- the number of groups for the partition with the smallest number of groups of the cascade (min). Only applies if `calc.FGR` is `TRUE` and `clust.type` is `"kmeans"`. See `cascadeKM` for more details.
- km.sup.gr
- the number of groups for the partition with the largest number of groups of the cascade (max). Only applies if `calc.FGR` is `TRUE` and `clust.type` is `"kmeans"`. See `cascadeKM` for more details.
- km.iter
- the number of random starting configurations for each value of $K$. Only applies if `calc.FGR` is `TRUE` and `clust.type` is `"kmeans"`. See `cascadeKM` for more details.
- km.crit
- criterion used to select the best partition. The default value is `"calinski"` (Calinski-Harabasz 1974). The simple structure index `"ssi"` is also available. Only applies if `calc.FGR` is `TRUE` and `clust.type` is `"kmeans"`. Can be abbreviated. See `cascadeKM` for more details.
- calc.CWM
- logical; should the community-level weighted means of trait values (CWM) be calculated? Can be abbreviated. See `functcomp` for more details.
- CWM.type
- character string indicating how nominal, binary and ordinal traits should be handled for CWM. See `functcomp` for more details.
- calc.FDiv
- logical; should FDiv be computed?
- dist.bin
- only applies when `x` is a single unordered `factor`, in which case `x` is coded using dummy variables. `dist.bin` is an integer between 1 and 10 specifying the appropriate distance measure for binary data. 2 (the default) refers to the simple matching coefficient (Sokal and Michener 1958). See `dist.binary` for the other options.
- print.pco
- logical; should the eigenvalues and PCoA axes be returned?
- messages
- logical; should warning messages be printed in the console?
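
The layout and label-matching requirements for `x` and `a` can be checked before calling `dbFD`. The following is a minimal base-R sketch; the species names, trait names, and values are invented for illustration.

```r
# Hypothetical trait table: species in rows, traits in columns
x <- data.frame(
  height = c(0.2, 1.5, NA, 0.8),   # numeric trait (NA is tolerated)
  woody  = c(0, 1, 1, 0),          # binary trait coded as numeric 0/1
  form   = factor(c("herb", "tree", "shrub", "herb")),
  row.names = c("sp1", "sp2", "sp3", "sp4")
)

# Hypothetical abundance matrix: sites in rows, species in columns
a <- matrix(
  c(3, 0, 1, 2,
    0, 5, 2, NA),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("site1", "site2"), rownames(x))
)

# Species labels must be identical and in the same order
labels_ok <- identical(rownames(x), colnames(a))

# NAs in 'a' are treated as 0; the same step done by hand
a[is.na(a)] <- 0
```

If `labels_ok` is `FALSE`, reordering the columns of `a` with `a[, rownames(x)]` is one way to align the two objects, provided the label sets are the same.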

##### Details

Typical usage is

`dbFD(x, a, ...)`

If `x` is a matrix or a data frame that contains only continuous traits and no `NA`s, and no weights are specified (i.e. `w` is missing), a species-species Euclidean distance matrix is computed via `dist`. Otherwise, a Gower dissimilarity matrix is computed via `gowdis`. If `x` is a distance matrix, it is taken as is.
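
For the all-numeric case, the distance step can be sketched in base R. This is an illustration with made-up trait values, not the package's internal code; `scale` stands in here for the standardization requested by `stand.x = TRUE`.

```r
# Hypothetical numeric trait matrix (species in rows)
x <- matrix(
  c(1.0, 10,
    2.0, 30,
    4.0, 20),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("sp1", "sp2", "sp3"), c("trait1", "trait2"))
)

# Standardize each trait to mean 0 and unit variance
x_std <- scale(x)

# Species-by-species Euclidean distance matrix
d <- dist(x_std)
```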

When `x` is a single trait, species with `NA`s are first excluded to avoid `NA`s in the distance matrix. If `x` is a single continuous trait (i.e. of class `numeric`), a species-species Euclidean distance matrix is computed via `dist`. If `x` is a single ordinal trait (i.e. of class `ordered`), `gowdis` is used and argument `ord` applies. If `x` is a single nominal trait (i.e. an unordered `factor`), the trait is converted to dummy variables and a distance matrix is computed via `dist.binary`, following argument `dist.bin`.

Once the species-species distance matrix is obtained, `dbFD` checks whether it is Euclidean. This is done via `is.euclid`. PCoA axes corresponding to negative eigenvalues are imaginary axes that cannot be represented in a Euclidean space, but simply ignoring these axes would lead to biased estimations of FD. Hence in `dbFD` one of four correction methods is used, following argument `corr`. `"sqrt"` simply takes the square root of the distances. However, this approach does not work for all coefficients, in which case `dbFD` will stop and tell the user to select another correction method. `"cailliez"` refers to the approach described by Cailliez (1983) and is implemented via `cailliez`. `"lingoes"` refers to the approach described by Lingoes (1971) and is implemented via `lingoes`. `"none"` creates a distance matrix with only the positive eigenvalues of the Euclidean representation via `quasieuclid`. See Legendre and Legendre (1998) and Legendre and Anderson (1999) for more details on these corrections.
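
The link between negative eigenvalues and the square-root correction can be illustrated in base R. The sketch below is not how `is.euclid` or `dbFD` are implemented; the helper `pcoa_eigenvalues`, the tolerance, and the three-species dissimilarities are assumptions made for the example.

```r
# Eigenvalues of the Gower-centred matrix used in PCoA.
# 'd' is a 'dist' object or a symmetric distance matrix.
pcoa_eigenvalues <- function(d) {
  D <- as.matrix(d)
  n <- nrow(D)
  A <- -0.5 * D^2
  J <- diag(n) - matrix(1 / n, n, n)   # centring matrix
  B <- J %*% A %*% J
  eigen(B, symmetric = TRUE, only.values = TRUE)$values
}

# Hypothetical dissimilarities that violate the triangle inequality
# (1 + 1 < 3), so they cannot be represented in a Euclidean space
d_raw <- as.dist(matrix(c(0, 1, 3,
                          1, 0, 1,
                          3, 1, 0), nrow = 3))

eig_raw  <- pcoa_eigenvalues(d_raw)        # includes a negative eigenvalue
eig_sqrt <- pcoa_eigenvalues(sqrt(d_raw))  # after a "sqrt"-style correction

tol <- 1e-8
is_euclid_raw  <- min(eig_raw)  > -tol
is_euclid_sqrt <- min(eig_sqrt) > -tol
```

In this small case the square root is enough to remove the negative eigenvalue; for other coefficients it may not be, which is when one of the other corrections is needed.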

Principal coordinates analysis (PCoA) is then performed (via `dudi.pco`) on the *corrected* species-species distance matrix. The resulting PCoA axes are used as the new ‘traits’ to compute the three indices of Villéger et al. (2008): FRic, FEve, and FDiv. For FEve, there is no limit on the number of traits that can be used, so all PCoA axes are used. On the other hand, FRic and FDiv both rely on finding the minimum convex hull that includes all species (Villéger et al. 2008). This requires more species than traits. To circumvent this problem, `dbFD` takes only a subset of the PCoA axes as traits via argument `m`. This, however, comes at a cost of loss of information. The quality of the resulting reduced-space representation is returned by `qual.FRic`, which is computed as described by Legendre and Legendre (1998) and can be interpreted as an $R^2$-like ratio.
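
The two rules for `m` and an $R^2$-like quality ratio can be illustrated with a few lines of base R. This is only a sketch: the function names are invented, the eigenvalues are made up, and the ratio shown is the simple form that applies when no eigenvalue is negative, which may differ from the exact computation behind `qual.FRic`.

```r
# Largest number of axes t allowed by each rule, for s species
m_max <- function(s) s - 1             # largest t with s > t
m_min <- function(s) floor(log2(s))    # largest t with s >= 2^t

# R^2-like quality of keeping the first m axes, assuming all
# eigenvalues are non-negative (i.e. a Euclidean distance matrix)
reduced_space_quality <- function(eig, m) {
  sum(eig[seq_len(m)]) / sum(eig)
}

eig <- c(5, 3, 1.5, 0.5)   # hypothetical PCoA eigenvalues, in decreasing order

q2 <- reduced_space_quality(eig, m = 2)   # share of variation on 2 axes
```

With ten species, `m_max(10)` gives 9 axes and `m_min(10)` gives 3, which shows how much more conservative the `"min"` rule is.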

In `dbFD`, FRic is generally measured as the convex hull volume, but when there is only one continuous trait it is measured as the range (or the range of the ranks for an ordinal trait). Conversely, when only nominal and ordinal traits are present, FRic is measured as the number of unique trait value combinations in a community. FEve and FDiv, but not FRic, can account for species relative abundances, as described by Villéger et al. (2008).
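
A two-dimensional version of the convex hull measure, and the range used for a single continuous trait, can be sketched in base R. This is an illustration only: it uses `grDevices::chull` and the shoelace formula for an area in two dimensions, not the general convex hull volume routine that the package relies on, and all coordinates are invented.

```r
# Hypothetical coordinates of five species on two axes;
# the last species lies inside the hull of the other four
traits <- cbind(axis1 = c(0, 1, 1, 0, 0.5),
                axis2 = c(0, 0, 1, 1, 0.5))

# Area of the 2-D convex hull via the shoelace formula
hull_area <- function(xy) {
  h <- xy[grDevices::chull(xy), , drop = FALSE]  # hull vertices, in order
  nxt <- c(2:nrow(h), 1)                         # index of the next vertex
  abs(sum(h[, 1] * h[nxt, 2] - h[nxt, 1] * h[, 2])) / 2
}

fric_2d <- hull_area(traits)   # the interior species does not change the area

# Single continuous trait: richness measured as the range
fric_1d <- diff(range(c(2.1, 3.4, 7.9)))
```

The interior species contributes nothing to the hull, which is why FRic depends only on the most extreme species and ignores abundances.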

Functional dispersion (FDis; Laliberté and Legendre 2010) is computed from the *uncorrected* species-species distance matrix via `fdisp`. Axes with negative eigenvalues are corrected following the approach of Anderson (2006). When all species have equal abundances (i.e. presence-absence data), FDis is simply the average distance to the centroid (i.e. multivariate dispersion) as originally described by Anderson (2006). Multivariate dispersion has been proposed as an index of beta diversity (Anderson et al. 2006). However, Laliberté and Legendre (2010) have extended it to an FD index. FDis can account for relative abundances by shifting the position of the centroid towards the most abundant species, and then computing a weighted average distance to this new centroid, using again the relative abundances as weights (Laliberté and Legendre 2010). FDis has no upper limit and requires at least two species to be computed. For communities composed of only one species, `dbFD` returns an FDis value of 0. FDis is by construction unaffected by species richness; it can be computed from any distance or dissimilarity measure (Anderson et al. 2006), it can handle any number and type of traits (including more traits than species), and it is not strongly influenced by outliers.
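
The weighted-centroid idea behind FDis can be sketched for the simple case where species coordinates are already Euclidean. The function below is a hypothetical helper, not `fdisp`: it works directly on trait coordinates and does not handle negative eigenvalues.

```r
# FDis for one community from Euclidean trait coordinates.
# 'traits': species-by-axes matrix; 'abun': abundances of the same species.
fdis_euclid <- function(traits, abun) {
  traits <- as.matrix(traits)
  present <- abun > 0
  if (sum(present) < 2) return(0)              # single-species community
  w <- abun[present] / sum(abun[present])      # relative abundances
  tr <- traits[present, , drop = FALSE]
  centroid <- colSums(tr * w)                  # abundance-weighted centroid
  dist_to_centroid <- sqrt(rowSums(sweep(tr, 2, centroid)^2))
  sum(w * dist_to_centroid)                    # weighted mean distance
}

traits <- matrix(c(0, 2), ncol = 1)   # two hypothetical species, one trait

fdis_equal   <- fdis_euclid(traits, c(1, 1))   # centroid at 1
fdis_unequal <- fdis_euclid(traits, c(3, 1))   # centroid shifted to 0.5
fdis_single  <- fdis_euclid(traits, c(4, 0))   # only one species present
```

With unequal abundances the centroid moves towards the dominant species, so the weighted mean distance drops from 1 to 0.75 in this example.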

Rao's quadratic entropy (Q) is computed from the *uncorrected* species-species distance matrix via `divc`. See Botta-Dukát (2005) for details. Rao's Q is conceptually similar to FDis, and simulations (via `simul.dbFD`) have shown high positive correlations between the two indices (Laliberté and Legendre 2010). Still, one potential advantage of FDis over Rao's Q is that in the unweighted case (i.e. with presence-absence data), it opens possibilities for formal statistical tests for differences in FD between two or more communities through a distance-based test for homogeneity of multivariate dispersions (Anderson 2006); see `betadisper` for more details.
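
One common way of writing Rao's Q is as the abundance-weighted sum of dissimilarities over unordered species pairs. The base-R sketch below follows that convention with invented data; `divc` may scale or transform the distances differently, so its values need not match this hand computation.

```r
# Rao's Q as the abundance-weighted sum of dissimilarities over
# unordered species pairs: Q = sum_{i<j} d_ij * p_i * p_j
rao_q <- function(d, abun) {
  D <- as.matrix(d)
  p <- abun / sum(abun)          # relative abundances
  sum(D * outer(p, p)) / 2       # the full double sum counts each pair twice
}

# Three hypothetical species, all pairwise dissimilarities equal to 1
d <- as.dist(matrix(1, 3, 3) - diag(3))

q_even   <- rao_q(d, c(1, 1, 1))   # equal abundances
q_single <- rao_q(d, c(1, 0, 0))   # one species only
```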

Functional group richness (FGR) is based on the classification of the species by the user from visual inspection of a dendrogram. Method `"kmeans"` is also available by calling `cascadeKM`. In that case, the Calinski-Harabasz (1974) criterion or the simple structure index (SSI) can be used to estimate the number of functional groups; see `cascadeKM` for more details. FGR returns the number of functional groups per community, as well as the abundance of each group in each community.
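
The dendrogram route to FGR can be sketched with `hclust` and `cutree` from base R. This is a simplified illustration with invented data: the number of groups `g` is fixed in the code rather than chosen from a plot, and `"ward.D"` is used as one of the Ward variants available in `stats::hclust`.

```r
# Hypothetical single trait for four species
trait <- c(sp1 = 0, sp2 = 0.1, sp3 = 5, sp4 = 5.1)
d <- dist(trait)

# Dendrogram of species
tree <- hclust(d, method = "ward.D")

# Cut into g groups (normally chosen after inspecting the dendrogram)
g <- 2
spfgr <- cutree(tree, k = g)

# Hypothetical abundances: sites in rows, species in columns
a <- matrix(c(1, 1, 0, 0,
              1, 0, 2, 3),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("com1", "com2"), names(trait)))

# Abundance of each functional group in each community
gr_abun <- t(rowsum(t(a), group = spfgr))

# Number of functional groups present in each community
fgr <- rowSums(gr_abun > 0)
```

Here `com1` contains species from one group only, while `com2` contains both groups, so the sketch gives FGR values of 1 and 2.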

The community-level weighted means of trait values (CWM) is an index of functional composition (Lavorel et al. 2008), and is computed via `functcomp`. Species with `NA`s for a given trait are excluded for that trait.
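
For a single numeric trait, the CWM of one community is an abundance-weighted mean in which species lacking a trait value are dropped. The sketch below shows that idea with base R's `weighted.mean`; it is not the code of `functcomp`, and the values are invented.

```r
# CWM of one numeric trait in one community, dropping species
# whose trait value is NA (their weights are dropped as well)
cwm_numeric <- function(trait, abun) {
  weighted.mean(trait, w = abun, na.rm = TRUE)
}

trait <- c(sp1 = 1, sp2 = NA, sp3 = 3)   # hypothetical trait values
abun  <- c(sp1 = 1, sp2 = 5, sp3 = 3)    # hypothetical abundances

cwm <- cwm_numeric(trait, abun)          # (1 * 1 + 3 * 3) / (1 + 3)
```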

##### Value

- nbsp
- vector listing the number of species in each community
- sing.sp
- vector listing the number of functionally singular species in each community. If all species are functionally different, `sing.sp` will be identical to `nbsp`.
- FRic
- vector listing the FRic of each community
- qual.FRic
- quality of the reduced-space representation required to compute FRic and FDiv.
- FEve
- vector listing the FEve of each community
- FDiv
- vector listing the FDiv of each community. Only returned if `calc.FDiv` is `TRUE`.
- FDis
- vector listing the FDis of each community
- RaoQ
- vector listing the Rao's quadratic entropy (Q) of each community
- FGR
- vector listing the FGR of each community. Only returned if `calc.FGR` is `TRUE`.
- spfgr
- vector specifying functional group membership for each species. Only returned if `calc.FGR` is `TRUE`.
- gr.abun
- matrix containing the abundances of each functional group in each community. Only returned if `calc.FGR` is `TRUE`.
- CWM
- data frame containing the community-level weighted trait means (CWM). Only returned if `calc.CWM` is `TRUE`.
- x.values
- eigenvalues from the PCoA. Only returned if `print.pco` is `TRUE`.
- x.axes
- PCoA axes. Only returned if `print.pco` is `TRUE`.

##### Note

`dbFD` borrows code from the `F_RED` function of Villéger et al. (2008).

##### Warning

Users often report that `dbFD` crashed during their analysis. Generally this occurs under Windows, and is almost always due to the computation of convex hull volumes. Possible solutions are to set `calc.FRic = FALSE`, or to reduce the dimensionality of the trait matrix using the `m` argument.

##### References

Anderson, M. J. (2006) Distance-based tests for homogeneity of multivariate dispersions. *Biometrics* **62**:245-253.

Anderson, M. J., K. E. Ellingsen and B. H. McArdle (2006) Multivariate dispersion as a measure of beta diversity. *Ecology Letters* **9**:683-693.

Botta-Dukát, Z. (2005) Rao's quadratic entropy as a measure of functional diversity based on multiple traits. *Journal of Vegetation Science* **16**:533-540.

Cailliez, F. (1983) The analytical solution of the additive constant problem. *Psychometrika* **48**:305-310.

Calinski, T. and J. Harabasz (1974) A dendrite method for cluster analysis. *Communications in Statistics* **3**:1-27.

Gower, J. C. (1971) A general coefficient of similarity and some of its properties. *Biometrics* **27**:857-871.

Laliberté, E. and P. Legendre (2010) A distance-based framework for measuring functional diversity from multiple traits. *Ecology* **91**:299-305.

Lavorel, S., K. Grigulis, S. McIntyre, N. S. G. Williams, D. Garden, J. Dorrough, S. Berman, F. Quétier, A. Thebault and A. Bonis (2008) Assessing functional diversity in the field - methodology matters! *Functional Ecology* **22**:134-147.

Legendre, P. and M. J. Anderson (1999) Distance-based redundancy analysis: testing multispecies responses in multifactorial ecological experiments. *Ecological Monographs* **69**:1-24.

Legendre, P. and L. Legendre (1998) *Numerical Ecology*. 2nd English edition. Amsterdam: Elsevier.

Lingoes, J. C. (1971) Some boundary conditions for a monotone analysis of symmetric matrices. *Psychometrika* **36**:195-203.

Podani, J. (1999) Extending Gower's general coefficient of similarity to ordinal characters. *Taxon* **48**:331-340.

Sokal, R. R. and C. D. Michener (1958) A statistical method for evaluating systematic relationships. *The University of Kansas Scientific Bulletin* **38**:1409-1438.

Villéger, S., N. W. H. Mason and D. Mouillot (2008) New multidimensional functional diversity indices for a multifaceted framework in functional ecology. *Ecology* **89**:2290-2301.

##### See Also

`gowdis`, `functcomp`, `fdisp`, `simul.dbFD`, `divc`, `treedive`, `betadisper`

##### Examples

```
# the 'dummy' and 'tussock' data sets ship with the FD package
library(FD)

# mixed trait types, NA's
ex1 <- dbFD(dummy$trait, dummy$abun)
ex1
# add variable weights
# 'cailliez' correction is used because 'sqrt' does not work
w<-c(1, 5, 3, 2, 5, 2, 6, 1)
ex2 <- dbFD(dummy$trait, dummy$abun, w, corr="cailliez")
# if 'x' is a distance matrix
trait.d <- gowdis(dummy$trait)
ex3 <- dbFD(trait.d, dummy$abun)
ex3
# one numeric trait, one NA
num1 <- dummy$trait[,1] ; names(num1) <- rownames(dummy$trait)
ex4 <- dbFD(num1, dummy$abun)
ex4
# one ordered trait, one NA
ord1 <- dummy$trait[,5] ; names(ord1) <- rownames(dummy$trait)
ex5 <- dbFD(ord1, dummy$abun)
ex5
# one nominal trait, one NA
fac1 <- dummy$trait[,3] ; names(fac1) <- rownames(dummy$trait)
ex6 <- dbFD(fac1, dummy$abun)
ex6
# example with real data from New Zealand short-tussock grasslands
# 'lingoes' correction used because 'sqrt' does not work in that case
ex7 <- dbFD(tussock$trait, tussock$abun, corr = "lingoes")
## Not run:
# # calc.FGR = T, 'ward'
# ex7 <- dbFD(dummy$trait, dummy$abun, calc.FGR = T)
# ex7
#
# # calc.FGR = T, 'kmeans'
# ex8 <- dbFD(dummy$trait, dummy$abun, calc.FGR = T,
# clust.type = "kmeans")
# ex8
#
# # ward clustering to compute FGR
# ex9 <- dbFD(tussock$trait, tussock$abun,
# corr = "cailliez", calc.FGR = TRUE, clust.type = "ward")
# # choose 'g' for number of groups
# # 6 groups seems to make good ecological sense
# ex9
#
# # however, calinski criterion in 'kmeans' suggests
# # that 6 groups may not be optimal
# ex10 <- dbFD(tussock$trait, tussock$abun, corr = "cailliez",
# calc.FGR = TRUE, clust.type = "kmeans", km.sup.gr = 10)
# ## End(Not run)
```

*Documentation reproduced from package FD, version 1.0-12, License: GPL-2*