dispRity.metric: Disparity metrics

Description

Different implemented disparity metrics.

Usage

dimension.level3.fun(matrix, ...)
dimension.level2.fun(matrix, ...)
dimension.level1.fun(matrix, ...)
between.groups.fun(matrix, matrix2, ...)

Arguments

matrix

A matrix.

...

Optional arguments to be passed to the function. Common optional arguments are method, which specifies the distance calculation method passed to vegdist (e.g. method = "euclidean", the default, or method = "manhattan"), and k.root, which scales the result using the $k$th root. See details below for the optional arguments available for each function.

matrix2

Optional, a second matrix for metrics between groups.

Details

These are inbuilt functions for calculating disparity. See make.metric for details on dimension.level3.fun, dimension.level2.fun, dimension.level1.fun and between.groups.fun. The dimension levels (1, 2 and 3) can be seen as similar to ranks in linear algebra.
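As an illustration of the dimension levels, the sketch below uses base R stand-ins rather than dispRity functions. It assumes the convention described in make.metric, where a level 3 function returns a matrix, a level 2 function returns a vector and a level 1 function returns a single value. The object names (space, level3_out, level2_out, level1_out) are hypothetical.

```r
## Base R stand-ins only: none of these calls are dispRity internals
set.seed(1)
space <- matrix(rnorm(50), nrow = 10, ncol = 5)

## Level 3: matrix in, matrix out (here the 5 x 5 variance-covariance matrix)
level3_out <- var(space)

## Level 2: matrix in, vector out (here one variance per axis)
level2_out <- apply(space, 2, var)

## Level 1: vector or matrix in, single value out (here the sum of variances)
level1_out <- sum(level2_out)

dim(level3_out)
length(level2_out)
length(level1_out)
```

Chaining the three outputs mirrors how metrics of decreasing dimension level can be combined: the diagonal of the level 3 output is the level 2 output, which the level 1 function then summarises.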

The currently implemented dimension-level 1 metrics are:

convhull.volume: calculates the convex hull hypervolume of a matrix (calls convhulln(x, options = "FA")$vol).
- Both convhull functions call the convhulln function with the "FA" option (computes total area and volume).
- WARNING: both convhull functions can be computationally intensive above 10 dimensions!
convhull.surface: calculates the convex hull hypersurface of a matrix (calls convhulln(x, options = "FA")$area).
diagonal: calculates the longest distance in the ordinated space.
- WARNING: this function is a generalisation of Pythagoras' theorem and thus works only if all dimensions are orthogonal to each other.
ellipse.volume: calculates the ellipsoid volume of a matrix.
- WARNING: this function assumes that the input matrix is ordinated and calculates the matrix's eigenvalues as abs(apply(var(matrix), 2, sum)) (which is equivalent to eigen(var(matrix))$values but faster). These values are the correct eigenvalues for any matrix but differ from the ones output by cmdscale and pcoa because the latter have their eigenvalues multiplied by the number of elements minus one (i.e. abs(apply(var(matrix), 2, sum)) * (nrow(matrix) - 1)). Specific eigenvalues can always be provided manually through ellipse.volume(matrix, eigen.value = my_val) (or dispRity(matrix, metric = ellipse.volume, eigen.value = my_val)).
func.div: The functional divergence (Villéger et al. 2008): the ratio of deviation from the centroid (this is similar to FD::dbFD()$FDiv).
func.eve: The functional evenness (Villéger et al. 2008): the minimal spanning tree distances evenness (this is similar to FD::dbFD()$FEve). If the matrix used is not a distance matrix, the distance method can be passed using, for example, method = "euclidean" (default).
mode.val: calculates the modal value of a vector.
n.ball.volume: calculates the volume of the minimum n-ball (if sphere = TRUE) or of the ellipsoid (if sphere = FALSE).

See also mean, median, sum or prod for commonly used summary metrics.
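The eigenvalue relationship described in the ellipse.volume warning above can be checked with base R alone. The sketch below is only an illustration under stated assumptions: it uses prcomp scores as the ordinated matrix, cmdscale on Euclidean distances for the comparison, and hypothetical object names (raw, scores, eig_fast, eig_full, pco). It does not call ellipse.volume itself.

```r
## A random matrix with more elements (rows) than dimensions (columns)
set.seed(42)
raw <- matrix(rnorm(80), nrow = 20, ncol = 4)

## Ordinate it: the PCA scores have uncorrelated axes
scores <- prcomp(raw)$x

## The column sums of the variance-covariance matrix (the fast route)
eig_fast <- unname(abs(apply(var(scores), 2, sum)))

## The full eigen decomposition of the same matrix
eig_full <- eigen(var(scores))$values

## Both routes agree for an ordinated matrix
all.equal(eig_fast, eig_full)

## cmdscale eigenvalues are the same values multiplied by (n - 1)
pco <- cmdscale(dist(raw), k = 4, eig = TRUE)
all.equal(pco$eig[1:4], eig_fast * (nrow(raw) - 1))
```

The agreement between the two routes relies on the axes being uncorrelated, which is why the warning stresses that the input should be an ordinated matrix.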

The currently implemented dimension-level 2 metrics are:

ancestral.dist: calculates the distance between each tip and node and their ancestor. This function needs either (1) a matrix/list from nodes.coordinates; or (2) a tree ("phylo") and a full ("logical") argument to calculate the node coordinates for the direct descendants (full = FALSE) or all descendants down to the root (full = TRUE). NOTE: distance is calculated as "euclidean" by default, this can be changed using the method argument.
angles: calculates the angles of the main axis of variation per dimension in a matrix. The angles are calculated using the least square algorithm from the lm function. The unit of the angle can be changed through the unit argument (either "degree" (default), "radian" or "slope") and a base angle to measure the angle from can be passed through the base argument (by default base = 0, measuring the angle from the horizontal line; note that the base argument has to be passed in the same unit as unit). When estimating the slope through lm, you can use the option significant to only consider significant slopes (TRUE) or not (FALSE - default).
centroids: calculates the distance between each row and the centroid of the matrix (Laliberté 2010). This function can take an optional argument centroid for defining the centroid (if missing (default), the centroid of the matrix is used). This argument can be either a set of coordinates matching the matrix's dimensions (e.g. c(0, 1, 2) for a matrix with three columns) or a single value to be used for all the coordinates of the centroid (e.g. centroid = 0 will set the centroid coordinates to c(0, 0, 0) for a three dimensional matrix). NOTE: distance is calculated as "euclidean" by default, this can be changed using the method argument.
deviations: calculates the minimal Euclidean distance between each element and the hyperplane (or a line if 2D, or a plane if 3D). You can specify the equation of a hyperplane of d dimensions in the $intercept + ax + by + ... + nd = 0$ format. For example the line $y = 3x + 1$ should be entered as c(1, 3, -1) or the plane $x + 2y - 3z = 44$ as c(-44, 1, 2, -3). If missing (default), the hyperplane is calculated using a least square regression using a gaussian glm. Extra arguments can be passed to glm through .... When estimating the hyperplane, you can use the option significant to only consider significant slopes (TRUE) or not (FALSE - default).
displacements: calculates the ratio between the distance to the centroid (see centroids above) and the distance from a reference (by default the origin of the space). The reference can be changed through the reference argument. NOTE: distance is calculated as "euclidean" by default, this can be changed using the method argument.
neighbours: calculates the distance to a neighbour (Foote 1990). By default this is the distance to the nearest neighbour (which = min) but can be set to any dimension-level 1 function (e.g. which = mean gives the distance to the most average neighbour). NOTE: distance is calculated as "euclidean" by default, this can be changed using the method argument.
pairwise.dist: calculates the pairwise distance between elements - calls vegdist(matrix, method = method, diag = FALSE, upper = FALSE, ...) (Foote 1990). The distance type can be changed via the method argument (see vegdist - default: method = "euclidean"). This function outputs a vector of pairwise comparisons in the following order: d(A,B), d(A,C), d(B,C) for three elements A, B and C. NOTE: distance is calculated as "euclidean" by default, this can be changed using the method argument.
quantiles: calculates the quantile range of each axis of the matrix. The quantile can be changed using the quantile argument (default is quantile = 95, i.e. calculating the range on each axis that includes 95% of the data). An optional argument, k.root, can be set to TRUE to scale the ranges by using their $k$th root (where $k$ is the number of dimensions). By default, k.root = FALSE.
radius: calculates a distance from the centre of each axis. The type argument is the function used to select which distance to calculate. By default type = max calculates the maximum distance between the elements and the centre for each axis (i.e. the radius for each dimension).
ranges: calculates the range of each axis of the matrix (Wills 2001). An optional argument, k.root, can be set to TRUE to scale the ranges by using their $k$th root (where $k$ is the number of dimensions). By default, k.root = FALSE.
variances: calculates the variance of each axis of the matrix (Wills 2001). This function can also take the k.root optional argument described above.
span.tree.length: calculates the length of the minimum spanning tree (see spantree). This function can get slow with big matrices. To speed it up, one can directly use distance matrices as the multidimensional space.
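To make the hyperplane format used by deviations more concrete, here is a minimal base R sketch of the point-to-hyperplane distance for an equation written as intercept + ax + by + ... = 0. The helper hyperplane_distance is hypothetical and written for illustration only; it is not the package's implementation and it does not estimate the hyperplane by regression.

```r
## Hypothetical helper (not part of dispRity): Euclidean distance between
## each row of `points` and the hyperplane intercept + a*x + b*y + ... = 0,
## encoded as c(intercept, a, b, ...)
hyperplane_distance <- function(points, hyperplane) {
    intercept <- hyperplane[1]
    slopes <- hyperplane[-1]
    ## |intercept + a*x + b*y + ...| divided by the norm of the slopes
    drop(abs(intercept + points %*% slopes) / sqrt(sum(slopes^2)))
}

## The line y = 3x + 1 rewritten as 1 + 3x - y = 0
line <- c(1, 3, -1)

## Three points: (0, 1) and (1, 4) sit on the line, (0, 0) does not
pts <- rbind(c(0, 1), c(0, 0), c(1, 4))
hyperplane_distance(pts, line)
```

Points lying on the line get a distance of zero, which is a quick way to check that an equation has been encoded with the intended signs.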

The currently implemented between.groups metrics are:

group.dist: calculates the distance between two groups (by default, this is the minimum euclidean vector norm distance between groups). Negative distances are considered as 0. This function must intake two matrices (matrix and matrix2) and the quantiles to consider. For the minimum distance between two groups, the 100
point.dist: calculates the distance between matrix and a point calculated from matrix2. By default, this point is the centroid of matrix2. This can be changed by passing a function to be applied to matrix2 through the point argument (for example, for the centroid: point.dist(..., point = colMeans)). NOTE: distance is calculated as "euclidean" by default, this can be changed using the method argument.
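The idea behind point.dist can be sketched in base R as follows. This is an illustration under assumptions, not the package's code: the point is taken as the centroid of the second matrix via colMeans, and the Euclidean and Manhattan distances are computed by hand. The object names (group1, group2, point, dist_euclidean, dist_manhattan) are hypothetical.

```r
## Two groups living in the same five dimensional space
set.seed(3)
group1 <- matrix(rnorm(30), nrow = 6, ncol = 5)
group2 <- matrix(runif(20, min = 2, max = 4), nrow = 4, ncol = 5)

## The point calculated from the second group (here its centroid)
point <- colMeans(group2)

## Differences between each row of group1 and that point
differences <- sweep(group1, 2, point)

## Euclidean distance from each row of group1 to the point
dist_euclidean <- sqrt(rowSums(differences^2))

## Manhattan distance from each row of group1 to the point
dist_manhattan <- rowSums(abs(differences))

dist_euclidean
dist_manhattan
```

One distance is returned per row of the first matrix, which is why the result can then be summarised with a dimension-level 1 function such as mean.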

When used in the dispRity function, optional arguments are declared after the metric argument: for example dispRity(data, metric = centroids, centroid = 0, method = "manhattan")

References

Donohue I, Petchey OL, Montoya JM, Jackson AL, McNally L, Viana M, Healy K, Lurgi M, O'Connor NE, Emmerson MC. 2013. On the dimensionality of ecological stability. Ecology letters. 16(4):421-9.

Laliberté E, Legendre P. 2010. A distance-based framework for measuring functional diversity from multiple traits. Ecology, 91(1), pp.299-305.

Villéger S, Mason NW, Mouillot D. 2008. New multidimensional functional diversity indices for a multifaceted framework in functional ecology. Ecology. 89(8):2290-301.

Wills MA. 2001. Morphological disparity: a primer. In Fossils, phylogeny, and form (pp. 55-144). Springer, Boston, MA.

Foote, M. 1990. Nearest-neighbor analysis of trilobite morphospace. Systematic Zoology, 39(4), pp.371-382.

Examples

# NOT RUN {
## A random matrix
dummy_matrix <- matrix(rnorm(90), 9, 10)

## ancestral.dist
## A random tree with node labels
rand_tree <- rtree(5)
rand_tree$node.label <- paste0("n", 1:4)
## Adding the tip and node names to the matrix
rownames(dummy_matrix) <- c(rand_tree$tip.label, rand_tree$node.label)
## Calculating the direct ancestral nodes
direct_anc_centroids <- nodes.coordinates(dummy_matrix, rand_tree, full = FALSE)
## Calculating all the ancestral nodes
all_anc_centroids <- nodes.coordinates(dummy_matrix, rand_tree, full = TRUE)
## Calculating the distances from the direct ancestral nodes
ancestral.dist(dummy_matrix, nodes.coords = direct_anc_centroids)
## Calculating the distances from all the ancestral nodes
ancestral.dist(dummy_matrix, nodes.coords = all_anc_centroids)

## angles
## The angles in degrees of each axis
angles(dummy_matrix)
## The angles in slope from the 1:1 slope (Beta = 1)
angles(dummy_matrix, unit = "slope", base = 1)

## centroids
## Distances between each row and centroid of the matrix
centroids(dummy_matrix)
## Distances between each row and an arbitrary point
centroids(dummy_matrix, centroid = c(1,2,3,4,5,6,7,8,9,10))
## Distances between each row and the origin
centroids(dummy_matrix, centroid = 0)

## convhull.surface
## Making a matrix with more elements than dimensions (for convhull)
thinner_matrix <- matrix(rnorm(90), 18, 5)
## Convex hull hypersurface of a matrix
convhull.surface(thinner_matrix)

## convhull.volume
## Convex hull volume of a matrix
convhull.volume(thinner_matrix)

## deviations
## The deviations from the least square hyperplane
deviations(dummy_matrix)
## The deviations from the plane between the x and y axis
deviations(dummy_matrix, hyperplane = c(0,1,1,0,0,0,0,0,0,0,0))

## diagonal
## Matrix diagonal
diagonal(dummy_matrix) # WARNING: only valid if the dimensions are orthogonal

## displacements
## displacement ratios (from the centre)
displacements(dummy_matrix)
## displacement ratios (from an arbitrary point)
displacements(dummy_matrix, reference = c(1,2,3,4,5,6,7,8,9,10))
## displacement ratios from the centre (manhattan distance)
displacements(dummy_matrix, method = "manhattan")

## ellipse.volume
## Ellipsoid volume of a matrix
ellipse.volume(dummy_matrix)
## Calculating the same volume with provided eigen values
ordination <- prcomp(dummy_matrix)
## Calculating the ellipsoid volume
ellipse.volume(ordination$x, eigen.value = ordination$sdev^2)

## func.div
## Functional divergence
func.div(dummy_matrix)

## func.eve
## Functional evenness
func.eve(dummy_matrix) 
## Functional evenness (based on manhattan distances)
func.eve(dummy_matrix, method = "manhattan")

## group.dist
## The distance between groups
dummy_matrix2 <- matrix(runif(40, min = 2, max = 4), 4, 10)
## The minimum distance between both groups
group.dist(dummy_matrix, dummy_matrix2)
## The distance between both groups' centroids
group.dist(dummy_matrix, dummy_matrix2, probs = 0.5)
## The minimum distance between the 50% CI of each group
group.dist(dummy_matrix, dummy_matrix2, probs = c(0.25, 0.75))

## mode.val
## Modal value of a vector
mode.val(dummy_matrix)

## neighbours
## The nearest neighbour euclidean distances
neighbours(dummy_matrix)
## The furthest neighbour manhattan distances
neighbours(dummy_matrix, which = max, method = "manhattan")

## pairwise.dist
## The pairwise distance
pairwise.dist(dummy_matrix)
## The average squared pairwise distance
mean(pairwise.dist(dummy_matrix)^2)
## equal to:
geiger::disparity(data = dummy_matrix)

## point.dist
## The distances from the rows of dummy_matrix
## to the centroids of dummy_matrix2
point.dist(dummy_matrix, dummy_matrix2)
## The average distances from dummy_matrix
## to the centroids of dummy_matrix2
mean(point.dist(dummy_matrix, dummy_matrix2))
## The manhattan distance from the rows of dummy_matrix
## to the standard deviation of dummy_matrix2
point.dist(dummy_matrix, dummy_matrix2, point = sd, method = "manhattan")

## quantiles
## The 95% quantile range of each axis
quantiles(dummy_matrix)
## The 100% quantile ranges (which are equal to the ranges)
quantiles(dummy_matrix, quantile = 100) == ranges(dummy_matrix) # All TRUE

## radius
## The maximal radius of each axis (maximum distance from centre of each axis)
radius(dummy_matrix)

## ranges
## ranges of each column in a matrix
ranges(dummy_matrix)
## ranges of each column in the matrix corrected using the kth root
ranges(dummy_matrix, k.root = TRUE)

## span.tree.length
## Minimum spanning tree length (default)
span.tree.length(dummy_matrix)
## Minimum spanning tree length from a distance matrix (faster)
distance <- as.matrix(dist(dummy_matrix))
span.tree.length(distance)
## Minimum spanning tree length based on Manhattan distance
span.tree.length(dummy_matrix, method = "manhattan")
span.tree.length(as.matrix(dist(dummy_matrix, method = "manhattan"))) # Same

## variances
## variances of each column in the matrix
variances(dummy_matrix)
## variances of each column in the matrix corrected using the kth root
variances(dummy_matrix, k.root = TRUE)


# }