NNIDist: Approximate Nearest Neighbour Interchange distance

Description

Use the approach of Li et al. (1996) to approximate the Nearest Neighbour Interchange distance (Robinson, 1971) between phylogenetic trees.

Usage

NNIDist(tree1, tree2 = tree1)
NNIDiameter(tree)

Arguments

tree1, tree2

Single trees of class phylo to undergo comparison.

tree

Object of supported class representing a tree or list of trees, or an integer specifying the number of leaves in a tree/trees.

Value

NNIDist() returns, for each pair of trees, a named vector containing three integers:

lower is a lower bound on the NNI distance, and corresponds to the RF distance between the trees.
tight_upper is an upper bound on the distance, based on calculated maximum diameters for trees with < 13 leaves. NA is returned if trees are too different to employ this approach.
loose_upper is a looser upper bound on the distance, using n log n + O(n).

NNIDiameter() returns a matrix specifying (bounds on) the diameter of the NNI distance metric on the specified tree(s). Columns correspond to:

liMin: $$n - 3$$, a lower bound on the diameter (Li et al. 1996);
fackMin: Lower bound on diameter following Fack et al. (2002), i.e. $$\log_2(N!) / 4$$;
min: The larger of liMin and fackMin;
exact: The exact value of the diameter, where n < 13;
liMax: Upper bound on diameter following Li et al. (1996), i.e. $$n \log_2 n + \textrm{O}(n)$$;
fackMax: Upper bound on diameter following Fack et al. (2002), i.e. $$(N - 2) \lceil \log_2 n \rceil + N$$;
max: The smaller of liMax and fackMax;

where n is the number of leaves, and N the number of internal nodes, i.e. $$n - 2$$.
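As a rough illustration of how the two lower bounds compare, the following base R sketch evaluates the liMin and fackMin formulae as written above for trees with 4 to 12 leaves, and sets them beside the exact diameters tabulated under Details. It does not call NNIDiameter(); the variable names are chosen for this example only, and the package's own output may round or handle edge cases differently.

```r
# Number of leaves; restricted to 4..12, where exact diameters are tabulated
n <- 4:12
N <- n - 2                              # internal nodes, as defined above

liMin <- n - 3                          # lower bound of Li et al. (1996)
fackMin <- lfactorial(N) / log(2) / 4   # log2(N!) / 4, after Fack et al. (2002)
lower <- pmax(liMin, fackMin)           # the larger of the two lower bounds

# Exact diameters for 4..12 leaves, copied from the table under Details
exact <- c(1, 3, 5, 7, 10, 12, 15, 18, 21)

bounds <- data.frame(n, liMin, fackMin = round(fackMin, 2),
                     min = round(lower, 2), exact)
print(bounds)
```

For these small trees liMin is the larger of the two lower bounds, and it never exceeds the exact diameter.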

Details

In brief, this approximation algorithm works by identifying edges in one tree that do not match edges in the second. Each of these edges must undergo at least one NNI operation in order to reconcile the trees. Edges that match in both trees need never undergo an NNI operation, and divide each tree into smaller regions. By 'cutting' matched edges into two, a tree can be divided into a number of regions that solely comprise unmatched edges.
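The matching step can be sketched in base R by writing each internal edge as the split of leaves that it induces, then checking which splits of the first tree are absent from the second. This is a simplified illustration under stated assumptions, not the package's implementation: the helper Canonical() and the hand-written split lists are hypothetical, and each tree is described only by the leaf sets on one side of its internal edges.

```r
leaves <- c("a", "b", "c", "d", "e")

# Hypothetical helper: label a split by the side that excludes the first leaf,
# so that the same edge always receives the same label
Canonical <- function(side, leaves) {
  if (leaves[1] %in% side) side <- setdiff(leaves, side)
  paste(sort(side), collapse = "")
}

# Internal edges of two five-leaf trees, each given as one side of its split
tree1 <- list(c("a", "b"), c("d", "e"))   # ((a, b), c, (d, e))
tree2 <- list(c("a", "c"), c("d", "e"))   # ((a, c), b, (d, e))

splits1 <- vapply(tree1, Canonical, character(1), leaves = leaves)
splits2 <- vapply(tree2, Canonical, character(1), leaves = leaves)

matched <- intersect(splits1, splits2)    # edges that never need an NNI
unmatched <- setdiff(splits1, splits2)    # edges needing at least one NNI

length(unmatched)
```

Here one edge of the first tree is unmatched, so at least one NNI operation is required; the matched edge separating d and e from the other leaves could be 'cut' to isolate the region that contains the unmatched edge.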

These regions can be viewed as separate trees that need to be reconciled. One way to reconcile them is to conduct a series of NNI operations that reduce each tree to a pectinate (caterpillar) tree, then to conduct an analogue of the mergesort algorithm. This takes at most n log n + O(n) NNI operations, and provides a loose upper bound on the NNI distance. The maximum number of moves for an n-leaf tree (OEIS A182136) can be calculated exactly for small trees (Fack et al. 2002); this provides a tighter upper bound, but is unavailable for n > 12. NNIDiameter() reports the available bounds on this maximum.

Leaves:	1	2	3	4	5	6	7	8	9	10	11	12	13
Diameter:	0	0	0	1	3	5	7	10	12	15	18	21	?
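One way to picture how the tabulated diameters could yield a tighter bound is to sum the exact diameter of each unmatched region, giving up when any region is too large for the table. The sketch below is an assumption-laden illustration in base R: TightUpper() is a hypothetical helper, the region sizes are supplied by hand, and NNIDist() may combine regions differently.

```r
# Exact NNI diameters for trees with 1..12 leaves, from the table above
exactDiameter <- c(0, 0, 0, 1, 3, 5, 7, 10, 12, 15, 18, 21)

# Hypothetical helper: sum the exact diameters of the given regions, or
# return NA if any region has more leaves than the table covers
TightUpper <- function(regionLeaves) {
  if (any(regionLeaves > length(exactDiameter))) {
    return(NA_real_)
  }
  sum(exactDiameter[regionLeaves])
}

TightUpper(c(4, 5))    # two small regions: 1 + 3
TightUpper(c(4, 13))   # one region exceeds 12 leaves, so no tight bound
```

The NA in the second call mirrors the documented behaviour of tight_upper when trees are too different for the tabulated diameters to apply.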

References

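Fack V, Lievens S, Van der Jeugt J (2002). "On the diameter of the rotation graph of binary coupling trees." Discrete Mathematics, 245(1-3), 1-18.

Li M, Tromp J, Zhang L (1996). "Some notes on the nearest neighbour interchange distance." In Computing and Combinatorics, Lecture Notes in Computer Science, vol. 1090. Springer.

Robinson DF (1971). "Comparison of labeled trees with valency three." Journal of Combinatorial Theory, Series B, 11(2), 105-119.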

Examples

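library('TreeTools', quietly = TRUE, warn.conflicts = FALSE)
library('TreeDist') # provides NNIDist() and CompareAll()

# Compare two single trees
NNIDist(BalancedTree(7), PectinateTree(7))

# Compare a single tree with each tree in a list
NNIDist(BalancedTree(7), as.phylo(0:2, 7))
NNIDist(as.phylo(0:2, 7), PectinateTree(7))

# Compare each tree in one list with each tree in another
NNIDist(list(bal = BalancedTree(7), pec = PectinateTree(7)),
        as.phylo(0:2, 7))

# Compare every pair of trees in a single list
CompareAll(as.phylo(30:33, 8), NNIDist)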