TreeDist (version 1.1.1)

median.multiPhylo: Median of a set of trees

Description

Calculate the single binary tree that represents the geometric median -- an 'average' -- of a forest of tree topologies.

Usage

# S3 method for multiPhylo
median(
  x,
  na.rm = FALSE,
  Distance = ClusteringInfoDistance,
  index = FALSE,
  breakTies = TRUE,
  ...
)

Arguments

x

Object of class multiPhylo containing phylogenetic trees.

na.rm, …

Unused; included for consistency with default function..

Distance

Function to calculate distances between each pair of trees in x.

index

Logical: if TRUE, return the index of the median tree(s); if FALSE, return the tree itself.

breakTies

Logical: if TRUE, return a single tree with the minimum score; if FALSE, return all tied trees.

Value

median() returns an object of class phylo corresponding to the geometric median of a set of trees: that is, the tree whose average distance from all other trees in the set is lowest. If multiple trees tie in their average distance, the first will be returned, unless breakTies = FALSE, in which case an object of class multiPhylo containing all such trees will be returned.

Details

The geometric median is the tree that exhibits the shortest average distance from each other tree topology in the set. It represents an 'average' of a set of trees, though note that an unsampled tree may be closer to the geometric 'centre of gravity' of the input set -- such a tree would not be considered.

The result will depend on the metric chosen to calculate distances between tree topologies. In the absence of a natural metric of tree topologies, the default choice is ClusteringInfoDistance() -- which discards branch length information. If specifying a different function, be sure that it returns a difference, rather than a similarity.

See Also

Consensus methods: ape::consensus(), TreeTools::ConsensusWithout()

Examples

Run this code
# NOT RUN {
library('TreeTools', quietly = TRUE, warn.conflicts = FALSE)
tenTrees <- as.phylo(1:10, nTip = 8)

# Default settings:
median(tenTrees)

# Robinson-Foulds distances include ties:
median(tenTrees, Distance = RobinsonFoulds, breakTies = FALSE)

# Be sure to use a distance function, rather than a similarity:
NyeDistance <- function (...) NyeSimilarity(..., similarity = FALSE)
median(tenTrees, Distance = NyeDistance)

# To analyse a list of trees that is not of class multiPhylo:
treeList <- lapply(1:10, as.phylo, nTip = 8)
class(treeList)
median(structure(treeList, class = 'multiPhylo'))
# }

Run the code above in your browser using DataCamp Workspace