Treestats

Measuring properties of phylogenetic trees

The treestats R package provides fast, C++-based functions for calculating summary statistics on phylogenies. Some functions (but not all, see below) require the phylogeny to be ultrametric and/or binary.

Getting started

Installation

To get started, you can either install from CRAN or use the latest version from GitHub:

install.packages("treestats") # install from CRAN

# use the devtools package to install latest version from GitHub:
install.packages("devtools")
devtools::install_github("thijsjanzen/treestats")

Basic usage

Given a tree (here a simulated one), you can either compute an individual statistic or calculate all currently implemented statistics at once:

focal_tree   <- ape::rphylo(n = 10, birth = 1, death = 0)
colless_stat <- treestats::colless(focal_tree)
all_stats    <- treestats::calc_all_stats(focal_tree)
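The same pattern extends to a set of trees. The sketch below assumes that the ape and treestats packages are installed; the seed, the number of trees and the variable names are illustrative choices, not part of the package.

```r
# Illustrative sketch: collect summary statistics across several simulated trees.
set.seed(42)  # arbitrary seed, used only for reproducibility

# Simulate five pure-birth trees with 10 tips each.
trees <- lapply(1:5, function(i) {
  ape::rphylo(n = 10, birth = 1, death = 0)
})

# Apply one statistic at a time; each call returns a single number per tree.
colless_values <- vapply(trees, treestats::colless, numeric(1))
sackin_values  <- vapply(trees, treestats::sackin, numeric(1))

# One row per tree, one column per statistic.
stats_table <- data.frame(colless = colless_values, sackin = sackin_values)
print(stats_table)
```

Using vapply rather than sapply makes the expectation of one numeric value per tree explicit, so an unexpected return type fails early.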

List of statistics

The following summary statistics are included:
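The names of the available statistics can also be queried from within R. A minimal sketch, assuming the package is installed and that list_statistics() (described in the function index as providing a list of all available statistics) is called without arguments:

```r
# Illustrative sketch: query the statistics that the installed version provides.
stat_names <- treestats::list_statistics()

# Inspect the number of entries and the first few names.
length(stat_names)
head(stat_names)
```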

Rcpp

For all of these statistics, the package provides Rcpp versions that are considerably faster than their R counterparts. In addition, the package includes improved versions of the following functions:

  • ape::branching.times
  • DDD::phylo2L
  • DDD::L2phylo
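As a rough check that a faster replacement agrees with the function it replaces, the sketch below compares treestats::branching_times with ape::branching.times on a simulated tree. The seed and tree size are arbitrary, and the values are sorted before comparison because identical ordering of the two results is not assumed here.

```r
# Illustrative sketch: compare the treestats and ape branching-time functions.
set.seed(1)  # arbitrary seed, used only for reproducibility
focal_tree <- ape::rphylo(n = 20, birth = 1, death = 0)

brts_ape       <- ape::branching.times(focal_tree)
brts_treestats <- treestats::branching_times(focal_tree)

# Strip names and sort, so that only the numeric values are compared.
brts_match <- all.equal(sort(as.numeric(brts_ape)),
                        sort(as.numeric(brts_treestats)))
print(brts_match)
```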

C++ Library

For the Rcpp-improved summary statistics, R-independent C++ code is provided in the inst/include folder. Statistics that rely on the calculation of eigenvalues are excluded, as these rely on the Rcpp independent Eigen code. The C++ code can be linked independently by adding the treestats package to both the LinkingTo and Depends fields of your package's DESCRIPTION file. You can then call these functions from your own package.

Please note that two versions of every function are available:

  • one based on input of a phylo object, which is typically passed as one 2-column matrix containing all edges and a vector containing the edge lengths (depending on which information is required to calculate the statistic);
  • one based on input of an Ltable (lineage table), which is a 4-column matrix containing, for each species, 1) birth time, 2) parent species, 3) species label and 4) death time (or -1 if extant).

Ltable input can be useful when summary statistics are required for more complicated simulation models.
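To make the Ltable layout concrete, the sketch below converts a simulated tree with phylo_to_l and inspects the result from R. It assumes that the R-level colless function accepts an Ltable as well as a phylo object; the text above states this for the C++ versions, so treat the last two calls as an assumption to verify against the package documentation.

```r
# Illustrative sketch: inspect an Ltable and use it as input for a statistic.
set.seed(7)  # arbitrary seed, used only for reproducibility
focal_tree <- ape::rphylo(n = 10, birth = 1, death = 0)

# Convert the phylo object to an Ltable: one row per species, four columns
# (birth time, parent species, species label, death time).
ltab <- treestats::phylo_to_l(focal_tree)
print(dim(ltab))

# Without extinction, every species is extant, so column 4 should be -1.
print(ltab[, 4])

# Assumption: the R wrapper accepts an Ltable in place of a phylo object.
colless_phylo  <- treestats::colless(focal_tree)
colless_ltable <- treestats::colless(ltab)
```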

Monthly Downloads: 251
Version: 1.70.5
License: GPL-3

Maintainer: Thijs Janzen
Last Published: August 26th, 2024

Functions in treestats (1.70.5)

  • b1: B1 metric
  • avg_vert_depth: Average vertex depth metric
  • colless_corr: Corrected Colless index of (im)balance
  • colless: Colless index of (im)balance
  • gamma_statistic: Gamma statistic
  • imbalance_steps: Imbalance steps index
  • create_fully_unbalanced_tree: Create an unbalanced tree (caterpillar tree)
  • crown_age: Crown age of a tree
  • eigen_centrality: Eigenvector centrality
  • entropy_j: Intensive quadratic entropy statistic J
  • max_width: Maximum width of branch depths
  • j_one: J^1 index
  • l_to_phylo: Convert an L table to a phylo object
  • max_ladder: Maximum ladder index
  • max_betweenness: Maximum betweenness centrality
  • ew_colless: Equal weights Colless index of (im)balance
  • colless_quad: Quadratic Colless index of (im)balance
  • nLTT: Normalized LTT statistic
  • max_closeness: Maximum closeness
  • mean_branch_length: Mean branch length of a tree, including extinct branches
  • mw_over_md: Maximum width of branch depths divided by the maximum depth
  • create_fully_balanced_tree: Create a fully balanced tree
  • mean_branch_length_ext: Mean length of the external branches of a tree, i.e. of branches leading to a tip
  • diameter: Diameter statistic
  • minmax_adj: Adjacency matrix properties
  • make_unbalanced_tree: Stepwise increase the imbalance of a tree
  • mean_pair_dist: Mean pairwise distance
  • ltable_to_newick: Convert an L table to a newick string
  • four_prong: Four prong index
  • double_cherries: Double Cherry index
  • pitchforks: Number of pitchforks
  • pigot_rho: Pigot's rho
  • stairs: Stairs index
  • stairs2: Stairs2 index
  • rebase_ltable: Modify an ltable such that the longest path in the phylogeny is a crown lineage
  • laplacian_spectrum: Laplacian spectrum statistics, from RPANDA
  • max_del_width: Maximum difference of widths of a phylogenetic tree
  • psv: Phylogenetic Species Variability
  • var_branch_length: Variance of branch lengths of a tree, including extinct branches
  • treestats-package: Collection of phylogenetic tree statistics
  • var_branch_length_ext: Variance of the external branch lengths of a tree, i.e. of branches leading to a tip
  • mean_branch_length_int: Mean length of the internal branches of a tree, i.e. of branches not leading to a tip
  • var_branch_length_int: Variance of the internal branch lengths of a tree, i.e. of branches not leading to a tip
  • sackin: Sackin index of (im)balance
  • rquartet: Rquartet index
  • list_statistics: Provides a list of all available statistics in the package
  • mean_i: Mean I statistic
  • root_imbalance: Root imbalance
  • rogers: Rogers J index of (im)balance
  • sym_nodes: Symmetry nodes metric
  • tot_coph: Total cophenetic index
  • max_depth: Maximum depth metric
  • wiener: Wiener index
  • nLTT_base: Reference nLTT statistic
  • number_of_lineages: Number of tips of a tree, including extinct tips
  • tree_height: Height of a tree
  • treeness: Treeness statistic
  • var_pair_dist: Variance of all pairwise distances
  • var_leaf_depth: Variance of leaf depth statistic
  • minmax_laplace: Laplacian matrix properties
  • mntd: Mean Nearest Taxon distance
  • phylo_to_l: Generate an ltable from a phylo object
  • phylogenetic_diversity: Phylogenetic diversity at time point t
  • tot_internal_path: Total internal path length
  • tot_path_length: Total path length
  • average_leaf_depth: Average leaf depth statistic, a version of the Sackin index normalized by the number of tips
  • blum: Blum index of (im)balance
  • ILnumber: ILnumber
  • branching_times: Branching times of a tree
  • avg_ladder: Average ladder index
  • beta_statistic: Aldous' beta statistic
  • b2: B2 metric
  • area_per_pair: Area per pair index
  • calc_all_stats: Apply all available tree statistics to a single tree
  • calc_topology_stats: Calculate all topology-based statistics for a single tree
  • calc_brts_stats: Apply all tree statistics related to branching times to a single tree
  • cherries: Cherry index
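Several of the functions above can be combined to see how a balance index responds to tree shape. The sketch below assumes that create_fully_balanced_tree and create_fully_unbalanced_tree each take a phylo object as their only required argument; the seed and tree size are arbitrary.

```r
# Illustrative sketch: compare the Colless index across tree shapes.
set.seed(3)  # arbitrary seed, used only for reproducibility
focal_tree <- ape::rphylo(n = 8, birth = 1, death = 0)

# Assumption: both helpers accept a phylo object and return a phylo object.
balanced_tree   <- treestats::create_fully_balanced_tree(focal_tree)
unbalanced_tree <- treestats::create_fully_unbalanced_tree(focal_tree)

colless_values <- c(
  balanced   = treestats::colless(balanced_tree),
  original   = treestats::colless(focal_tree),
  unbalanced = treestats::colless(unbalanced_tree)
)
print(colless_values)
```

The Colless index sums, over all internal nodes, the absolute difference in the number of tips between the two daughter clades, so the caterpillar tree is expected to give the largest value and the balanced tree the smallest.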