Welcome to DateLife’s R package GitHub repository!

What is datelife?

datelife is an R package that lets researchers and the general public obtain open scientific data on the age of any organism they are interested in. It retrieves organism ages from a database of dated phylogenetic trees (also known as chronograms), Open Tree of Life’s tree store, in which every chronogram has been peer-reviewed and published as part of a scientific research article in an indexed journal. The organism ages retrieved by datelife therefore constitute peer-reviewed, public scientific knowledge that experts and non-experts alike can access and reuse.
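As a minimal sketch of what such a query can look like, the snippet below asks for a median summary chronogram for three bird species (the same species as in the package's threebirds_dr dataset). It assumes that datelife is installed and that datelife_search() accepts the input and summary_format arguments as used here; the call is wrapped in tryCatch() because the search may need data or network resources that are not available on every machine.

```r
# Taxon names to look up (three ratite birds).
taxa <- c("Rhea americana", "Pterocnemia pennata", "Struthio camelus")

if (requireNamespace("datelife", quietly = TRUE)) {
  # Assumption: "phylo_median" is an accepted summary_format value.
  median_chronogram <- tryCatch(
    datelife::datelife_search(input = taxa, summary_format = "phylo_median"),
    error = function(e) {
      message("datelife search failed: ", conditionMessage(e))
      NULL
    }
  )
  print(median_chronogram)
} else {
  message("datelife is not installed; see the installation instructions.")
}
```

If the search succeeds, median_chronogram holds a dated tree for the three taxa; otherwise it is NULL and a message explains why.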

How can you use datelife?

You can install the datelife R package on your own computer and use it locally. You can find instructions for a local installation below.

If you do not want to deal with installation and R code, or do not have the time, you can use DateLife’s interactive website application. Note that the website is not live at the moment; apologies.

To learn more, please go to datelife’s documentation website.

Local installation of the datelife R package

datelife’s most recent stable version can be installed with:

install.packages("datelife")

datelife’s previous stable versions are available for installation from the CRAN repository. For example, to install version 0.6.1, you can run:

devtools::install_version("datelife", version="0.6.1")

You can install datelife’s development version from its GitHub repository with:

devtools::install_github("phylotastic/datelife")
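Before reinstalling, it can be handy to check whether datelife is already available and which version is in use. The following is a small sketch using only base R and utils functions; the variable names are illustrative.

```r
# Check whether datelife is already installed and report its version.
pkg <- "datelife"
is_installed <- requireNamespace(pkg, quietly = TRUE)

if (is_installed) {
  message(pkg, " version ", as.character(utils::packageVersion(pkg)),
          " is installed.")
} else {
  message(pkg, " is not installed; run install.packages(\"", pkg,
          "\") to get it from CRAN.")
}
```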

Citing datelife

If you use datelife for a publication, please cite the R package and the accompanying paper. You can get these citations and the BibTeX entry with:

citation("datelife")
toBibtex(citation("datelife"))
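To reuse the entries in a reference manager, the BibTeX output can be written to a file. This is a sketch only: the file name datelife.bib is a hypothetical choice, and the code falls back to the citation for R itself when datelife is not installed, so that it still runs.

```r
# Use datelife's citation if the package is installed; otherwise fall back to
# the citation for base R so that the sketch still runs.
pkg <- if (requireNamespace("datelife", quietly = TRUE)) "datelife" else "base"
bib_entries <- toBibtex(citation(pkg))

# Hypothetical output file, written to a temporary directory.
bib_file <- file.path(tempdir(), "datelife.bib")
writeLines(bib_entries, bib_file)
message("BibTeX entries written to ", bib_file)
```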

Feedback and Information for Developers

We welcome and encourage you to post a GitHub issue with any comments, ideas, or questions about datelife’s software and website. If you want to contribute code directly, pull requests are welcome too.

Function documentation:

Package and function documentation was generated with roxygen2:

roxygen2::roxygenise()

Styling code:

We used the package lintr to check for coding style:

lintr::lint_package()

Calculating test coverage:

Code coverage was calculated with the package covr:

cov <- covr::package_coverage()

usethis::use_data(cov, overwrite = TRUE)

You can see an interactive report of testing coverage:

covr::report(cov)

And find code with zero coverage:

covr::zero_coverage(cov)

Generating datelife’s hexsticker:

The code used to generate datelife’s current logo hexsticker is in data-raw/hexsticker-current.R.

Rendering the vignettes:

Vignettes are rendered automatically upon build. However, if you wish to see how they look before releasing the package, you can knit them with knitr::knit(). The following command knits the vignette Getting_started_with_datelife. Note that knitr::knit() writes a Markdown file for an .Rmd input, so use rmarkdown::render() on the same file if you want HTML output:

knitr::knit("vignettes/Getting_started_with_datelife.Rmd")

To update “pre-rendered” vignettes, follow this blog. For example, to render the vignette about making trees with BOLD, do:

knitr::knit("vignettes/making_bold_trees.Rmd.orig", output = "vignettes/making_bold_trees.Rmd")

Creating a documentation website for the package

Using pkgdown for this is quite straightforward and fun:

usethis::use_pkgdown()
pkgdown::build_site()

Preparing a CRAN release

Updating GitHub actions R CMD check

Run the following function from the package usethis to update R CMD Check on GitHub:

usethis::use_github_action_check_standard()

This downloads the standard R CMD check workflow from r-lib action examples.

Local checks

To be able to release to CRAN, the first step is to pass the checks locally. To run a local check, you can use the command R CMD check from your terminal. For that, change directories to the one above your working clone of the datelife repo:

cd ../

Generate a tarball for your package by running R CMD build package-name:

R CMD build datelife

Finally, run R CMD check package-tarball on the tarball that you just generated:

R CMD check --as-cran datelife_0.6.0.tar.gz
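The tarball name in the command above depends on the package version: R CMD build names the file after the Package and Version fields of the DESCRIPTION file. A small sketch for deriving that name from R follows; tarball_name() is a hypothetical helper (not part of datelife), and the DESCRIPTION file used here is a made-up temporary one for demonstration.

```r
# Build the tarball file name that `R CMD build` produces for a package,
# based on the Package and Version fields of its DESCRIPTION file.
tarball_name <- function(description_file) {
  fields <- read.dcf(description_file, fields = c("Package", "Version"))
  paste0(fields[1, "Package"], "_", fields[1, "Version"], ".tar.gz")
}

# Demonstration with a temporary, made-up DESCRIPTION file.
demo_description <- tempfile("DESCRIPTION")
writeLines(c("Package: datelife", "Version: 0.6.0"), demo_description)
demo_tarball <- tarball_name(demo_description)
print(demo_tarball)  # "datelife_0.6.0.tar.gz"
```

In a real checkout, you would pass the path to the package's own DESCRIPTION file (for example "datelife/DESCRIPTION" from the parent directory) and hand the result to R CMD check.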

Remote checks

If you do not have access to different operating systems (OS) to test your package on, the rhub package allows remote testing on a variety of them with the command:

rhub::check_for_cran()

For more useful rhub workflows, check out its documentation. For example:

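# List the four most recent rhub checks submitted for this package
previous_checks <- rhub::list_package_checks(".",
                                             email = "sanchez.reyes.luna@gmail.com",
                                             howmany = 4)
# Retrieve and print the results of the most recent group of checks
group_check <- rhub::get_check(previous_checks$group[1])
group_check

# Run the CRAN checks remotely and summarize the results for submission
cran_prep <- rhub::check_for_cran()
cran_prep$cran_summary()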

To check the package on Windows, including URL validity, use:

devtools::check_win_release()
devtools::check_win_devel()

Releasing to CRAN

To submit to CRAN, call:

devtools::release()

and answer the prompted questions. If the answer to all of these is yes, the package will be submitted to CRAN :rocket:

License

This package is free and open source software, licensed under GPL (>= 2).

Acknowledgements

datelife has been developed as part of the phylotastic (NSF-funded) project, and is still under development.

Version

0.6.8

License

GPL (>= 2)

Maintainer

Luna L. Sanchez Reyes

Last Published

June 19th, 2023

Functions in datelife (0.6.8)

clean_taxon_info_children

Identify, extract and clean taxonomic children names from a taxonomy_taxon_info() output.
clean_ott_chronogram

Clean up some issues with Open Tree of Life chronograms. For now it 1) checks unmapped taxa and maps them with tnrs_match.phylo, and 2) roots the chronogram if unrooted.
contributor_cache

Information on contributors, authors, study ids and clades from studies with chronograms in Open Tree of Life (Open Tree)
date_with_pbdb

Date with Paleobiology Database and paleotree.
congruify_and_mrca_multiPhylo

Congruify nodes of a tree topology to nodes from a source chronogram, and find the mrca nodes
congruify_and_check

Congruify and Check.
cluster_patristicmatrix

Cluster a patristic matrix into a tree with various methods.
datelife_use_datelifequery

Generate one or multiple chronograms for a set of taxon names given as a datelifeQuery object.
.get_ott_lineage

Get the lineage of a set of taxa. .get_ott_lineage uses rotl::taxonomy_taxon_info() with include_lineage = TRUE.
datelife_result_MRCA

Get a numeric vector of MRCAs from a datelifeResult object. Used in summarize_datelife_result().
congruify_and_mrca_phylo

Congruify nodes of a tree topology to nodes from a source chronogram, and find the mrca nodes
datelife_result_variance_matrix

Compute a variance matrix of a datelifeResult object.
datelife_result_study_index

Find the index of relevant studies in a cached chronogram database.
get_biggest_multiphylo

Get the tree with the most tips from a multiPhylo object: the biggest tree.
get_bold_data

Get genetic data from the Barcode of Life Database (BOLD) for a set of taxon names.
extract_calibrations_dateliferesult

Use congruification to extract secondary calibrations from a datelifeResult object.
datelife_result_sdm_matrix

Go from a datelifeResult object to a Super Distance Matrix (SDM) using weighting = "flat"
datelife_result_sdm_phylo

Reconstruct a supertree from a datelifeResult object using the Super Distance Matrix (SDM) method.
datelife_result_median

Get a median summary chronogram from a datelifeResult object.
datelife_result_median_matrix

Compute a median matrix of a datelifeResult object.
extract_ott_ids

Extract numeric OTT ids from a character vector that combines taxon names and OTT ids.
datelife_use

Generate one or multiple chronograms for a set of given taxon names.
get_all_calibrations

Get secondary calibrations from a chronogram database for a set of given taxon names
datelife_search

Get scientific, peer-reviewed information on time of lineage divergence openly available for a given set of taxon names
force_ultrametric

Force a non-ultrametric phylo object to be ultrametric with phytools::force.ultrametric().
get_all_descendant_species

Quickly get all species belonging to a taxon from the Open Tree of Life Taxonomy (OTT)
felid_gdr_phylo_all

datelifeSummary of a datelifeResult object of all Felidae species.
felid_sdm

SDM tree of a datelifeResult object of all Felidae species.
extract_calibrations_phylo

Use congruification to extract secondary calibrations from a phylo or multiPhylo object with branch lengths proportional to time.
get_calibrations_vector

Search and extract secondary calibrations for a given character vector of taxon names
get_ott_lineage

Get the Open Tree of Life Taxonomic identifier (OTT id) and name of all lineages from one or more input taxa.
get_ott_clade

Get the Open Tree of Life Taxonomic identifiers (OTT ids) and name of one or several given taxonomic ranks from one or more input taxa.
get_calibrations_datelifequery

Search and extract available secondary calibrations for taxon names in a given datelifeQuery object
get_tnrs_names

Process a character vector of taxon names with TNRS
get_goodmatrices

Get indices of good matrices to apply Super Distance Matrix (SDM) method with make_sdm().
filter_for_grove

Filter a datelifeResult object to find the largest grove.
get_valid_children

Extract valid children from given taxonomic name(s) or Open Tree of Life Taxonomic identifiers (OTT ids) from a taxonomic source.
get_mrbayes_node_constraints

Makes a block of node constraints and node calibrations for a MrBayes run file from a list of taxa and ages, or from a dated tree
get_best_grove

Get grove from a datelifeResult object that can be converted to phylo from a median summary matrix
get_ott_children

Use this instead of rotl::tol_subtree() when taxa are not in synthesis tree and you still need to get all species or an induced OpenTree subtree
get_otol_synthetic_tree

Get an Open Tree of Life synthetic subtree of a set of given taxon names.
get_dated_otol_induced_subtree

Get a dated OpenTree induced synthetic subtree from a set of given taxon names, from blackrim's FePhyFoFum service.
get_datelife_result

Get a patristic matrix of time of lineage divergence data for a given set of taxon names
get_taxon_summary

Get a taxon summary of a datelifeResult object.
is_n_overlap

Function for computing n-overlap for two vectors of names (i.e., phy1$tip.label, phy2$tip.label) and seeing if they have n overlap
make_all_associations

Find all authors and where they have deposited their trees
get_subset_array_dispatch

Figure out which subset function to use.
get_opentree_chronograms

Get all chronograms from Open Tree of Life database
is_datelife_result_empty

Check if we obtained an empty search with the given taxon name(s).
match_all_calibrations

Match calibrations to nodes of a given tree
patristic_matrix_array_phylo_congruify

Congruify a patristic matrix array from a given phylo object.
map_nodes_ott

Add Open Tree of Life Taxonomy to tree nodes.
get_opentree_species

Get all species belonging to a taxon from the Open Tree of Life Taxonomy (OTT)
patristic_matrix_array_split

Split a patristic matrix array. Used inside: patristic_matrix_array_congruify
make_datelife_query2

Go from taxon names to a datelifeQuery object
make_mrbayes_runfile

Make a MrBayes run block file with a constraint topology and a set of node calibrations and missing taxa
make_treebase_associations

Associate TreeBase authors with studies
make_treebase_cache

Create a cache from TreeBase
make_mrbayes_tree

Take a constraint tree and use MrBayes to get node ages and branch lengths given a set of node calibrations without any data.
message_multiphylo

Message for a multiPhylo input
is_good_chronogram

Check if a tree is a valid chronogram.
make_otol_associations

Associate Open Tree of Life authors with studies
make_datelife_query

Go from taxon names to a datelifeQuery object
patristic_matrix_MRCA

Get time of MRCA from patristic matrix. Used in datelife_result_MRCA().
make_contributor_cache

Create a cache from Open Tree of Life
matrix_to_table

Go from a patristic distance matrix to a node ages table
matrices_to_table

Go from a list of patristic distance matrices to a table of node ages
patristic_matrix_array_subset

Subset a patristic matrix array
patristic_matrix_array_subset_both

Are all desired taxa in the patristic matrix array?
phylo_get_subset_array

Get a subset array from a phylo object
get_datelife_result_datelifequery

Get a list of patristic matrices from a given datelifeQuery object
patristic_matrix_unpad

Function to remove missing taxa from a datelifeResult object.
patristic_matrix_to_phylo

Convert a patristic matrix to a phylo object.
phylo_get_subset_array_congruify

Get a congruified subset array from a phylo object
missing_taxa_check

Checks that missing_taxa argument is ok to be used by make_mrbayes_runfile inside tree_add_dates functions.
patristic_matrix_array_congruify

patristic_matrix_array_congruify is used for patristic_matrix_array_subset_both and patristic_matrix_array_congruify.
phylo_tiplabel_underscore_to_space

Convert underscores to spaces in trees.
patristic_matrix_taxa_all_matching

Are all desired taxa in the patristic matrix?
run

Core function to generate results
patristic_matrix_to_newick

Convert patristic matrix to a newick string. Used inside: summarize_datelife_result.
summarize_congruifiedCalibrations

Get summary statistics of ages in a congruifiedCalibrations object.
run_mrbayes

Runs MrBayes from R
summarize_datelife_result

Summarize a datelifeResult object.
patristic_matrix_pad

Fill in empty cells in a patristic matrix for missing taxa.
patristic_matrix_name_reorder

Reorder a matrix so that row and column labels are in alphabetical order.
pick_grove

Pick a grove in the case of multiple groves in a set of trees.
plant_bold_otol_tree

Some plants chronogram
phylo_congruify

Congruify a reference tree and a target tree given as phylo objects.
phylo_check

Checks if phy is a phylo object and/or a chronogram.
make_bladj_tree

Use the BLADJ algorithm to get a chronogram from a tree topology for which you have age data for some of its nodes.
get_fossil_range

Get the ages for a taxon from PBDB
make_bold_otol_tree

Use genetic data from the Barcode of Life Database (BOLD) to reconstruct branch lengths on a tree.
is_datelife_query

Check if input is a datelifeQuery object
input_process

Process a phylo object or a character string to determine if it's correct newick
phylo_has_brlen

Check if a tree has branch lengths
subset2_taxa

Long list of >2.7k virus, bacteria, plant and animal taxon names
results_list_process

Take results_list and process it.
subset2_search

A list with datelifeQuery and datelifeResult objects from a search of taxon names from subset2_taxa
phylo_to_patristic_matrix

Get a patristic matrix from a phylo object.
relevant_curators_tabulate

Return the relevant curators for a set of studies.
tree_from_taxonomy

Gets a taxonomic tree from a vector of taxa
make_overlap_table

Create an overlap table
tree_get_node_data

Get node numbers, node names, descendant tip numbers and labels of nodes from any tree, and node ages from dated trees.
phylo_prune_missing_taxa

Prune missing taxa from a phylo object. Used inside phylo_get_subset_array and phylo_get_subset_array_congruify.
make_sdm

Make a Super Distance Matrix (SDM) from a list of good matrices obtained with get_goodmatrices()
opentree_chronograms

Chronogram database
summary.matchedCalibrations

Summarize a matchedCalibrations object. summary.matchedCalibrations gets the node age distribution from a matchedCalibrations object.
mrca_calibrations

Identify nodes of a tree topology that are most recent common ancestor (mrca) of taxon pairs from a calibrations object
summary.datelifeResult

Summarize a datelifeResult object.
patristic_matrix_list_to_array

Convert list of patristic matrices to a 3D array.
summary_matrix_to_phylo_all

Get minimum, median, mean, midpoint, and maximum summary chronograms from a summary matrix of a datelifeResult object.
summary_matrix_to_phylo

Go from a summary matrix to an ultrametric phylo object.
phylo_tiplabel_space_to_underscore

Convert spaces to underscores in trees.
phylo_subset_both

Subset a reference and a target tree given as phylo objects.
patristic_matrix_name_order_test

Test the name order of a patristic matrix so that row and column labels are in alphabetical order.
phylo_get_node_numbers

Gets node numbers from any phylogeny
phylo_generate_uncertainty

Generate uncertainty in branch lengths using a lognormal.
problems

Problematic chronograms from Open Tree of Life.
tree_check

Checks if a tree is a phylo class object; otherwise it uses input_process. Additionally, it can check if the tree is a chronogram with phylo_check.
tree_fix_brlen

Take a tree with branch lengths and fix negative or zero length branches.
use_calibrations_bladj.matchedCalibrations

Use calibrations to date a topology with the BLADJ algorithm.
tnrs_match

Taxon name resolution service (tnrs) applied to a vector of names by batches
tree_add_dates

Add missing taxa to a dated tree and fabricate node ages for these missing taxa.
some_ants_datelife_result

datelifeResult object of some ants
sample_trees

Sample trees from a file containing multiple trees, usually the output trees file of a Bayesian analysis.
update_all_cached

Update all data files as data objects for the package
threebirds_dr

datelifeResult object of three birds "Rhea americana", "Pterocnemia pennata", and "Struthio camelus"
use_calibrations

Date a given tree topology using a combined set of given calibrations
summary_patristic_matrix_array

Summarize patristic matrix array (by default, median). Used inside: summarize_datelife_result.
treebase_cache

Information on contributors, authors, study ids and clades from studies with chronograms in Open Tree of Life
use_calibrations_treePL

Date a tree with initial branch lengths with treePL.
use_calibrations_each

Date a given tree topology by using a given list of calibrations independently, to generate multiple hypotheses of time of divergence
use_calibrations_pathd8

Date a tree with secondary calibrations using PATHd8
tree_get_singleton_outgroup

Identify the presence of a single lineage outgroup in a phylogeny
tree_node_tips

Get tip numbers descending from any given node of a tree
recover_mrcaott

Get an mrcaott tag from an OpenTree induced synthetic tree and get its name and ott id
summarize_fossil_range

Summarize taxon age from PBDB to just a single min and max age
use_calibrations_bladj

Use calibrations to date a topology with the BLADJ algorithm.
summarize_summary_matrix

Gets all ages per taxon pair from a distance matrix. Internal function used in summary_matrix_to_phylo_all().
update_datelife_cache

Create an updated OpenTree chronograms database object
tree_add_outgroup

Function to add an outgroup to any phylogeny, in phylo or newick format
tree_add_nodelabels

Adds labels to nodes with no assigned label
use_all_calibrations

Date a given tree topology using a given set of congruified calibrations or ages
build_grove_matrix

Find the grove for a group of chronograms and build a matrix.
clean_tnrs

Eliminates unmatched (NAs) and invalid taxa from a rotl::tnrs_match_names() or tnrs_match() output. Useful to get ott ids to retrieve an induced synthetic Open Tree of Life. Needed because using include_suppressed = FALSE in rotl::tnrs_match_names() does not drop all invalid taxa.
check_ott_input

Check input for usage in other datelife functions
check_conflicting_calibrations

Check for conflicting calibrations.
birds_and_cats

A multiPhylo object with trees resulting from a datelife search of some birds and cats species
build_grove_list

Build grove list
choose_cluster

Choose an ultrametric phylo object from cluster_patristicmatrix() obtained with a particular clustering method, or the next best tree. If there are no ultrametric trees, it does not force them to be ultrametric.
classification_paths_from_taxonomy

Gets classification paths for a vector of taxa
datelife_authors_tabulate

Return the relevant authors for a set of studies.
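
Many of the functions listed above work on patristic matrices, which hold pairwise distances between tips of a tree. The base R sketch below illustrates the idea; it is not datelife's own implementation. It assumes ultrametric chronograms, where the patristic distance between two taxa is twice the age of their most recent common ancestor (MRCA), so the root age is half the largest distance. All ages are made-up numbers for illustration only.

```r
# Three taxa and hypothetical divergence times in million years (made up).
taxa <- c("Rhea americana", "Pterocnemia pennata", "Struthio camelus")
age_rheas <- 10  # hypothetical split between the two rheas
age_root <- 80   # hypothetical split between rheas and ostrich

# Patristic distance on an ultrametric tree = 2 * age of the pair's MRCA.
patristic <- matrix(0, nrow = 3, ncol = 3, dimnames = list(taxa, taxa))
patristic["Rhea americana", "Pterocnemia pennata"] <- 2 * age_rheas
patristic["Rhea americana", "Struthio camelus"] <- 2 * age_root
patristic["Pterocnemia pennata", "Struthio camelus"] <- 2 * age_root
patristic <- patristic + t(patristic)  # fill the lower triangle

# Age of the MRCA of all taxa: half the largest patristic distance.
mrca_age <- 0.5 * max(patristic, na.rm = TRUE)
print(mrca_age)  # 80

# A second hypothetical source chronogram with ages 1.5 times older,
# summarized together with the first by taking the cell-wise median.
patristic_other <- patristic * 1.5
both <- array(c(patristic, patristic_other), dim = c(3, 3, 2),
              dimnames = list(taxa, taxa, NULL))
median_matrix <- apply(both, c(1, 2), median)
print(0.5 * max(median_matrix))  # 100
```

With real data, each matrix would come from a different published chronogram, and the summary step is what turns several sources into a single set of ages.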