treebase (version 0.1.0)

search_treebase: A function to pull in the phylogeny/phylogenies matching a search query

Description

A function to pull in the phylogeny/phylogenies matching a search query

Usage

search_treebase(input, by, returns = c("tree", "matrix"),
  exact_match = FALSE, max_trees = Inf, branch_lengths = FALSE,
  curl = getCurlHandle(), verbose = TRUE, pause1 = 2, pause2 = 1,
  attempts = 3, only_metadata = FALSE)

Arguments

input
a search query (character string)
by
the kind of search: author, taxon, subject, study, etc. (see the list of possible search terms under Details)
returns
whether the function should return the tree or the character matrix
exact_match
force exact matching for author name, taxon, etc.; otherwise partial matching is used
max_trees
upper bound on the number of trees returned; useful for keeping potentially large initial queries fast
branch_lengths
logical indicating whether to return only trees that have branch lengths
curl
the handle to the curl web utility for repeated calls, see the getCurlHandle() function in RCurl package for details.
verbose
logical indicating level of progress reporting
pause1
number of seconds to hesitate between requests
pause2
number of seconds to hesitate between individual files
attempts
number of attempts to access a particular resource
only_metadata
option to return only metadata about matching trees: study.id, tree.id, kind (gene, species, barcode), type (single, consensus), number of taxa, and possibly a quality score.
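
The attempts and pause1 arguments describe a retry-with-pause pattern. The sketch below illustrates that general pattern only; it is not treebase's implementation. The helper with_retries() and the toy function flaky() are hypothetical names introduced here for illustration.

```r
# Hypothetical helper (not part of treebase): call `fetch` up to `attempts`
# times, sleeping `pause` seconds after each failed attempt except the last.
with_retries <- function(fetch, attempts = 3, pause = 2) {
  for (i in seq_len(attempts)) {
    result <- tryCatch(fetch(), error = function(e) e)
    if (!inherits(result, "error")) return(result)
    if (i < attempts) Sys.sleep(pause)
  }
  stop("all ", attempts, " attempts failed: ", conditionMessage(result))
}

# Toy resource that fails twice, then succeeds.
calls <- 0
flaky <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("temporary failure")
  "ok"
}

with_retries(flaky, attempts = 3, pause = 0)  # returns "ok" on the third call
```

A longer pause is gentler on the remote server, at the cost of slower recovery from transient failures.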

Value

  • either a list of trees (multiPhylo) or a list of character matrices

Details

Choose the search type. Options are:
  • abstract: search terms in the publication abstract
  • author: match authors in the publication
  • subject: match subject
  • doi: the unique object identifier for the publication
  • ncbi: NCBI identifier number for the taxon
  • kind.tree: kind of tree (gene tree, species tree, barcode tree)
  • type.tree: type of tree (consensus or single)
  • ntax: number of taxa in the matrix
  • quality: a quality score for the tree, if it has been rated
  • study: match words in the title of the study or publication
  • taxon: taxon scientific name
  • id.study: TreeBASE study ID
  • id.tree: TreeBASE's unique tree identifier (Tr.id)
  • id.taxon: taxon identifier number from TreeBASE
  • tree: the title for the tree
  • type.matrix: type of matrix
  • matrix: name given to the matrix
  • id.matrix: TreeBASE's unique matrix identifier
  • nchar: number of characters in the matrix
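
Because by is a free-form string, a mistyped search type may only surface once a query has been sent. A minimal sketch of a local check against the search types listed above follows. The vector treebase_search_types and the helper check_search_type() are hypothetical and not part of treebase.

```r
# Search types listed in the Details section above.
treebase_search_types <- c(
  "abstract", "author", "subject", "doi", "ncbi", "kind.tree", "type.tree",
  "ntax", "quality", "study", "taxon", "id.study", "id.tree", "id.taxon",
  "tree", "type.matrix", "matrix", "id.matrix", "nchar"
)

# Hypothetical helper (not part of treebase): stop early on an unknown type.
check_search_type <- function(by) {
  unknown <- setdiff(by, treebase_search_types)
  if (length(unknown) > 0) {
    stop("Unknown search type(s): ", paste(unknown, collapse = ", "))
  }
  invisible(by)
}

check_search_type(c("author", "taxon"))  # passes silently
```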

Examples

## defaults to return phylogeny
Huelsenbeck <- search_treebase("Huelsenbeck", by="author")

## can ask for character matrices:
wingless <- search_treebase("2907", by="id.matrix", returns="matrix")

## Some nexus matrices don't meet read.nexus.data's strict requirements,
## these aren't returned
H_matrices <- search_treebase("Huelsenbeck", by="author", returns="matrix")

## Use Booleans in search: and, or, not
## Note that by must identify each entry type if a Boolean is given
HR_trees <- search_treebase("Ronquist or Huelsenbeck", by=c("author", "author"))

## We'll often use max_trees in the example so that they run quickly,
## notice the quotes for species.
dolphins <- search_treebase('"Delphinus"', by="taxon", max_trees=5)
## can do exact matches
humans <- search_treebase('"Homo sapiens"', by="taxon", exact_match=TRUE, max_trees=10)
## all trees with 5 taxa
five <- search_treebase(5, by="ntax", max_trees = 10)
## These are different: a tree ID isn't a study ID. We report both
studies <- search_treebase("2377", by="id.study")
tree <- search_treebase("2377", by="id.tree")
c("TreeID" = tree$Tr.id, "StudyID" = tree$S.id)
## Only results with branch lengths
## Has to grab all the trees first, then toss out ones without branch_lengths
Near <- search_treebase("Near", "author", branch_lengths=TRUE)
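
As noted in the Boolean example above, by must name a search type for each term in the query. A minimal sketch of building both pieces together follows. The helper boolean_query() is hypothetical and not part of treebase, and it covers only the binary operators "or" and "and".

```r
# Hypothetical helper (not part of treebase): join terms with a Boolean
# operator and repeat the search type so `by` has one entry per term.
boolean_query <- function(terms, by, op = c("or", "and")) {
  op <- match.arg(op)
  list(
    input = paste(terms, collapse = paste0(" ", op, " ")),
    by = rep(by, length.out = length(terms))
  )
}

q <- boolean_query(c("Ronquist", "Huelsenbeck"), by = "author")
q$input  # "Ronquist or Huelsenbeck"
q$by     # "author" "author"

## The result can then be passed on (requires network access):
## HR_trees <- search_treebase(q$input, by = q$by)
```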