ncbi_taxon_sample

Download representative sequences for a taxon

Downloads a sample of sequences meant to evenly capture the diversity of a given taxon. Can be used to get a shallow sampling of very large groups. CAUTION: This function can make MANY queries to GenBank, depending on the arguments given, and can take a very long time. Choose your arguments carefully to avoid long waits and needlessly stressing NCBI's servers. Use a downloaded database and extract_taxonomy when possible.

Keywords
internal
Usage
ncbi_taxon_sample(name = NULL, id = NULL, target_rank, min_counts = NULL, max_counts = NULL, interpolate_min = TRUE, interpolate_max = TRUE, min_length = 1, max_length = 10000, min_children = NULL, max_children = NULL, verbose = TRUE, ...)
Arguments
name
(character of length 1) The taxon to download a sample of sequences for.
id
(character of length 1) The taxon id to download a sample of sequences for.
target_rank
(character of length 1) The finest taxonomic rank at which to sample, i.e. the finest rank at which replication occurs. Must be a finer rank than that of the taxon given by name or id. Use get_taxonomy_levels to see available ranks.
min_counts
(named numeric) The minimum number of sequences to download for each taxonomic rank. The names correspond to taxonomic ranks.
max_counts
(named numeric) The maximum number of sequences to download for each taxonomic rank. The names correspond to taxonomic ranks.
interpolate_min
(logical) If TRUE, values supplied to min_counts and min_children will be used to infer the values of intermediate ranks not specified. Linear interpolation between the values of specified ranks will be used to determine the values of unspecified ranks.
interpolate_max
(logical) If TRUE, values supplied to max_counts and max_children will be used to infer the values of intermediate ranks not specified. Linear interpolation between the values of specified ranks will be used to determine the values of unspecified ranks.
min_length
(numeric of length 1) The minimum length of sequences that will be returned.
max_length
(numeric of length 1) The maximum length of sequences that will be returned.
min_children
(named numeric) The minimum number of sub-taxa that a taxon of a given rank must have for its sequences to be searched. The names correspond to taxonomic ranks.
max_children
(named numeric) The maximum number of sub-taxa that a taxon of a given rank may have for its sequences to be searched. The names correspond to taxonomic ranks.
verbose
(logical) If TRUE, progress messages will be printed.
...
Additional arguments are passed to ncbi_searcher.
Details

See get_taxonomy_levels for available taxonomic ranks.
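
The linear interpolation described for interpolate_min and interpolate_max can be illustrated with a minimal sketch in base R. This is not the package's internal code: the rank vector below is a hypothetical ordering chosen for illustration, and whether metacoder rounds the interpolated values or interpolates on rank position exactly this way is an assumption.

```r
# Hypothetical ranks ordered from coarse to fine (an assumption for this
# sketch; use get_taxonomy_levels() to see the ranks actually available).
ranks <- c("class", "order", "family", "genus", "species")

# Values supplied for only some ranks, as one might pass to max_counts.
max_counts <- c(class = 100, family = 20, species = 2)

# Interpolate linearly over rank position to fill in the unspecified ranks.
filled <- approx(x = match(names(max_counts), ranks),
                 y = unname(max_counts),
                 xout = seq_along(ranks))$y
names(filled) <- ranks

print(filled)
```

Here "order" falls halfway between "class" and "family", and "genus" halfway between "family" and "species", so each receives the midpoint of its neighbours' values.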

Aliases
  • ncbi_taxon_sample
Examples
## Not run: 
# ncbi_taxon_sample(name = "oomycetes", target_rank = "genus")
# data <- ncbi_taxon_sample(name = "fungi", target_rank = "phylum", 
#                           max_counts = c(phylum = 30), 
#                           entrez_query = "18S[All Fields] AND 28S[All Fields]",
#                           min_length = 600, max_length = 10000)
# ## End(Not run)

Documentation reproduced from package metacoder, version 0.1.2, License: GPL-2 | GPL-3
