traits (version 0.4.0)

ncbi_byname: Retrieve gene sequences from NCBI by taxon name and gene names.

Description

Retrieve gene sequences from NCBI by taxon name and gene names.

Usage

ncbi_byname(taxa, gene = "COI", seqrange = "1:3000",
  getrelated = FALSE, verbose = TRUE, ...)

Arguments

taxa

(character) Scientific name to search for.

gene

(character) Gene or genes (in a vector) to search for. See examples.

seqrange

(character) Sequence range, as e.g., "1:1000". This is the range of sequence lengths to search for. So "1:1000" means search for sequences from 1 to 1000 characters in length.

getrelated

(logical) If TRUE, gets the longest sequences of a species in the same genus as the one searched for. If FALSE, returns nothing if no match found.

verbose

(logical) If TRUE (default), informative messages printed.

...

Curl options passed on to GET

Value

Data.frame of results.

Details

Removes predicted sequences so you don't have to remove them. Predicted sequences are those with accession numbers that have "XM_" or "XR_" prefixes. This function retrieves one sequences for each species, picking the longest available for the given gene.

See Also

ncbi_searcher, ncbi_byid

Examples

Run this code
# NOT RUN {
# A single species
ncbi_byname(taxa="Acipenser brevirostrum")

# Many species
species <- c("Colletes similis","Halictus ligatus","Perdita californica")
ncbi_byname(taxa=species, gene = c("coi", "co1"), seqrange = "1:2000")
# }

Run the code above in your browser using DataCamp Workspace