taxize (version 0.4.0)

ncbi_getbyname: Retrieve gene sequences from NCBI by taxon name and gene names.

Description

Retrieve gene sequences from NCBI by taxon name and gene names.

Usage

ncbi_getbyname(taxa, gene = "COI", seqrange = "1:3000",
  getrelated = FALSE, verbose = TRUE)

Arguments

taxa
(character) Scientific name to search for.
seqrange
(character) Range of sequence lengths to search for, given as a string such as "1:1000". For example, "1:1000" searches for sequences from 1 to 1000 characters in length.
getrelated
(logical) If TRUE, gets the longest sequence of a species in the same genus as the one searched for when no match is found. If FALSE, returns nothing if no match is found.
verbose
(logical) If TRUE (default), informative messages are printed.
gene
(character) Gene or genes (in a vector) to search for. See examples.
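
To illustrate how a seqrange string such as "1:1000" can be read as a pair of length bounds, here is a minimal base-R sketch. The helper parse_seqrange is hypothetical and not part of taxize, the sequence lengths are made up, and the package may handle the string differently internally.

```r
# Hypothetical helper (not part of taxize): split "min:max" into integer bounds.
parse_seqrange <- function(seqrange) {
  parts <- strsplit(seqrange, ":", fixed = TRUE)[[1]]
  stopifnot(length(parts) == 2)
  bounds <- as.integer(parts)
  stopifnot(!anyNA(bounds), bounds[1] <= bounds[2])
  c(min = bounds[1], max = bounds[2])
}

bounds <- parse_seqrange("1:1000")

# Made-up sequence lengths, used only to show which ones fall inside the range.
seq_lengths <- c(650L, 1200L, 1000L)
in_range <- seq_lengths >= bounds[["min"]] & seq_lengths <= bounds[["max"]]
```

Both ends of the range are treated as inclusive here, which matches the "from 1 to 1000" wording above but is an assumption about the actual search behavior.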

Value

  • A data.frame of results.

Details

Predicted sequences are removed automatically, so you don't have to filter them out yourself. Predicted sequences are those whose accession numbers have an "XM_" or "XR_" prefix. This function retrieves one sequence for each species, picking the longest available for the given gene.
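
The filtering and selection described above can be sketched in base R on a small made-up table. This is an illustration of the idea only, not the code taxize runs: the records data.frame, its column names, and the accession numbers are all hypothetical.

```r
# Made-up records for illustration; these are not real NCBI accessions.
records <- data.frame(
  taxon = c("Species A", "Species A", "Species A", "Species B", "Species B"),
  accession = c("XM_000001", "AB000001", "AB000002", "XR_000002", "CD000003"),
  length = c(2500L, 650L, 1200L, 900L, 700L),
  stringsAsFactors = FALSE
)

# Drop predicted sequences: accessions starting with "XM_" or "XR_".
is_predicted <- grepl("^X[MR]_", records$accession)
kept <- records[!is_predicted, ]

# For each taxon, keep the row with the longest remaining sequence.
longest <- do.call(rbind, lapply(split(kept, kept$taxon), function(d) {
  d[which.max(d$length), ]
}))
rownames(longest) <- NULL
```

Note that the longest record for "Species A" overall is a predicted one, so it is dropped before the longest remaining sequence is chosen.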

See Also

ncbi_search, ncbi_getbyid

Examples

# A single species
ncbi_getbyname(taxa="Acipenser brevirostrum")

# Many species
species <- c("Colletes similis","Halictus ligatus","Perdita trisignata")
ncbi_getbyname(taxa=species, gene = c("coi", "co1"), seqrange = "1:2000")