taxize (version 0.1.5)

get_seqs: Retrieve nucleotide sequences from NCBI.

Description

This function retrieves one sequence for each species, picking the longest available for the given gene.

Usage

get_seqs(taxon_name, gene, seqrange, getrelated,
    writetodf = TRUE, filetowriteto)

Arguments

taxon_name
Scientific name to search for (character).
gene
Gene (character) or genes (character vector) to search for.
seqrange
Sequence range, as e.g., "1:1000" (character).
getrelated
Logical. If TRUE, gets the longest sequences of a species in the same genus as the one searched for. If FALSE, gets nothing.
writetodf
Logical. If TRUE, write the resulting data.frame to a file on your machine.
filetowriteto
If writetodf=TRUE, then specify the file name. Default=T.

Value

  • Data.frame of results.

Details

Predicted sequences are removed automatically, so you don't have to filter them out yourself. Predicted sequences are those whose accession numbers have an "XM_" or "XR_" prefix.
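
The filtering and selection described above can be illustrated with base R. This is only a sketch, not the internals of get_seqs: the `hits` data.frame, its column names, and the accession numbers and lengths in it are made up for illustration. It drops records whose accession starts with "XM_" or "XR_" and then keeps the longest remaining record per species.

```r
# Hypothetical search results; accessions and lengths are invented for illustration.
hits <- data.frame(
  species   = c("Acipenser brevirostrum", "Acipenser brevirostrum",
                "Acipenser brevirostrum", "Halictus ligatus"),
  accession = c("XM_000001.1", "AB000002.1", "AB000003.1", "XR_000004.1"),
  length    = c(2500L, 1500L, 650L, 900L),
  stringsAsFactors = FALSE
)

# Flag predicted records: accession starts with "XM_" or "XR_".
is_predicted <- grepl("^(XM|XR)_", hits$accession)
kept <- hits[!is_predicted, ]

# Sort by species, longest first, then keep the first row per species.
kept <- kept[order(kept$species, -kept$length), ]
longest <- kept[!duplicated(kept$species), ]
longest
```

In this toy input the only "Halictus ligatus" record is predicted, so that species has no row left after filtering.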

Examples

library(taxize)
library(plyr)  # provides llply()

# A single species
get_seqs(taxon_name = "Acipenser brevirostrum", gene = c("coi", "co1"),
    seqrange = "1:3000", getrelated = TRUE, writetodf = FALSE)

# Many species; llply() from plyr can run serially or in parallel
species <- c("Colletes similis", "Halictus ligatus", "Perdita trisignata")
llply(species, get_seqs, gene = c("coi", "co1"), seqrange = "1:2000",
    getrelated = TRUE, writetodf = FALSE)
