annotate (version 1.48.0)

blastSequences: Run a BLAST query against NCBI for either a sequence string or an Entrez Gene ID and return a series of MultipleAlignment objects.

Description

This function sends a query to NCBI, given either a sequence string or an Entrez Gene ID, and returns a series of MultipleAlignment objects.

Usage

blastSequences(x, database, hitListSize, filter, expect, program, timeout=40, as=c("DNAMultipleAlignment", "data.frame", "XML"))

Arguments

x
A sequence as a character vector, or an integer corresponding to an Entrez Gene ID. Submit multiple sequences as a single FASTA-formatted string in a length-1 character vector, e.g. x = ">ID-1\nACATGCTA\n>ID-2\nAAACCACTT".
database
Which NCBI database to use. If not “blastn”, then set parse.result=FALSE
hitListSize
Number of hits to keep.
filter
Sequence filter; “L” for Low Complexity, “R” for Human Repeats, “m” for Mask lookup
expect
The BLAST ‘expect’ (E-value) threshold; matches with an E-value at or below this threshold will be returned.
program
Which BLAST program to use.
timeout
Approximate maximum length of time, in seconds, to wait for a result.
as
character(1) indicating whether the result from the NCBI server should be parsed to a list of DNAMultipleAlignment instances, represented as a data.frame, or returned as XML.
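The multi-sequence form of x described above is a single FASTA-formatted string. As a minimal sketch, assuming the sequences start out as a named character vector, the helper below builds that string. toFastaString is a hypothetical name introduced here for illustration and is not part of annotate.

```r
## Hypothetical helper (not part of annotate): collapse a named character
## vector of sequences into one FASTA-formatted, length-1 character vector.
toFastaString <- function(seqs) {
    stopifnot(is.character(seqs), !is.null(names(seqs)))
    ## Each record is ">name" on one line and the sequence on the next.
    paste0(">", names(seqs), "\n", seqs, collapse = "\n")
}

seqs <- c("ID-1" = "ACATGCTA", "ID-2" = "AAACCACTT")
x <- toFastaString(seqs)
cat(x, "\n")

if (interactive()) {
    ## Network call to NCBI; it may be slow or exceed the timeout.
    library(annotate)
    res <- blastSequences(x, timeout = 40, as = "data.frame")
}
```

Keeping the string construction separate from the query makes it possible to inspect exactly what will be submitted before contacting the server.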

Value

By default, a series of DNAMultipleAlignment objects (see MultipleAlignment-class). Alternatively, a data.frame or the XML document returned from the NCBI server. The data.frame is a ‘long form’ representation of the ‘Iteration’, ‘Hit’ and ‘Hsp’ results returned from the server. The XML document is the result of the xmlParse function of the XML package, and follows the format described by http://www.ncbi.nlm.nih.gov/dtd/NCBI_BlastOutput.dtd and http://www.ncbi.nlm.nih.gov/dtd/NCBI_BlastOutput.mod.dtd.
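When as="XML" is used, the returned document can be queried with XPath. The sketch below is illustrative only: it parses a small hand-written fragment that imitates the ‘Iteration’, ‘Hit’ and ‘Hsp’ nesting of the BlastOutput DTD rather than contacting NCBI, and the accessions and E-values in it are made up.

```r
library(XML)

## Hand-written stand-in for a server response; all values are invented.
txt <- '<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_accession>ACC0001</Hit_accession>
          <Hit_hsps>
            <Hsp><Hsp_evalue>0.001</Hsp_evalue></Hsp>
          </Hit_hsps>
        </Hit>
        <Hit>
          <Hit_accession>ACC0002</Hit_accession>
          <Hit_hsps>
            <Hsp><Hsp_evalue>0.5</Hsp_evalue></Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>'

doc <- xmlParse(txt, asText = TRUE)

## One accession per Hit element, one E-value per Hsp element.
accession <- xpathSApply(doc, "//Hit/Hit_accession", xmlValue)
evalue <- as.numeric(xpathSApply(doc, "//Hsp/Hsp_evalue", xmlValue))

print(data.frame(accession, evalue))
```

The same XPath expressions should apply to a real result, assuming the server response follows the DTD referenced above. A real Hit may contain several Hsp elements, so the two vectors are not guaranteed to have equal length.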

Details

Right now the function only works for "blastn".

The NCBI URL API used by this function is documented at http://www.ncbi.nlm.nih.gov/blast/Doc/urlapi.html.

Examples

## x can be an entrez gene ID
blastSequences(17702, timeout=40, as="data.frame")

if (interactive()) {

    ## or x can be a sequence
    blastSequences(x = "GGCCTTCATTTACCCAAAATG")

    ## hitListSize does not guarantee that you will get the number of
    ## matches you ask for; the server will only try to return that many.
    blastSequences(x = "GGCCTTCATTTACCCAAAATG", hitListSize="20")

}
