Assign taxonomy to every line in a BLAST output using the information provided in the name of the subject sequences (stitle)
assign.whole.taxo(BLAST)
A data.frame containing all the information provided in the input data.frame plus seven additional columns with the kingdom, phylum, class, order, family, genus, and species names for each sequence.
data.frame containing the output of a BLAST analysis. The first column must hold the names of the subject sequences (the sequences matching the queries), and these names must include the taxonomy of the subject sequences. See details.
A. J. Muñoz-Pajares
The input data.frame must contain taxonomic information in the first column. Additional information is accepted if separated by "|", but the taxonomy must be the last field. Taxonomic information must be provided for kingdom, phylum, class, order, family, genus, and species, each separated by ";" and identified by a letter prefix as follows:
optionalTEXT | optionalTEXT | k__kingdomName;p__phylumName;c__className;o__orderName;f__familyName;g__genusName;s__speciesName
This is the typical format of sequence names in several databases. Thus a BLAST output using any of these databases will automatically produce the desired format.
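To make the expected format concrete, the following base R sketch splits a single subject title into the seven ranks described above. It is only an illustration of the naming convention, not the code used by assign.whole.taxo. The helper name parse_stitle_taxonomy and the example title are hypothetical, and the sketch assumes a non-empty title whose last "|"-separated field holds the taxonomy.

```r
# Hypothetical helper (not part of the package): split one subject title
# into the seven taxonomic ranks identified by their letter prefixes.
parse_stitle_taxonomy <- function(stitle) {
  ranks <- c(k = "kingdom", p = "phylum", c = "class", o = "order",
             f = "family", g = "genus", s = "species")
  # The taxonomy is assumed to be the last "|"-separated field.
  fields <- strsplit(stitle, "|", fixed = TRUE)[[1]]
  taxonomy <- trimws(fields[length(fields)])
  # Each rank is a ";"-separated entry of the form "<letter>__<name>".
  entries <- trimws(strsplit(taxonomy, ";", fixed = TRUE)[[1]])
  # Ranks missing from the title stay NA.
  out <- setNames(rep(NA_character_, length(ranks)), ranks)
  for (entry in entries) {
    prefix <- substr(entry, 1, 1)
    if (prefix %in% names(ranks) && substr(entry, 2, 3) == "__") {
      out[[ranks[[prefix]]]] <- substring(entry, 4)
    }
  }
  out
}

# Made-up subject title that follows the format shown above.
stitle <- paste0("optionalTEXT | optionalTEXT | k__Fungi;p__Ascomycota;",
                 "c__Sordariomycetes;o__Hypocreales;f__Nectriaceae;",
                 "g__Fusarium;s__Fusarium_oxysporum")
taxo <- parse_stitle_taxonomy(stitle)
print(taxo)
```

The result is a named character vector with one element per rank, so taxo[["genus"]] returns the genus name. Applying the helper to every value in the first column of a BLAST data.frame would yield the kind of seven-column table that the function description mentions.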
filter.whole.taxo and get.majority.taxo
# data(ex_BLAST)
# TAXO <- assign.whole.taxo(ex_BLAST)