Downloads orthologous sequences and carries out their alignment
custom.aln(target, species, molecule = 'protein', sfile = FALSE)
target: the KEGG identifier of the protein of interest.
species: a character vector containing the KEGG codes for the species of interest.
molecule: a character string specifying the nature of the sequence; it must be one of 'dna', 'protein'.
sfile: logical; if TRUE, the alignment is saved in fasta format in the current directory.
Returns a list of class "fasta" with three components: 'ali' (an alignment character matrix with a row per sequence and a column per residue), 'id' (sequence identifiers) and 'call' (the matched call).
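To make the shape of the returned value concrete, the sketch below builds a small stand-in object by hand and reads its components. The sequences and identifiers are invented for illustration and are not output from custom.aln(); only the component names 'ali', 'id' and 'call' come from the description above.

```r
# Hand-made stand-in for the value returned by custom.aln().
# The residues and identifiers are invented for illustration only.
ids <- c("seq1", "seq2", "seq3")
ali <- rbind(
  c("M", "K", "A", "L"),
  c("M", "K", "-", "L"),
  c("M", "R", "A", "L")
)
rownames(ali) <- ids

aln <- structure(
  list(ali = ali, id = ids, call = quote(custom.aln("target", "species"))),
  class = "fasta"
)

# One row per sequence, one column per alignment position.
aln_dim <- dim(aln$ali)

# Fraction of gap characters ("-") in each alignment column.
gap_fraction <- colMeans(aln$ali == "-")

# Residues observed at the second alignment position.
col2 <- unname(aln$ali[, 2])
```

With a real alignment, the same indexing applies: aln$ali[i, ] is the i-th aligned sequence and aln$ali[, j] is the j-th alignment column.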
We can build the list of species ourselves or, alternatively, choose one of four pre-established options: 'vertebrates', 'plants', 'one-hundred' and 'two-hundred'. The first uses the following eight species: human (hsa), chimp (ptr), gorilla (ggo), rat (rno), cow (bta), chicken (gga), western clawed frog (xtr) and zebrafish (dre). The second uses A. thaliana (ara), A. lyrata (aly), B. oleracea (boe), G. max (gmax), S. lycopersicum (sly), O. sativa (osa) and C. reinhardtii (cre). The third and fourth options use orthologous sequences from one hundred and two hundred different species, respectively.
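The two ways of specifying species can be sketched as follows. The species codes are transcribed from the description above; the 'presets' list and the 'resolve_species' helper are hypothetical names introduced for illustration and are not part of the package, which may resolve the keywords differently internally.

```r
# Species codes for the two small presets, copied from the text above.
presets <- list(
  vertebrates = c("hsa", "ptr", "ggo", "rno", "bta", "gga", "xtr", "dre"),
  plants      = c("ara", "aly", "boe", "gmax", "sly", "osa", "cre")
)

# Hypothetical helper: expand a preset keyword into its species codes,
# or return a user-supplied vector of codes unchanged.
resolve_species <- function(species) {
  if (length(species) == 1 && species %in% names(presets)) {
    presets[[species]]
  } else {
    species
  }
}

vert   <- resolve_species("vertebrates")               # preset keyword
custom <- resolve_species(c("pps", "pon", "mcc", "ssc"))  # custom list

# Either form would then be passed as the 'species' argument, e.g.
# custom.aln('hsa:4069', species = custom)   # not run: needs network access
```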
Edgar RC. Nucleic Acids Res. 2004 32:1792-1797.
Edgar RC. BMC Bioinformatics 2004 5(1):113.
msa(), list.hom(), parse.hssp(), get.hssp(), shannon()
# NOT RUN {
custom.aln('hsa:4069', species = c('pps', 'pon', 'mcc', 'ssc'))
# }
# NOT RUN {
custom.aln('cge:100773737', 'vertebrates', molecule = 'dna')
# }