The function generates up to three files.
- A FASTA file containing the sequences with their IDs.
This file must be imported as a DNAStringSet
to be used with DECIPHER, e.g.:
Biostrings::readDNAStringSet("ex_seqs.fasta")
- A text file containing the taxonomic assignment of each sequence.
This file must be imported as a character vector
to be used with DECIPHER, e.g.:
readr::read_lines("ex_taxo.txt")
- A text file ("taxid") containing the taxonomic ranks
associated with each taxon. This is an asterisk-delimited file
which must be imported as a data frame (see DECIPHER::LearnTaxa), e.g.:
readr::read_delim("ex_ranks.txt",
col_names = c('Index', 'Name', 'Parent', 'Level', 'Rank'),
delim = "*", quote = "")
Writing the taxid file can be very slow for large datasets,
so it is not generated by default.
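
As a minimal sketch of the taxid import, the example below reads an asterisk-delimited file with base R instead of readr and checks its structure. The three rows are made up for illustration and only assume the five-column layout described above; the temporary file stands in for a real "ex_ranks.txt", and the actual content written by the function may differ.

```r
# Hypothetical taxid content (illustration only, not real output).
taxid_lines <- c(
  "0*Root*-1*0*rootrank",
  "1*Bacteria*0*1*domain",
  "2*Firmicutes*1*2*phylum"
)

# Write the lines to a temporary file standing in for "ex_ranks.txt".
ranks_path <- tempfile(fileext = ".txt")
writeLines(taxid_lines, ranks_path)

# Import as a data frame; quoting and comment handling are disabled
# so that taxon names containing quotes or '#' are read verbatim.
ranks <- utils::read.table(
  ranks_path,
  sep = "*",
  quote = "",
  comment.char = "",
  col.names = c("Index", "Name", "Parent", "Level", "Rank"),
  stringsAsFactors = FALSE
)

# Sanity check: every non-root Parent refers to an existing Index.
stopifnot(all(ranks$Parent[ranks$Parent >= 0] %in% ranks$Index))
```

A data frame with these five columns is what the rank argument of DECIPHER::LearnTaxa expects, alongside the DNAStringSet and the taxonomy character vector.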