A Fasta object contains biological sequences in the FASTA format. It is a small (S3)
extension to a data.frame: the object is a data.frame containing at least two text columns
named Header and Sequence. The Header column contains the header line for each sequence,
and the Sequence column contains the sequences themselves. A Fasta object is typically
created by reading a FASTA-formatted file into R with readFasta.
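To make the structure described above concrete, here is a minimal sketch that builds an equivalent object by hand in base R instead of reading a file. The example headers and sequences are made up, and the class vector c("Fasta", "data.frame") is an assumption about how the S3 extension is declared, not something taken from the package source.

```r
# Hand-built stand-in for what reading a FASTA file would give.
# All headers and sequences here are invented for illustration.
fa <- data.frame(
  Header   = c("seq1 Escherichia coli", "seq2 Bacillus subtilis", "seq3 Escherichia coli"),
  Sequence = c("ATGGCGTAA", "ATGAAACCCGGGTAG", "ATGTGA"),
  stringsAsFactors = FALSE
)

# Assumption: the S3 extension is expressed by prepending "Fasta" to the class vector.
class(fa) <- c("Fasta", "data.frame")

# The object is still an ordinary data.frame with two text columns.
print(is.data.frame(fa))
print(names(fa))
print(nrow(fa))
```

Because the object still inherits from data.frame, generic data.frame tools keep working on it.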
A Fasta object can be treated as a data.frame, which makes it quick and easy to search
both Header and Sequence with regular expressions, sort or rearrange the sequences,
extract subsets, or add new data to an existing Fasta object.
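The data.frame operations mentioned above might look as follows. This is only a sketch using base R on a hand-built object. The data, the Length column, and the variable names are hypothetical, and the class vector is an assumption.

```r
# Hand-built example object (invented data; class vector is an assumption).
fa <- data.frame(
  Header   = c("seq1 Escherichia coli", "seq2 Bacillus subtilis", "seq3 Escherichia coli"),
  Sequence = c("ATGGCGTAA", "ATGAAACCCGGGTAG", "ATGTGA"),
  stringsAsFactors = FALSE
)
class(fa) <- c("Fasta", "data.frame")

# Search the Header column with a regular expression and keep the matching rows.
coli <- fa[grepl("coli$", fa$Header), ]

# Search the Sequence column: which sequences end with the stop codon TGA?
ends.tga <- grepl("TGA$", fa$Sequence)

# Add new data: a (hypothetical) Length column holding the sequence lengths.
fa$Length <- nchar(fa$Sequence)

# Re-order the sequences from shortest to longest.
by.length <- fa[order(fa$Length), ]
print(by.length$Header)
```

Ordinary row indexing covers searching, subsetting and sorting, so no FASTA-specific helpers are needed for these tasks.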
The plot.Fasta function displays the content of the Fasta object as a histogram of
the sequence lengths.
The summary.Fasta function prints a short text giving the number of sequences and the
alphabet, i.e. a listing of all unique symbols found in the sequences.
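The quantities that the plot and summary methods report can be approximated by hand. The sketch below is not the package's implementation. It uses base R on invented sequences to show one way the sequence lengths, their histogram, and the alphabet could be computed.

```r
# Invented sequences standing in for the Sequence column of a Fasta object.
seqs <- c("ATGGCGTAA", "ATGAAACCCGGGTAG", "ATGTGA")

# Sequence lengths, the quantity the histogram is drawn over.
lens <- nchar(seqs)

# Compute the histogram without drawing it; drop plot = FALSE to display it.
h <- hist(lens, plot = FALSE)

# Alphabet: all unique symbols across all sequences.
alphabet <- sort(unique(unlist(strsplit(seqs, ""))))

# A short text summary in the spirit of the description above.
cat("Number of sequences:", length(seqs), "\n")
cat("Alphabet:", paste(alphabet, collapse = ""), "\n")
```

Splitting every sequence into single characters is simple but memory-hungry for very large inputs. It is used here for clarity only.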