Biostrings (version 2.40.2)

DNAString-class: DNAString objects


A DNAString object allows efficient storage and manipulation of a long DNA sequence.


The DNA alphabet

This alphabet contains all letters from the IUPAC Extended Genetic Alphabet (see ?IUPAC_CODE_MAP) plus "-" (the gap letter), "+" (the hard masking letter), and "." (the "not a letter" or "not available" letter). It is stored in the DNA_ALPHABET predefined constant (a character vector). The alphabet() function returns DNA_ALPHABET when applied to a DNAString object.
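As an illustration of the description above, the following minimal sketch (assuming the Biostrings package is installed and attached) lists the letters of DNA_ALPHABET that are not IUPAC codes and checks that alphabet() returns the same constant. The variable names extra and d are chosen for this example only.

```r
library(Biostrings)

# Letters of DNA_ALPHABET that are not names of IUPAC_CODE_MAP. Per the
# description above, these should be the gap, hard masking and
# "not available" letters.
extra <- setdiff(DNA_ALPHABET, names(IUPAC_CODE_MAP))
print(extra)

# alphabet() applied to a DNAString object returns the same constant.
d <- DNAString("ACGT")
same <- identical(alphabet(d), DNA_ALPHABET)
print(same)
```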

Constructor-like functions and generics

In the code snippet below, x can be a single string (character vector of length 1), a BString object or an RNAString object.
DNAString(x="", start=1, nchar=NA): Tries to convert x into a DNAString object by reading nchar letters starting at position start in x.
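The start and nchar arguments can be illustrated with a short sketch (assuming the Biostrings package is attached; the input strings and the variable names d1, d2, r and d3 are made up for this example). It reads a window of a character string, reads from a position to the end using the default nchar=NA, and converts an RNAString object.

```r
library(Biostrings)

# Read 4 letters starting at position 3 of the input string.
d1 <- DNAString("ACGTACGT", start=3, nchar=4)
as.character(d1)   # "GTAC"

# With the default nchar=NA, letters are read from 'start' to the end.
d2 <- DNAString("ACGTACGT", start=5)
as.character(d2)   # "ACGT"

# x can also be an RNAString (or BString) object.
r <- RNAString("ACGU")
d3 <- DNAString(r)
print(d3)
```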

Accessor methods

In the code snippet below, x is a DNAString object.
alphabet(x, baseOnly=FALSE): If x is a DNAString object, returns the DNA alphabet (see above). With baseOnly=TRUE, only the four bases (the DNA_BASES constant) are returned. See the corresponding man pages when x is a BString, RNAString or AAString object.


Details

The DNAString class is a direct XString subclass (with no additional slot). Therefore all functions and methods described in the XString man page also work with a DNAString object (inheritance).
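A minimal sketch of this inheritance, assuming the Biostrings package is attached: the general-purpose XString operations used here (length(), subseq() and as.character()) are applied unchanged to a DNAString object. The variable names are for illustration only.

```r
library(Biostrings)

d <- DNAString("TTGAAAA-CTC-N")

# A DNAString object is also an XString object.
is_xstring <- is(d, "XString")

# Number of letters in the sequence.
n <- length(d)                       # 13

# Extract letters 3 to 7 as a new DNAString object.
part <- subseq(d, start=3, end=7)
as.character(part)                   # "GAAAA"
```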

Unlike the BString container, which allows storage of any single string (based on a single-byte character set), the DNAString container can only store a string based on the DNA alphabet (see "The DNA alphabet" above). In addition, the letters stored in a DNAString object are encoded in a way that is optimized for fast search algorithms.
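The difference between the two containers can be sketched as follows (assuming the Biostrings package is attached). The input text and the marker value "rejected" are arbitrary choices for this example, and the exact error message raised by DNAString() is not shown because it may vary between versions.

```r
library(Biostrings)

txt <- "Hello, world!"

# A BString object accepts arbitrary single-byte text.
b <- BString(txt)

# The same text contains letters outside the DNA alphabet, so DNAString()
# is expected to signal an error; the error is caught here and replaced
# by a marker value.
res <- tryCatch(DNAString(txt), error = function(e) "rejected")
print(res)
```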

See Also

IUPAC_CODE_MAP, letter, XString-class, RNAString-class, reverseComplement, alphabetFrequency


Examples

library(Biostrings)
d <- DNAString("TTGAAAA-CTC-N")
alphabet(d)                 # DNA_ALPHABET
alphabet(d, baseOnly=TRUE)  # DNA_BASES