Biostrings (version 2.40.2)

AlignedXStringSet-class: AlignedXStringSet and QualityAlignedXStringSet objects

Description

The AlignedXStringSet and QualityAlignedXStringSet classes are containers for storing an aligned XStringSet.

Arguments

Accessor methods

In the code snippets below, x is a AlignedXStringSet or QualityAlignedXStringSet object.
unaligned(x): The original string.
aligned(x, degap = FALSE): If degap = FALSE, the "filled-with-gaps subsequence" representing the aligned substring. If degap = TRUE, the "gap-less subsequence" representing the aligned substring.
start(x): The start of the aligned substring.
end(x): The end of the aligned substring.
width(x): The width of the aligned substring, ignoring gaps.
indel(x): The positions, in the form of an IRanges object, of the insertions or deletions (depending on what x represents).
nindel(x): A two-column matrix containing the length and sum of the widths for each of the elements returned by indel.
length(x): The length of the aligned(x).
nchar(x): The nchar of the aligned(x).
alphabet(x): Equivalent to alphabet(unaligned(x)).
as.character(x): Converts aligned(x) to a character vector.
toString(x): Equivalent to toString(as.character(x)).

Subsetting methods

x[i]: Returns a new AlignedXStringSet or QualityAlignedXStringSet object made of the selected elements.
rep(x, times): Returns a new AlignedXStringSet or QualityAlignedXStringSet object made of the repeated elements.

Details

Before we define the notion of alignment, we introduce the notion of "filled-with-gaps subsequence". A "filled-with-gaps subsequence" of a string string1 is obtained by inserting 0 or any number of gaps in a subsequence of s1. For example L-A--ND and A--N-D are "filled-with-gaps subsequences" of LAND. An alignment between two strings string1 and string2 results in two strings (align1 and align2) that have the same length and are "filled-with-gaps subsequences" of string1 and string2.

For example, this is an alignment between LAND and LEAVES:

    L-A
    LEA
  

An alignment can be seen as a compact representation of one set of basic operations that transforms string1 into align1. There are 3 different kinds of basic operations: "insertions" (gaps in align1), "deletions" (gaps in align2), "replacements". The above alignment represents the following basic operations:

    insert E at pos 2
    insert V at pos 4
    insert E at pos 5
    replace by S at pos 6 (N is replaced by S)
    delete at pos 7 (D is deleted)
  
Note that "insert X at pos i" means that all letters at a position >= i are moved 1 place to the right before X is actually inserted.

There are many possible alignments between two given strings string1 and string2 and a common problem is to find the one (or those ones) with the highest score, i.e. with the lower total cost in terms of basic operations.

See Also

pairwiseAlignment, PairwiseAlignments-class, XStringSet-class

Examples

Run this code
pattern <- AAString("LAND")
subject <- AAString("LEAVES")
nw1 <- pairwiseAlignment(pattern, subject, substitutionMatrix = "BLOSUM50", gapOpening = 3, gapExtension = 1)
alignedPattern <- pattern(nw1)
unaligned(alignedPattern)
aligned(alignedPattern)
as.character(alignedPattern)
nchar(alignedPattern)

Run the code above in your browser using DataCamp Workspace