Biostrings (version 2.40.2)

matchLRPatterns: Find paired matches in a sequence

Description

The matchLRPatterns function finds paired matches in a sequence i.e. matches specified by a left pattern, a right pattern and a maximum distance between the left pattern and the right pattern.

Usage

matchLRPatterns(Lpattern, Rpattern, max.gaplength, subject, max.Lmismatch=0, max.Rmismatch=0, with.Lindels=FALSE, with.Rindels=FALSE, Lfixed=TRUE, Rfixed=TRUE)

Arguments

Lpattern
The left part of the pattern.
Rpattern
The right part of the pattern.
max.gaplength
The max length of the gap in the middle i.e the max distance between the left and right parts of the pattern.
subject
An XString, XStringViews or MaskedXString object containing the target sequence.
max.Lmismatch
The maximum number of mismatching letters allowed in the left part of the pattern. If non-zero, an inexact matching algorithm is used (see the matchPattern function for more information).
max.Rmismatch
Same as max.Lmismatch but for the right part of the pattern.
with.Lindels
If TRUE then indels are allowed in the left part of the pattern. In that case max.Lmismatch is interpreted as the maximum "edit distance" allowed in the left part of the pattern.

See the with.indels argument of the matchPattern function for more information.

with.Rindels
Same as with.Lindels but for the right part of the pattern.
Lfixed
Only with a DNAString or RNAString subject can a Lfixed value other than the default (TRUE) be used.

With Lfixed=FALSE, ambiguities (i.e. letters from the IUPAC Extended Genetic Alphabet (see IUPAC_CODE_MAP) that are not from the base alphabet) in the left pattern \_and\_ in the subject are interpreted as wildcards i.e. they match any letter that they stand for.

Lfixed can also be a character vector, a subset of c("pattern", "subject"). Lfixed=c("pattern", "subject") is equivalent to Lfixed=TRUE (the default). An empty vector is equivalent to Lfixed=FALSE. With Lfixed="subject", ambiguities in the pattern only are interpreted as wildcards. With Lfixed="pattern", ambiguities in the subject only are interpreted as wildcards.

Rfixed
Same as Lfixed but for the right part of the pattern.

Value

An XStringViews object containing all the matches, even when they are overlapping (see the examples below), and where the matches are ordered from left to right (i.e. by ascending starting position).

See Also

matchPattern, matchProbePair, trimLRPatterns, findPalindromes, reverseComplement, XString-class, XStringViews-class, MaskedXString-class

Examples

Run this code
library(BSgenome.Dmelanogaster.UCSC.dm3)
subject <- Dmelanogaster$chr3R
Lpattern <- "AGCTCCGAG"
Rpattern <- "TTGTTCACA"
matchLRPatterns(Lpattern, Rpattern, 500, subject) # 1 match

## Note that matchLRPatterns() will return all matches, even when they are
## overlapping:
subject <- DNAString("AAATTAACCCTT")
matchLRPatterns("AA", "TT", 0, subject) # 1 match
matchLRPatterns("AA", "TT", 1, subject) # 2 matches
matchLRPatterns("AA", "TT", 3, subject) # 3 matches
matchLRPatterns("AA", "TT", 7, subject) # 4 matches

Run the code above in your browser using DataCamp Workspace