BSgenome (version 1.40.1)

BSgenome-utils: BSgenome utilities

Description

Utilities for BSgenome objects.

Usage

matchPWM(pwm, subject, min.score = "80%", exclude = "",
    maskList = logical(0))

countPWM(pwm, subject, min.score = "80%", exclude = "",
    maskList = logical(0))

vmatchPattern(pattern, subject, max.mismatch = 0, min.mismatch = 0,
    with.indels = FALSE, fixed = TRUE, algorithm = "auto",
    exclude = "", maskList = logical(0), userMask = RangesList(),
    invertUserMask = FALSE)

vcountPattern(pattern, subject, max.mismatch = 0, min.mismatch = 0,
    with.indels = FALSE, fixed = TRUE, algorithm = "auto",
    exclude = "", maskList = logical(0), userMask = RangesList(),
    invertUserMask = FALSE)

vmatchPDict(pdict, subject, max.mismatch = 0, min.mismatch = 0,
    fixed = TRUE, algorithm = "auto", verbose = FALSE,
    exclude = "", maskList = logical(0))

vcountPDict(pdict, subject, max.mismatch = 0, min.mismatch = 0,
    fixed = TRUE, algorithm = "auto", collapse = FALSE, weight = 1L,
    verbose = FALSE, exclude = "", maskList = logical(0))

Arguments

pwm
A numeric matrix with row names A, C, G and T representing a Position Weight Matrix.
pattern
A DNAString object containing the pattern sequence.
pdict
A DNAStringSet object containing the pattern sequences.
subject
A BSgenome object containing the subject sequences.
min.score
The minimum score for counting a match. Can be given as a character string containing a percentage (e.g. "85%") of the highest possible score or as a single number.
max.mismatch, min.mismatch
The maximum and minimum number of mismatching letters allowed (see ?`lowlevel-matching` for the details). If non-zero, an inexact matching algorithm is used.
with.indels
If TRUE then indels are allowed. In that case, min.mismatch must be 0 and max.mismatch is interpreted as the maximum "edit distance" allowed between any pattern and any of its matches (see ?`matchPattern` for the details).
fixed
If FALSE then IUPAC extended letters are interpreted as ambiguities (see ?`lowlevel-matching` for the details).
algorithm
For vmatchPattern and vcountPattern one of the following: "auto", "naive-exact", "naive-inexact", "boyer-moore", "shift-or", or "indels".

For vmatchPDict and vcountPDict one of the following: "auto", "naive-exact", "naive-inexact", "boyer-moore", or "shift-or".

collapse, weight
Ignored arguments.
verbose
TRUE or FALSE.
exclude
A character vector with strings that will be used to filter out chromosomes whose names match these strings.
maskList
A named logical vector of mask states to use with a BSgenome object. When using the bsapply function, the masks will be set to the states in this vector.
userMask
A RangesList, containing a mask to be applied to each chromosome. See bsapply.
invertUserMask
Whether the userMask should be inverted.
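
As a rough illustration of how a percentage min.score translates into a numeric cutoff, the base R sketch below assumes that the "highest possible score" of a PWM is the sum of its per-column maxima. The helper score_threshold and the toy matrix values are hypothetical and are not part of BSgenome.

```r
# Hypothetical helper, not part of BSgenome: turn a min.score value
# into a numeric cutoff.
# Assumption: the highest possible score is the sum of the column maxima.
score_threshold <- function(pwm, min.score) {
  if (is.numeric(min.score)) {
    return(as.double(min.score))          # a plain number is used as-is
  }
  pct <- as.numeric(sub("%$", "", min.score))  # "80%" -> 80
  max_score <- sum(apply(pwm, 2, max))         # best letter at each position
  max_score * pct / 100
}

# Toy 4 x 3 matrix with made-up weights (rows A, C, G, T)
toy_pwm <- matrix(
  c(0.50, 0.25, 0.00,
    0.00, 0.25, 0.75,
    0.25, 0.50, 0.25,
    0.25, 0.00, 0.00),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("A", "C", "G", "T"), NULL)
)

score_threshold(toy_pwm, "80%")  # cutoff derived from the percentage
score_threshold(toy_pwm, 1.2)    # numeric cutoff passed through unchanged
```

For the toy matrix the column maxima are 0.50, 0.50 and 0.75, so "80%" corresponds to a cutoff of about 1.4 under this assumption.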

Value

A GRanges object for matchPWM with two elementMetadata columns: "score" (numeric), and "string" (DNAStringSet).

A GRanges object for vmatchPattern.

A GRanges object for vmatchPDict with one elementMetadata column: "index", which represents a mapping to a position in the original pattern dictionary.

A data.frame object for countPWM and vcountPattern with three columns: "seqname" (factor), "strand" (factor), and "count" (integer).

A DataFrame object for vcountPDict with four columns: "seqname" ('factor' Rle), "strand" ('factor' Rle), "index" (integer) and "count" ('integer' Rle). As with vmatchPDict, the index column represents a mapping to a position in the original pattern dictionary.

See Also

matchPWM, matchPattern, matchPDict, bsapply

Examples

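  # Load the C. elegans genome (provides the Celegans BSgenome object)
  # and the HNF4alpha example sequences
  library(BSgenome.Celegans.UCSC.ce2)
  data(HNF4alpha)

  # Build a Position Weight Matrix, then locate and count its matches
  pwm <- PWM(HNF4alpha)
  matchPWM(pwm, Celegans)
  countPWM(pwm, Celegans)

  # Match a single consensus pattern; fixed = "subject" treats IUPAC
  # ambiguity letters in the pattern as ambiguities while the subject
  # is matched literally
  pattern <- consensusString(HNF4alpha)
  vmatchPattern(pattern, Celegans, fixed = "subject")
  vcountPattern(pattern, Celegans, fixed = "subject")

  # Match a small pattern dictionary (the first 10 sequences)
  vmatchPDict(HNF4alpha[1:10], Celegans)
  vcountPDict(HNF4alpha[1:10], Celegans)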
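
None of the calls above use the exclude argument. The base R sketch below shows one way such name filtering could behave, assuming the strings are applied as grep-style patterns to the sequence names; this page does not spell out the matching rule. The helper drop_excluded and the sequence names are illustrative and are not taken from BSgenome.

```r
# Illustrative sequence names (not read from an actual BSgenome object)
seq_names <- c("chrI", "chrII", "chrIII", "chrIV", "chrV", "chrX", "chrM")

# Hypothetical helper: drop every name matched by any string in `exclude`.
# Assumption: matching is regular-expression based, as with grepl().
drop_excluded <- function(seq_names, exclude = "") {
  exclude <- exclude[nzchar(exclude)]        # "" means "exclude nothing"
  if (length(exclude) == 0L) {
    return(seq_names)
  }
  # One logical vector per pattern, combined with element-wise OR
  hit <- Reduce(`|`, lapply(exclude, grepl, x = seq_names))
  seq_names[!hit]
}

drop_excluded(seq_names, "M")        # drops only "chrM"
drop_excluded(seq_names, "chrI")     # also drops chrII, chrIII and chrIV
drop_excluded(seq_names, "^chrI$")   # anchored pattern drops only "chrI"
```

Under this assumption a short string such as "chrI" removes every name that contains it, so an anchored pattern is the safer choice when a single chromosome should be skipped.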
