These generic functions remove leading or trailing nucleotides or
qualities. trimTails and trimTailw trim low-quality
nucleotides from the right end of each read using a sliding window
(trimTailw) or a tally of (successive) nucleotides falling at or
below a quality threshold (trimTails). trimEnds takes an alphabet of
characters to remove from the left end, the right end, or both.
## S4 methods for 'ShortReadQ', 'FastqQuality', or 'SFastqQuality'
trimTailw(object, k, a, halfwidth, ..., ranges=FALSE)
trimTails(object, k, a, successive=FALSE, ..., ranges=FALSE)
trimEnds(object, a, left=TRUE, right=TRUE, relation=c("<=", "=="), ..., ranges=FALSE)
## methods accepting 'alphabet'
trimTailw(object, k, a, halfwidth, ..., alphabet, ranges=FALSE)
trimTails(object, k, a, successive=FALSE, ..., alphabet, ranges=FALSE)
## S4 methods for 'character' (fastq file names)
trimTailw(object, k, a, halfwidth, ..., destinations, ranges=FALSE)
trimTails(object, k, a, successive=FALSE, ..., destinations, ranges=FALSE)
trimEnds(object, a, left=TRUE, right=TRUE, relation=c("<=", "=="), ..., destinations, ranges=FALSE)
object: an object (e.g., ShortReadQ and derived classes; see below
    to discover these methods) or character vector of fastq file(s)
    to be trimmed.
k: integer(1) describing the number of failing letters required to
    trigger trimming.
a: for trimTails and trimTailw, a character(1) with nchar(a) == 1L
    giving the letter at or below which a nucleotide is marked as
    failing. For trimEnds, a character() with all nchar() == 1L
    giving the letter(s) at or below which a nucleotide or quality
    score is marked for removal.
halfwidth: the half-width of the sliding window used by trimTailw;
    the window spans 2 * halfwidth + 1 cycles.
successive: logical(1) indicating whether failures can occur
    anywhere in the sequence, or must be successive. If
    successive=FALSE, then the k'th failed letter and subsequent
    letters are removed. If successive=TRUE, the first succession of
    k failed letters and all subsequent letters are removed.
left, right: logical(1) indicating whether trimming is from the
    left or right ends.
relation: character(1) selected from the argument values, i.e.,
    "<=" or "==", indicating whether all letters at or below a, as
    ordered by alphabet(object), are to be removed, or only exact
    matches.
destinations: for object of type character(), an equal-length
    vector of destination files. Files must not already exist.
alphabet: character() (ordered low to high) letters on which the
    quality scale is measured. Usually supplied internally (the user
    does not need to specify it). If missing, it is set to ASCII
    characters 0-127.
ranges: logical(1) indicating whether the trimmed object, or only
    the ranges satisfying the trimming condition, should be returned.
Value: an object of class(object) trimmed to contain only those
    nucleotides satisfying the trim criterion or, if ranges=TRUE, an
    IRanges instance defining the ranges that would trim object.
trimTailw starts at the left-most nucleotide, tabulating the
number of cycles in a window of 2 * halfwidth + 1 surrounding
the current nucleotide with quality scores that fall at or below
a. The read is trimmed at the first nucleotide for which this
number >= k. The quality of the first or last nucleotide is
used to represent portions of the window that extend beyond the
sequence.
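The sliding-window rule can be illustrated on a single quality string with base R. This is only a sketch of the description above, not the package's implementation: the helper name trim_tailw_sketch is hypothetical, letters are compared by ASCII code (matching the default alphabet), and it assumes the nucleotide that triggers trimming is itself removed along with everything to its right.

```r
# Hypothetical helper (not part of ShortRead): sliding-window trim of one
# quality string, following the prose description of trimTailw.
trim_tailw_sketch <- function(qual, k, a, halfwidth) {
    n <- nchar(qual)
    if (n == 0L) return(qual)
    # TRUE where the quality letter is at or below the threshold letter
    failing <- utf8ToInt(qual) <= utf8ToInt(a)
    # Represent window positions beyond the read by the first / last quality
    padded <- c(rep(failing[1], halfwidth), failing, rep(failing[n], halfwidth))
    csum <- c(0L, cumsum(padded))
    width <- 2L * halfwidth + 1L
    # Number of failing cycles in the window centred on each position
    counts <- csum[seq_len(n) + width] - csum[seq_len(n)]
    hit <- which(counts >= k)
    # Assumption: the first flagged position and all later ones are dropped
    keep <- if (length(hit)) hit[1] - 1L else n
    substr(qual, 1L, keep)
}

trim_tailw_sketch("IIIIIIII##I###", k = 2, a = "#", halfwidth = 1)
```

In this call the window centred on position 9 is the first to contain two failing letters, so the sketch keeps the first eight letters.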
trimTails starts at the left-most nucleotide and accumulates
cycles for which the quality score is at or below a. The read
is trimmed at the first location where this number >= k. With
successive=TRUE, failing qualities must occur in strict
succession.
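The difference between the two modes of trimTails can be seen in a small base R sketch operating on one quality string. The helper trim_tails_sketch is hypothetical and only mirrors the description above (ASCII comparison, the k'th failure or the first run of k failures marks the cut); it is not the package's code.

```r
# Hypothetical helper (not part of ShortRead): tally-based trim of one
# quality string, following the prose description of trimTails.
trim_tails_sketch <- function(qual, k, a, successive = FALSE) {
    failing <- utf8ToInt(qual) <= utf8ToInt(a)
    if (successive) {
        # Length of the run of consecutive failures ending at each position
        run <- integer(length(failing))
        current <- 0L
        for (i in seq_along(failing)) {
            current <- if (failing[i]) current + 1L else 0L
            run[i] <- current
        }
        hit <- which(run >= k)
        # Cut at the start of the first run of k failures
        cut <- if (length(hit)) hit[1] - k + 1L else NA
    } else {
        hit <- which(cumsum(failing) >= k)
        # Cut at the k'th failing letter, wherever the failures occurred
        cut <- if (length(hit)) hit[1] else NA
    }
    keep <- if (is.na(cut)) nchar(qual) else cut - 1L
    substr(qual, 1L, keep)
}

q <- "IIII#I#III##II"
trim_tails_sketch(q, k = 2, a = "#")                     # cuts at 2nd failure
trim_tails_sketch(q, k = 2, a = "#", successive = TRUE)  # cuts at first run of 2
```

With successive=FALSE the scattered failures at positions 5 and 7 are enough to trigger trimming; with successive=TRUE the read survives until the adjacent failures at positions 11 and 12.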
trimEnds examines the left, right, or both ends
of object, marking for removal letters that correspond to
a and relation. The trimEnds,ShortReadQ-method
trims based on quality.
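As a rough illustration of how a and relation interact, the following base R sketch trims a single string from either end. The helper trim_ends_sketch is hypothetical; it compares letters by ASCII code and, as a simplifying assumption, accepts only a single threshold letter when relation is "<=".

```r
# Hypothetical helper (not part of ShortRead): strip marked letters from the
# ends of one string, following the prose description of trimEnds.
trim_ends_sketch <- function(x, a, left = TRUE, right = TRUE,
                             relation = c("<=", "==")) {
    relation <- match.arg(relation)
    codes <- utf8ToInt(x)
    targets <- utf8ToInt(paste(a, collapse = ""))
    marked <- if (relation == "<=") {
        stopifnot(length(a) == 1L)  # sketch assumption: one threshold letter
        codes <= targets
    } else {
        codes %in% targets          # exact matches against any letter in a
    }
    n <- nchar(x)
    unmarked <- which(!marked)
    # First / last unmarked letter bound the retained range
    start <- if (!left) 1L else if (length(unmarked)) unmarked[1] else n + 1L
    end <- if (!right) n else if (length(unmarked)) unmarked[length(unmarked)] else 0L
    if (start > end) "" else substr(x, start, end)
}

trim_ends_sketch("NNACGTNAN", "N", relation = "==")   # interior 'N' is kept
trim_ends_sketch("##II#I#", "#")                      # relation defaults to "<="
```

Only letters at the ends are removed; a marked letter surrounded by unmarked ones stays in place.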
ShortReadQ methods operate on quality scores; use
sread() and the ranges argument to trim based on
nucleotide (see examples).
character methods transform one or several fastq files to new
fastq files, applying trim operations based on quality scores; use
filterFastq with your own filter argument to filter on
nucleotides.
library(ShortRead)
showMethods(trimTails)
sp <- SolexaPath(system.file('extdata', package='ShortRead'))
rfq <- readFastq(analysisPath(sp), pattern="s_1_sequence.txt")
## remove leading / trailing quality scores <= 'I'
trimEnds(rfq, "I")
## remove leading / trailing 'N's
rng <- trimEnds(sread(rfq), "N", relation="==", ranges=TRUE)
narrow(rfq, start(rng), end(rng))
## remove leading / trailing 'G's or 'C's (nucleotides, so trim via sread())
rng <- trimEnds(sread(rfq), c("G", "C"), relation="==", ranges=TRUE)
narrow(rfq, start(rng), end(rng))