
getPESizes(bam.file, param=readParam(pe="both"))
param: a readParam object containing read extraction parameters.
A read is only considered to be mapped if it is not removed by the dedup, minq, restrict or discard settings in readParam.
Otherwise, the alignment is not considered to be reliable.
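As a rough illustration of how these filters might be configured, the sketch below builds a readParam object with a mapping quality threshold and duplicate removal. The argument names come from the text above; the threshold of 20 is an arbitrary value chosen for the example, not a recommendation.

```r
library(csaw)

# Illustrative settings only: minq=20 is an assumed threshold.
# dedup=TRUE asks for reads marked as duplicates to be removed.
param <- readParam(pe="both", minq=20, dedup=TRUE)

# Printing the object shows the extraction settings in use.
param
```

Reads that fail any of these filters would then be treated as unmapped when read pairs are assessed.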
Any read pair with exactly one unmapped read is discarded, and the number of read pairs lost in this manner is recorded.
Obviously, read pairs with both reads unmapped will be ignored completely.
Of the mapped pairs, the valid (i.e., proper) read pairs are identified. These refer to intrachromosomal read pairs where the reads with the lower and higher genomic coordinates map to the forward and reverse strand, respectively. The distance between the positions of the mapped 5' ends of the two reads must also be equal to or greater than the read lengths. Any intrachromosomal read pair that fails these criteria will be considered as improperly oriented. If the reads are on different chromosomes, the read pair will be recorded as being interchromosomal.
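The classification rules above can be sketched in base R. This is only an illustration under stated assumptions, not the code csaw runs: `classifyPair` is a hypothetical helper, coordinates are taken to be 1-based inclusive alignment positions, reads are ordered by their leftmost position, and the 5'-to-5' distance is compared against the longer of the two read lengths.

```r
# Hypothetical helper, not part of csaw: classify one mapped read pair.
# Coordinates are 1-based, inclusive alignment start/end positions.
classifyPair <- function(chr1, start1, end1, strand1,
                         chr2, start2, end2, strand2) {
    if (chr1 != chr2) {
        return("interchromosomal")
    }
    # Order the reads by leftmost alignment position (an assumption).
    if (start1 <= start2) {
        lo <- list(start=start1, end=end1, strand=strand1)
        hi <- list(start=start2, end=end2, strand=strand2)
    } else {
        lo <- list(start=start2, end=end2, strand=strand2)
        hi <- list(start=start1, end=end1, strand=strand1)
    }
    # The lower read must be forward and the higher read reverse.
    if (lo$strand != "+" || hi$strand != "-") {
        return("improper")
    }
    # 5' end of a forward read is its start; of a reverse read, its end.
    size <- hi$end - lo$start + 1L
    read.len <- max(lo$end - lo$start + 1L, hi$end - hi$start + 1L)
    if (size < read.len) {
        return("improper")
    }
    "valid"
}

classifyPair("chrA", 100, 149, "+", "chrA", 250, 299, "-")  # "valid"
classifyPair("chrA", 100, 149, "-", "chrA", 250, 299, "+")  # "improper"
classifyPair("chrA", 100, 149, "+", "chrB", 250, 299, "-")  # "interchromosomal"
```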
Each valid read pair corresponds to a DNA fragment where both ends are sequenced. The size of the fragment can be determined by calculating the distance between the 5' ends of the mapped reads. The distribution of sizes is useful for assessing the quality of the library preparation, along with all of the recorded diagnostics.
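The fragment size calculation can be illustrated with a few made-up positions. The values below are toy numbers, and counting both 5' ends inclusively is an assumed convention for this sketch.

```r
# Toy 5' end positions for three valid pairs (illustrative values only).
fwd5 <- c(100L, 500L, 1200L)   # 5' end of the forward-strand read
rev5 <- c(299L, 749L, 1350L)   # 5' end of the reverse-strand read

# Fragment size spans both 5' ends, inclusive (an assumed convention).
sizes <- rev5 - fwd5 + 1L
sizes

# Summaries of the kind used to judge the size distribution.
summary(sizes)
median(sizes)
```

With real data, the same summaries would be applied to the sizes reported by getPESizes. To the best of my understanding these are returned in the `sizes` element of the output list, alongside a `diagnostics` element, but `?getPESizes` should be checked for the exact structure.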
See also: readParam
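library(csaw)  # also attaches GenomicRanges, which provides GRanges and IRanges
bamFile <- system.file("exdata", "pet.bam", package="csaw")
# All read pairs in the file
out <- getPESizes(bamFile, param=readParam(pe="both"))
# Only read pairs on chrA
out <- getPESizes(bamFile, param=readParam(pe="both", restrict="chrA"))
# Discard alignments overlapping chrA:1-50
out <- getPESizes(bamFile, param=readParam(pe="both", discard=GRanges("chrA", IRanges(1, 50))))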