
nucleR uses this information when preprocessing single-ended sequences. You can provide this information yourself (a length of 147bp is usually a good approximation) or you can use this method to automatically guess the size of the inserts.
fragmentLenDetect(reads, samples=1000, window=1000, min.shift=1, max.shift=100, mc.cores=1, as.shift=FALSE)
AlignedRead or RangedData format)
as.shift=TRUE
min.shift to max.shift. In every step, the correlation on a random position of length window is checked between both strands. The maximum correlation is returned and averaged over samples repetitions. The final returned length is the best shift detected plus the width of the reads. You can increase the performance of this function by reducing the samples value and/or narrowing the shift range. The window size has almost no impact on performance, although too small a value can give biased results.
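The idea described above can be illustrated with a minimal base-R sketch. This is not nucleR's implementation: the helper names (`coverageVec`, `bestShift`), the simulated data and the values of `chr.len`, `read.len` and `frag.len` are all assumptions made for illustration. It builds per-base coverage for each strand, tries every shift in the range on a random window, keeps the shift with the highest correlation, and averages that over several repetitions.

```r
# Illustrative only: a base-R sketch of strand cross-correlation.
# This is NOT the nucleR source; all names and values below are hypothetical.
set.seed(1)

chr.len  <- 20000   # assumed chromosome length
read.len <- 30      # width of every simulated read
frag.len <- 100     # true fragment length we hope to recover

# Each simulated fragment yields one "+" read at its left end
# and one "-" read at its right end.
frag.start <- sample(1:(chr.len - frag.len), 3000, replace = TRUE)
pos.start  <- frag.start
neg.start  <- frag.start + frag.len - read.len

# Per-base coverage for a set of equal-width reads
coverageVec <- function(starts, width, len) {
    cov <- numeric(len)
    for (s in starts) {
        idx <- s:(s + width - 1)
        cov[idx] <- cov[idx] + 1
    }
    cov
}
pos.cov <- coverageVec(pos.start, read.len, chr.len)
neg.cov <- coverageVec(neg.start, read.len, chr.len)

# For one random window, return the shift with the highest correlation
# between the "+" coverage and the shifted "-" coverage.
bestShift <- function(window, min.shift, max.shift) {
    from   <- sample(1:(chr.len - window - max.shift), 1)
    p      <- pos.cov[from:(from + window - 1)]
    shifts <- min.shift:max.shift
    cors <- sapply(shifts, function(s) {
        n <- neg.cov[(from + s):(from + s + window - 1)]
        suppressWarnings(cor(p, n))
    })
    shifts[which.max(cors)]
}

# Average the best shift over several random windows ("samples")
best      <- unlist(replicate(50, bestShift(window = 1000, min.shift = 1, max.shift = 100),
                              simplify = FALSE))
est.shift <- round(mean(best))

# Estimated fragment length: best shift plus read width
est.len <- est.shift + read.len
print(est.len)
```

In this sketch the estimate should land close to `frag.len`. The cost grows with the number of repetitions and with the number of shifts tried, which is why reducing `samples` or narrowing the shift range speeds things up.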
library(nucleR)

# Create a synthetic dataset, simulating single-end reads, for positive and negative strands
pos = syntheticNucMap(nuc.len=40, lin.len=130)$syn.reads # Positive strand reads
neg = IRanges(end=start(pos)+147, width=40) # Negative strand (shifted 147bp)
sim = RangedData(c(pos, neg), strand=c(rep("+", length(pos)), rep("-", length(neg))))

# Detect fragment length (we know by construction it is really 147)
# samples=50 restricts the sampling to speed up the example
fragmentLenDetect(sim, samples=50)