startScore: Score potential protein binding sites
Description
For each position in the genome this function computes a score indicating the
likelihood that a protein binding site starts at that position.
Usage
startScore(data, b, support, background, bgCutoff, supCutoff)
Arguments
data
A two-column matrix of read counts. The columns correspond to reads on the
forward and reverse strands, respectively.
b
Length of binding region.
support
Length of support region.
background
Length of background window.
bgCutoff
Cutoff for the change in read rates between adjacent windows (see Details).
supCutoff
Cutoff for the change in read rates between support regions on the forward and
reverse strands (see Details).
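As an illustration of the expected input, the following minimal sketch builds a two-column count matrix from simulated data. It assumes that each row corresponds to one genome position. The object names, the simulated rate, and every argument value in the commented call are hypothetical placeholders, not documented defaults.

```r
set.seed(1)                       # make the simulated counts reproducible
n_positions <- 1000               # hypothetical number of genome positions

# Simulated per-position read counts; real analyses would use observed counts.
forward <- rpois(n_positions, lambda = 0.2)
reverse <- rpois(n_positions, lambda = 0.2)

# Column 1 holds forward-strand counts, column 2 reverse-strand counts.
counts <- cbind(forward = forward, reverse = reverse)
dim(counts)                       # rows are positions, columns are strands

# Hypothetical call (requires the package that provides startScore);
# all argument values below are placeholders chosen for illustration only:
# scores <- startScore(counts, b = 128, support = 17, background = 200,
#                      bgCutoff = 0.9, supCutoff = 0.9)
```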
Value
Numeric vector with binding site scores.
Details
Robust estimates of read rates in background windows and support regions are obtained by limiting the
difference between related estimates. For example, consider a forward support region of length 10 that contains 20 reads.
The maximum likelihood estimate of the rate parameter of the (assumed) underlying Poisson distribution is
$\hat{\lambda} = 20/10 = 2$. If the reverse support region, also of length 10, contains 50 reads,
a robust estimate of its rate parameter is obtained as
min(50/10, qpois(supCutoff, lambda=lambda_hat))
where lambda_hat is the forward-strand estimate $\hat{\lambda}$. The reverse-strand estimate therefore cannot exceed
the supCutoff quantile of the Poisson distribution fitted to the forward strand.
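The arithmetic of this example can be reproduced in a few lines of R. This is a minimal sketch rather than the function's internal code: the variable names are illustrative, the value 0.9 for supCutoff is an assumption chosen for the example, and the reverse-strand rate is capped by taking the smaller of the two values, since that is what limiting the difference between the estimates requires.

```r
# Worked numbers from the example above (all names are illustrative).
region_length <- 10    # length of each support region
forward_reads <- 20    # reads in the forward support region
reverse_reads <- 50    # reads in the reverse support region
sup_cutoff    <- 0.9   # assumed value for supCutoff, not a documented default

# Maximum likelihood estimate of the forward-strand Poisson rate.
lambda_hat <- forward_reads / region_length

# Upper limit for the reverse-strand rate: the sup_cutoff quantile of
# a Poisson distribution with rate lambda_hat.
cap <- qpois(sup_cutoff, lambda = lambda_hat)

# Robust reverse-strand estimate: the raw rate, capped at the quantile.
robust_rate <- min(reverse_reads / region_length, cap)

c(lambda_hat = lambda_hat, cap = cap, robust_rate = robust_rate)
```

With these inputs the raw reverse-strand rate is 5, the 0.9 quantile of a Poisson distribution with rate 2 is 4, and the robust estimate is therefore 4.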