In single-decoy target-decoy competition (TDC), each hypothesis is associated with a
winning score and a label (1 for a target win, -1 for a decoy win). This
function assumes that the hypotheses are ordered in decreasing order of
winning scores (with ties broken at random). The argument labels,
therefore, must be ordered according to this rule.
This function also supports the extension of TDC that uses multiple
decoys. In that setup, the target score competes against multiple decoy
scores, and the rank of the target score after competition determines whether
the hypothesis is a target win (label = 1), a decoy win (-1), or uncounted (0).
Ranks in the top c proportion count as target wins, ranks in the bottom
1-lambda proportion count as decoy wins, and all remaining ranks are uncounted.
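The labeling rule for multiple decoys can be sketched in R as follows. This is only one reading of the rule described above, not the package's own code: the helper name multi_decoy_label is hypothetical, scores are assumed to have no ties, and c * d1 and lambda * d1 are assumed to be whole numbers, where d1 is the number of decoys plus one.

```r
# Hypothetical helper, not part of the package: label a single hypothesis
# from its target score and its decoy scores.
# Assumptions: no tied scores, and c * d1 and lambda * d1 are whole numbers,
# where d1 = (number of decoys) + 1.
multi_decoy_label <- function(target, decoys, c = 0.5, lambda = 0.5) {
  d1 <- length(decoys) + 1
  r <- 1 + sum(decoys < target)       # rank of the target: 1 = lowest, d1 = highest
  n_win <- round(c * d1)              # number of ranks in the top c proportion
  n_lose <- round((1 - lambda) * d1)  # number of ranks in the bottom 1 - lambda
  if (r > d1 - n_win) {
    1L    # target win
  } else if (r <= n_lose) {
    -1L   # decoy win
  } else {
    0L    # uncounted
  }
}

decoys <- c(1, 2, 3)
multi_decoy_label(3.5, decoys, c = 0.25, lambda = 0.5)  # target ranks 4th of 4
multi_decoy_label(2.5, decoys, c = 0.25, lambda = 0.5)  # target ranks 3rd of 4
multi_decoy_label(1.5, decoys, c = 0.25, lambda = 0.5)  # target ranks 2nd of 4
```

With a single decoy and c = lambda = 0.5, this sketch reduces to the single-decoy case: the higher of the two scores wins and no hypothesis is uncounted.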
The threshold of TDC is given by the formula:
$$\max\{k : \frac{D_k + 1}{T_k \vee 1} \cdot \frac{c}{1-\lambda} \leq \alpha\}$$
where \(T_k\) is the number of target wins among the top
\(k\) hypotheses, \(D_k\) is the number of decoy wins among the same
\(k\) hypotheses, and \(\vee\) denotes the maximum.
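As an illustration, the threshold formula can be evaluated directly from a vector of ordered labels. The sketch below is a minimal version under stated assumptions: the helper name tdc_threshold is hypothetical and not part of this package, the labels are assumed to be sorted by decreasing winning score already, and returning 0 when no k satisfies the inequality is a convention chosen here for the example.

```r
# Hypothetical helper, not part of the package: evaluate the TDC threshold
# formula on labels sorted by decreasing winning score.
tdc_threshold <- function(labels, alpha, c = 0.5, lambda = 0.5) {
  T_k <- cumsum(labels == 1)    # target wins among the top k hypotheses
  D_k <- cumsum(labels == -1)   # decoy wins among the top k hypotheses
  # Estimated FDR among the top k; pmax(T_k, 1) plays the role of T_k v 1.
  est <- (D_k + 1) / pmax(T_k, 1) * c / (1 - lambda)
  ok <- which(est <= alpha)
  # Assumed convention: return 0 when no k satisfies the inequality.
  if (length(ok) == 0) 0L else max(ok)
}

labels <- c(1, 1, 1, 1, -1, 1, 1, 1, 1, 1, -1, 1)
tdc_threshold(labels, alpha = 0.25)
```

Because the formula takes the largest qualifying k, the estimate may exceed alpha at some smaller k (here, right after the first decoy win) and still allow a later k to qualify.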
The argument gamma sets a confidence level of 1-gamma. Because
the standardized band requires pre-computed Monte Carlo quantiles, only
certain values of gamma are supported. These include commonly used
confidence levels such as 0.95 and 0.99 (gamma = 0.05 and 0.01). We refer the reader
to the README of this package for more details.
The argument alpha, used to compute the threshold of TDC, also serves
in this function to compute an appropriate d_max
for a non-trivial bound. In particular, if the user inputs a vector of
thresholds, a bound is returned for each element of
thresholds using the same d_max. For more details, see:
https://arxiv.org/abs/2302.11837.
We recommend the use of interpolate = TRUE (default), as it generally
results in a tighter bound. This comes at the cost of performance: the bound
for each threshold is computed in O(n) time with interpolation and O(1)
without.