TailRank (version 3.2.4)

tailRankPower: Power of the tail-rank test

Description

Compute the significance level and the power of a tail-rank test.

Usage

tailRankPower(G, N1, N2, psi, phi, conf = 0.95,
              model=c("bb", "betabinom", "binomial"))
tailRankCutoff(G, N1, N2, psi, conf,
               model=c("bb", "betabinom", "binomial"),
               method=c('approx', 'exact'))

Value

tailRankCutoff returns an integer that is the maximum expected value of the tail rank statistic under the null hypothesis.

tailRankPower returns a real number between 0 and 1 that is the power of the tail-rank test to detect a marker with true sensitivity equal to \(\phi\).

Arguments

G

An integer; the number of genes being assessed as potential biomarkers. Statistically, the number of hypotheses being tested.

N1

An integer; the number of "train" or "healthy" samples used.

N2

An integer; the number of "test" or "cancer" samples used.

psi

A real number between 0 and 1; the desired specificity of the test.

phi

A real number between 0 and 1; the sensitivity that one would like to be able to detect, conditional on the specificity.

conf

A real number between 0 and 1; the confidence level of the results. Can be obtained by subtracting the family-wise Type I error from 1.

model

A character string that determines whether significance and power are computed based on a binomial or a beta-binomial (bb) model.

method

A character string; either "exact" or "approx". The default is to use a Bonferroni approximation.

Author

Kevin R. Coombes <krc@silicovore.com>

Details

A power estimate for the tail-rank test can be obtained as follows. First, let \(X \sim \mathrm{Binom}(N, p)\) denote a binomial random variable. Under the null hypothesis that cancer is not different from normal, we let \(p = 1 - \psi\) be the expected proportion of successes in a test of whether the value exceeds the \(\psi\)-th quantile. Now let $$\alpha = P(X > x \mid N, p)$$ be the tail probability for one such binomial measurement. When we make \(G\) independent binomial measurements, we take $$conf = P(\text{all } G \text{ of the } X\text{'s} \le x \mid N, p).$$ (In our paper on the tail-rank statistic, we write everything in terms of \(\gamma = 1 - conf\).) Then we have $$conf = P(X \le x \mid N, p)^G = (1 - \alpha)^G.$$ Using a Bonferroni-like approximation, we can take $$conf \approx 1 - \alpha G.$$ Solving for \(\alpha\), we find that $$\alpha \approx (1 - conf)/G.$$ The function tailRankCutoff computes the cutoff that ensures that, across multiple experiments each looking at \(G\) genes in \(N\) samples, there are no false positives with confidence level \(conf\) (equivalently, significance level \(\gamma = 1 - conf\)).

The final point to note is that the quantiles are also defined in terms of \(q = 1 - \alpha\), so there are lots of disfiguring "1's" in the implementation.
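The relationship between the exact and the Bonferroni-like per-gene tail probabilities can be checked numerically with base R. The following is a minimal sketch under the plain binomial model only; it is not the code used inside tailRankCutoff, and the values and variable names (alpha.exact, alpha.approx, x.exact, x.approx) are chosen here purely for illustration.

```r
# Illustrative values (assumptions, not taken from the package)
G    <- 10000   # number of genes tested
N    <- 50      # number of samples in the binomial model
psi  <- 0.99    # desired specificity
conf <- 0.95    # family-wise confidence level

p <- 1 - psi    # null probability of exceeding the psi-th quantile

# Per-gene tail probability: exact solution of conf = (1 - alpha)^G
alpha.exact  <- 1 - conf^(1 / G)
# Bonferroni-like approximation: conf ~= 1 - alpha * G
alpha.approx <- (1 - conf) / G

# qbinom returns the smallest x with P(X <= x) >= 1 - alpha,
# which is the same as P(X > x) <= alpha
x.exact  <- qbinom(1 - alpha.exact,  N, p)
x.approx <- qbinom(1 - alpha.approx, N, p)

c(alpha.exact = alpha.exact, alpha.approx = alpha.approx)
c(x.exact = x.exact, x.approx = x.approx)
```

The approximate per-gene level is never larger than the exact one, so a cutoff derived from it is at least as conservative.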

Now we set \(M\) to be the significance cutoff using the procedure detailed above. A gene with sensitivity \(\phi\) gets detected if the observed number of cases above the threshold is greater than or equal to \(M\). The tailRankPower function implements formula (1.3) of our paper on the tail-rank test.
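Under the plain binomial model, the detection rule just described has a simple closed form. The sketch below is an illustration under stated assumptions, not the implementation of tailRankPower and not a transcription of formula (1.3): it takes the cutoff \(M\) as given and treats the number of cancer samples above the threshold as \(\mathrm{Binom}(N, \phi)\). The helper name binomDetectPower and the value M = 5 are hypothetical.

```r
# Hypothetical helper: probability that at least M of N cancer samples
# exceed the threshold when each does so with probability phi
binomDetectPower <- function(M, N, phi) {
  # P(Y >= M) for Y ~ Binom(N, phi), written as P(Y > M - 1)
  pbinom(M - 1, N, phi, lower.tail = FALSE)
}

phi <- seq(0.1, 0.7, by = 0.1)
# M = 5 is an arbitrary illustrative cutoff, not one computed by tailRankCutoff
pw <- binomDetectPower(M = 5, N = 50, phi = phi)
pw
```

For a fixed cutoff, the computed power does not decrease as the sensitivity \(\phi\) grows, which matches the intuition behind the power tables in the Examples.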

See Also

TailRankTest, tailRankPower, biomarkerPowerTable, matrixMean, toleranceBound

Examples

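library(TailRank)

# Cutoffs for a fixed specificity over a grid of designs
psi.0 <- 0.99                         # desired specificity
confide <- rev(c(0.8, 0.95, 0.99))    # confidence levels
nh <- 20                              # number of healthy samples (N1)
ng <- c(100, 1000, 10000, 100000)     # numbers of genes (G)
ns <- c(10, 20, 50, 100, 250, 500)    # numbers of cancer samples (N2)
# Array indexed by (cancer samples, genes, confidence level)
formal.cut <- array(0, c(length(ns), length(ng), length(confide)))
for (i in seq_along(ng)) {
  for (j in seq_along(ns)) {
    formal.cut[j, i, ] <- tailRankCutoff(ng[i], nh, ns[j], psi.0, confide)
  }
}
dimnames(formal.cut) <- list(ns, ng, confide)
formal.cut

# Power for 10000 genes at specificity 0.95 and confidence 0.9
phi <- seq(0.1, 0.7, by=0.1)          # sensitivities to detect
N <- c(10, 20, 50, 100, 250, 500)     # numbers of cancer samples (N2)
pows <- matrix(0, ncol=length(phi), nrow=length(N))
for (ph in seq_along(phi)) {
  pows[, ph] <- tailRankPower(10000, nh, N, 0.95, phi[ph], 0.9)
}
pows <- data.frame(pows)
# Rows: sample sizes; columns: sensitivity in percent
dimnames(pows) <- list(as.character(N), as.character(round(100*phi)))
pows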
