sgof (version 2.1.1)

Extended.SGoF: Extended SGoF multi-testing procedure

Description

Performs the Extended SGoF method (Castro-Conde, Döhler et al., 2014) for multiple hypothesis testing.

Usage

Extended.SGoF(u, pCDFlist = NA, K = NA, alpha = 0.05, gamma = 0.05, method = NA,
              Discrete = TRUE, Sides = 1, ...)

Arguments

u
A (non-empty) numeric vector of p-values.
pCDFlist
A (non-empty) list with the empirical cumulative distribution function of each discrete p-value.
K
Numeric value. The number of continuous tests.
alpha
Numeric value. The significance level of the metatest.
gamma
Numeric value. The p-value threshold: Binomial SGoF looks for significance in the number of p-values below gamma.
method
Method used in the computation of the Poisson binomial quantile. "DFT-CF" for the exact method and "RNA" for the refined normal approximation.
Discrete
Logical. Default is TRUE. Indicates whether the tests are discrete (TRUE) or continuous (FALSE); used when estimating the FDR.
Sides
Numeric value indicating whether the tests are one-sided (Sides=1, the default) or two-sided (Sides=2); used when estimating the FDR.
...
Other parameters passed through to the robust.fdr function.
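
To make the roles of alpha and gamma concrete, the following base-R sketch runs the kind of metatest the gamma entry describes: it counts the p-values not exceeding gamma and asks whether that count is significantly large under a Binomial(n, gamma) null at level alpha. The simulated p-values, the use of binom.test, and all variable names are illustrative assumptions; this is not the internal code of Extended.SGoF or Binomial.SGoF.

```r
set.seed(1)

# Hypothetical data: 900 null p-values (uniform) plus 100 p-values
# concentrated near zero, standing in for true effects.
u <- c(runif(900), rbeta(100, 0.1, 5))

alpha <- 0.05  # significance level of the metatest
gamma <- 0.05  # p-value threshold
n <- length(u)

# Number of p-values at or below the threshold gamma
below <- sum(u <= gamma)

# Under the complete null, `below` would follow a Binomial(n, gamma) law.
# A one-sided exact binomial test checks for an excess of small p-values.
metatest <- binom.test(below, n, p = gamma, alternative = "greater")

# TRUE when the excess of small p-values is significant at level alpha
metatest$p.value < alpha
```

Lowering alpha makes the metatest stricter, while gamma only changes which p-values count as small.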

Value

A list containing the following values:

  • Rejections: The number of effects declared by Extended SGoF.
  • FDR: The estimated false discovery rate.
  • pvalues: The original p-values.
  • alpha: The specified significance level for the metatest.
  • gamma: The specified p-value threshold.
  • K: The specified number of continuous tests.
  • Method: The specified method used in the computation of the Poisson binomial quantile.
  • Discrete: The specified type of tests.
  • Sides: Numeric value indicating if the tests are one-sided (default), Sides=1, or two-sided, Sides=2.
  • call: The matched call.

Details

Extended SGoF is an extension of Binomial SGoF based on the generalized (Poisson) binomial distribution (Hong, 2013a), which takes into account the discreteness of the p-values. If all the tests are continuous, Extended SGoF reduces to the Binomial SGoF method. In particular, if the p-values are continuous, the number of rejections given by Extended.SGoF will be the number of effects declared by Binomial SGoF plus one (except in case of ties). This is because the two methods use slightly different definitions of the quantile function: Extended SGoF uses the usual definition $F^{-1}(1-\alpha)$ with $F(x)=P(X \le x)$, whereas in Carvajal-Rodríguez et al. (2009) the quantile of the binomial distribution was defined considering $F(x)=P(X<x)$.

The poibin package (Hong, 2013b) is used to compute the quantile of the Poisson binomial distribution. By default, the exact method ("DFT-CF") is used when the number of tests is smaller than 2000 and the "RNA" approximation otherwise (see Hong, 2013a, for more information). However, the user can specify which of the two methods to use.

Extended SGoF works in the same way as Binomial SGoF, but it uses the quantiles of the generalized binomial distribution instead of the binomial quantiles. Extended SGoF maintains the theoretical properties of Binomial SGoF, e.g. weak control of FDR (FWER) and increasing power when the number of tests increases (de Uña Álvarez, 2011). The FDR is estimated using the method proposed by Pounds and Cheng (2006), via the robust.fdr function in the prot2D package (Artigaud S., 2013).
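
The two points made in the Details (the reduction to the binomial case and the off-by-one effect of the quantile definition) can be sketched in base R. The helper names poisbinom_pmf, quantile_usual and quantile_strict are hypothetical, the convolution is a simple stand-in for the methods provided by poibin, and the vector of unequal probabilities is an assumed example of per-test probabilities of a p-value falling at or below gamma; none of this is the package's own code.

```r
# Poisson binomial pmf by direct convolution (adequate for small n).
# Returns a vector whose entry k + 1 holds P(X = k), k = 0, ..., length(probs).
poisbinom_pmf <- function(probs) {
  pmf <- 1  # distribution of an empty sum: P(X = 0) = 1
  for (p in probs) {
    pmf <- c(pmf * (1 - p), 0) + c(0, pmf * p)
  }
  pmf
}

# Usual quantile: smallest x with P(X <= x) >= level
quantile_usual <- function(pmf, level) {
  which(cumsum(pmf) >= level)[1] - 1
}

# Alternative definition: smallest x with P(X < x) >= level
quantile_strict <- function(pmf, level) {
  cdf_strict <- c(0, cumsum(pmf))  # entry x + 1 holds P(X < x)
  which(cdf_strict >= level)[1] - 1
}

n <- 50
alpha <- 0.05
gamma <- 0.05

# Continuous tests: every success probability equals gamma, so the
# Poisson binomial distribution coincides with Binomial(n, gamma).
pmf_equal <- poisbinom_pmf(rep(gamma, n))
all.equal(pmf_equal, dbinom(0:n, n, gamma))

# The two quantile definitions differ by exactly one.
q_usual  <- quantile_usual(pmf_equal, 1 - alpha)
q_strict <- quantile_strict(pmf_equal, 1 - alpha)
c(usual = q_usual, strict = q_strict)

# Discrete tests (assumed example): success probabilities at most gamma,
# which can only lower the quantile relative to the binomial case.
probs_discrete <- seq(0.01, gamma, length.out = n)
q_discrete <- quantile_usual(poisbinom_pmf(probs_discrete), 1 - alpha)
q_discrete
```

A smaller quantile leaves more of the observed small p-values in excess of what the null explains, which is how accounting for discreteness can increase the number of rejections.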

References

Artigaud S. (2013). prot2D: Statistical Tools for volume data from 2D Gel Electrophoresis. R package version 1.0.0.

Carvajal Rodríguez A, de Uña Álvarez J and Rolán Álvarez E (2009). A new multitest correction (SGoF) that increases its statistical power when increasing the number of tests. BMC Bioinformatics 10:209.

Castro-Conde I, Döhler S and de Uña Álvarez J (2014). An extended SGoF multitesting method for discrete data. Paper in progress.

de Uña Álvarez J (2011). On the statistical properties of SGoF multitesting method. Statistical Applications in Genetics and Molecular Biology, Vol. 10, Iss. 1, Article 18.

Heller R, Gur H and Yaacoby S (2012). discreteMTP: Multiple testing procedures for discrete test statistics. R package version 0.1-2.

Hong Y (2013a). On computing the distribution functions for the Poisson binomial distribution. Computational Statistics and Data Analysis 59, 41-51.

Hong Y (2013b). poibin: The Poisson binomial distribution. R package version 1.2.

Pounds S and Cheng C (2006). Robust estimation of the false discovery rate. Bioinformatics 22 (16), 1979-1987.

See Also

Binomial.SGoF

Examples

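library(sgof)           # provides Extended.SGoF and the Hedenfalk data
# library(discreteMTP)  # provides the amnesia data set used below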

data(amnesia) #discrete p-values

A11 <- amnesia$AmnesiaCases 
A21 <- sum(amnesia$AllAdverseCases) - A11
A12 <- amnesia$AllAdverseCases - A11
A22 <- sum(amnesia$AllAdverseCases) - sum(amnesia$AmnesiaCases) - A12

A1. <- sum(amnesia$AmnesiaCases)
A2. <- sum(amnesia$AllAdverseCases) - A1.   
n <- A11 + A12
k <- pmin(n,A1.)

pCDFlist <- list()
pvec <- numeric(nrow(amnesia))

## Calculation of the p-values and the p-values CDFs: 

for (i in seq_len(nrow(amnesia))) {
  x <- 0:k[i]
  # Attainable one-sided p-values for test i, P(X >= x), sorted increasingly
  pCDFlist[[i]] <- rev(dhyper(x, A1., A2., n[i]) + phyper(x, A1., A2., n[i], lower.tail = FALSE))
  # Observed p-value for test i: P(X >= A11[i])
  pvec[i] <- dhyper(A11[i], A1., A2., n[i]) + phyper(A11[i], A1., A2., n[i], lower.tail = FALSE)
}

res <- Extended.SGoF(u = pvec, pCDFlist = pCDFlist, alpha = 0.05, gamma = 0.05, Discrete = TRUE, Sides = 1)
res

# Continuous p-values (Hedenfalk data from the sgof package)
# library(sgof)

res2 <- Extended.SGoF(u = Hedenfalk$x, K = 3170, Discrete = FALSE, method = "DFT-CF", Sides = 2)
res2
