sgof (version 2.0.2)

BBSGoF: BBSGoF multi-testing procedure.

Description

BB-SGoF (de Uña-Álvarez, 2012; Castro-Conde and de Uña-Álvarez, 2013) is an adaptation of the SGoF method for possibly dependent tests. It is initially assumed that the provided p-values are correlated within k blocks of equal size (following the given sequence), where k is unknown.

Usage

BBSGoF(u, alpha = 0.05, gamma = 0.05, kmin = 2, kmax = min(length(u)%/%10, 100),
 tol = 10, adjusted.pvalues = FALSE, blocks = NA)

Arguments

u
A (non-empty) numeric vector of p-values.
alpha
Numeric value. The significance level of the metatest.
gamma
Numeric value. The p-value threshold, so SGoF looks for significance in the amount of p-values below gamma.
kmin
Numeric value. The smallest allowed number of blocks of correlated tests.
kmax
Numeric value. The largest allowed number of blocks of correlated tests.
tol
Numeric value. The tolerance in model fitting (see Details).
adjusted.pvalues
Logical. Default is FALSE. A variable indicating whether to compute the adjusted p-values.
blocks
Numeric value. The number of existing blocks (see Details).
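
To make the roles of gamma and alpha concrete, the sketch below counts the p-values below gamma and compares that count with what uniform p-values would give. It uses base R only. The one-sided binomial metatest is the independent-tests analogue and is shown purely as an illustration; it is not the computation performed by BBSGoF, which uses a beta-binomial reference distribution (see Details). The seed and the simulated p-values are arbitrary choices.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
u     <- runif(387)^2            # same kind of p-values as in the Examples
alpha <- 0.05                    # significance level of the metatest
gamma <- 0.05                    # p-value threshold

n        <- length(u)
observed <- sum(u < gamma)       # number of p-values below gamma
expected <- n * gamma            # expected count if all p-values were uniform

# One-sided binomial metatest: appropriate only for independent tests.
# BB-SGoF replaces this binomial reference with a beta-binomial one.
metatest <- binom.test(observed, n, p = gamma, alternative = "greater")

c(observed = observed, expected = expected)
metatest$p.value < alpha         # TRUE indicates an excess of small p-values
```

Here gamma decides which p-values are counted, while alpha decides whether the count is significantly large, which is why the two arguments play different roles even when they are set to the same value.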

Value

A list containing the following values:

  • Rejections: The number of effects declared by BB-SGoF with automatic k.
  • FDR: The estimated false discovery rate.
  • Adjusted.pvalues: The adjusted p-values.
  • effects: A vector with the number of effects declared by BB-SGoF for each value of k.
  • SGoF: The number of effects declared by Conservative SGoF.
  • automatic.blocks: The automatic number of blocks.
  • deleted.blocks: A vector with the values of k for which the model gave a poor fit.
  • n.blocks: A vector with the values of k for which the model fitted well.
  • p: The average ratio of p-values below gamma.
  • cor: A vector with the estimated within-block correlation.
  • Tarone.pvalues: A vector with the p-values of Tarone’s test for no correlation.
  • Tarone.pvalue.auto: The p-value of Tarone’s test for the automatic k.
  • beta.parameters: The estimated parameters of the Beta(a,b) model for the automatic k.
  • betabinomial.parameters: The estimated parameters of the Betabinomial(p,rho) model for the automatic k.
  • sd.betabinomial.parameters: The standard deviation of the estimated parameters of the Betabinomial(p,rho) model for the automatic k.
  • data: The original p-values.
  • adjusted.pvalues: A logical value indicating whether the adjusted p-values have been ordered.
  • blocks: The guessed value of k.
  • n: The length of u (the number of p-values).
  • alpha: The specified significance level for the metatest.
  • gamma: The specified p-value threshold.
  • kmin: The smallest allowed number of blocks of correlated tests.
  • kmax: The largest allowed number of blocks of correlated tests.
  • tol: Tolerance in model fitting (see Details).
  • call: The matched call.
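
The deleted.blocks and n.blocks components result from the tol rule described in Details: a value of k is discarded when the variance of its estimated parameters exceeds tol times the median variance over all candidate k. The base R sketch below mimics that rule on invented numbers. The variance values are hypothetical, and summarising each fit by a single variance is a simplifying assumption, so this is not the package's internal code.

```r
# Hypothetical variances of the estimated beta-binomial parameters,
# one per candidate number of blocks k (values invented for illustration).
k_values  <- 2:8
variances <- c(0.010, 0.012, 0.011, 0.500, 0.009, 0.013, 0.200)
tol       <- 10                         # default tolerance in BBSGoF

threshold <- tol * median(variances)    # 10 * 0.012 = 0.12
poor_fit  <- variances > threshold      # fits regarded as unreliable

deleted_blocks <- k_values[poor_fit]    # analogue of deleted.blocks
kept_blocks    <- k_values[!poor_fit]   # analogue of n.blocks

deleted_blocks                          # 5 8
kept_blocks                             # 2 3 4 6 7
```

A smaller tol lowers the threshold and removes more values of k; a larger tol keeps more of them.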

Details

BB-SGoF (de Uña-Álvarez, 2012; Castro-Conde and de Uña-Álvarez, 2013) is an adaptation of the SGoF method for possibly dependent tests. It is initially assumed that the provided p-values are correlated within k blocks of equal size (following the given sequence), where k is unknown. Inference on the number of existing effects follows SGoF principles, but the binomial distribution is replaced by a beta-binomial distribution in the metatest. The beta-binomial distribution is approximated by the normal distribution; therefore, some caution is needed when the number of tests is small.

It is implicitly assumed that the probability for a p-value to fall below gamma is random, following a beta distribution, Beta(a,b). As a consequence, the numbers of p-values below gamma in the blocks form a random sample from a Betabinomial(p,rho) model, where p=a/(a+b) is the mean of that random probability and rho=1/(a+b+1), its variance divided by p(1-p), is the within-block correlation between two indicators of type I(pi<gamma), I(pj<gamma). The parameters are estimated by maximum likelihood, and the asymptotic normal distribution of the estimated parameters is used to perform the inferences (so caution is needed when the number of p-values is small).

Since k is unknown, the method is fitted for each integer from k=kmin to k=kmax, and the results for each k are saved. An automatic (conservative) choice of k is also performed: the automatic k is the value of k leading to the smallest number of declared effects (effects meaning null hypotheses to be rejected). The excess of observed significant cases in the beta-binomial metatest is reported as the number of existing effects N. Finally, the effects are identified as the N smallest p-values.

The BB-SGoF procedure weakly controls the family-wise error rate (FWER) and the false discovery rate (FDR) at level alpha. That is, the probability of committing one or more type I errors across the multiple tests is bounded by alpha when all the null hypotheses are true. SGoF does not control FWER or FDR in the presence of effects. It has been reported that BB-SGoF provides a good balance between FDR and power, particularly when the number of tests is large and the effect level is weak to moderate. It is also known that the number of effects declared by BB-SGoF is a 100(1-alpha)% lower bound for the true number of existing effects with p-value below the initial threshold gamma; so, interestingly, with probability 1-alpha, the number of false discoveries of BB-SGoF does not exceed the number of false non-discoveries (de Uña-Álvarez, 2012).

As for the SGoF method, the choice alpha=gamma will typically be used for BB-SGoF; this common value will be set to one of the usual significance levels (0.001, 0.01, 0.05, 0.1). Note, however, that alpha and gamma have different roles.

When adjusted.pvalues=TRUE, adjusted p-values are calculated. These are defined in the same spirit as in the SGoF method, but a guessed value for k must be supplied in the argument blocks. Once k is supplied, the adjusted p-value of a given p-value pi is defined as the smallest alpha0 for which the null hypothesis attached to pi is rejected by BB-SGoF (based on the given k) with alpha=gamma=alpha0. In practice, the BBSGoF function provides an approximation of these adjusted p-values by restricting alpha0 to the set of original p-values.

The argument tol allows for a stronger (small tol) or weaker (large tol) criterion when removing poor fits of the beta-binomial model. When the variance of the estimated beta-binomial parameters for a given k is larger than tol times the median variance over k=kmin,...,kmax, that value of k is discarded.

The false discovery rate is estimated by the simple method proposed by Dalmasso, Broet and Moreau (2005), taking n=1 in their formula.
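
The relation between Beta(a,b) and the Betabinomial(p,rho) parameters can be checked by simulation. The base R sketch below draws one random probability per block, generates the count of p-values below gamma in each block, and compares method-of-moments estimates with p=a/(a+b) and rho=1/(a+b+1). The values of a, b, the number of blocks and the block size are arbitrary, and the moment estimators are a stand-in for illustration only: BBSGoF itself fits the model by maximum likelihood.

```r
set.seed(42)                       # arbitrary seed
a <- 2; b <- 18                    # hypothetical Beta(a, b) parameters
k <- 5000                          # number of blocks (large, to stabilise the check)
m <- 20                            # number of p-values per block

p_block <- rbeta(k, a, b)                       # random P(p-value < gamma), one per block
counts  <- rbinom(k, size = m, prob = p_block)  # p-values below gamma in each block

# Theoretical mean and within-block correlation
p_theory   <- a / (a + b)          # 0.1
rho_theory <- 1 / (a + b + 1)      # 1/21

# Method-of-moments estimates, based on
# Var(count) = m * p * (1 - p) * (1 + (m - 1) * rho)
p_hat   <- mean(counts) / m
rho_hat <- (var(counts) / (m * p_hat * (1 - p_hat)) - 1) / (m - 1)

round(c(p_theory = p_theory, p_hat = p_hat,
        rho_theory = rho_theory, rho_hat = rho_hat), 4)
```

With rho = 0 the counts would be plain binomial; a positive rho inflates their variance, which is what the beta-binomial metatest accounts for.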

References

Castro Conde I and de Uña Álvarez J (2013). Power, FDR and conservativeness of BB-SGoF method for multiple dependent tests: a simulation study. Discussion Papers in Statistics and Operation Research. Report 13/03. Statistics and OR Department. University of Vigo. http://webs.uvigo.es/depc05/reports/13_03.pdf

Dalmasso C, Broet P and Moreau T (2005). A simple procedure for estimating the false discovery rate. Bioinformatics 21:660-668.

de Uña Álvarez J (2012). The Beta-Binomial SGoF method for multiple dependent tests. Statistical Applications in Genetics and Molecular Biology, Vol. 11, Iss. 3, Article 14.

See Also

summary.BBSGoF, plot.BBSGoF, BY

Examples

library(sgof)

p<-runif(387)^2  #387 independent, non-uniform p-values (intersection null violated)

res<-BBSGoF(p)
summary(res)    #automatic number of blocks, number of rejected nulls, 
		#estimated FDR, beta and beta-binomial parameters,
		#Tarone test of no correlation 

par(mfrow=c(2,2))
plot(res)   #Tarone test, within-block correlation, beta density (for automatic k),
	    #and decision plot (number of rejected nulls)
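
As a complement, adjusted p-values can be requested through the adjusted.pvalues and blocks arguments described above. The sketch below assumes the sgof package is installed; the seed and the guess blocks = 13 are arbitrary values chosen for illustration, and the component names are those listed under Value.

```r
library(sgof)

set.seed(1)                       # arbitrary seed, for reproducibility only
p <- runif(387)^2                 # same construction as in the example above

# blocks = 13 is an arbitrary guess for k, used here only for illustration
res_adj <- BBSGoF(p, adjusted.pvalues = TRUE, blocks = 13)

res_adj$Rejections                # effects declared with the automatic k
res_adj$automatic.blocks          # automatic number of blocks
head(res_adj$Adjusted.pvalues)    # adjusted p-values (approximation, see Details)
```

Computing adjusted p-values can take noticeably longer than the default call, since the procedure is evaluated over a set of candidate levels rather than once.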
