Given two vectors of p-values, one from the primary study and one from the follow-up study, returns the r-values, i.e. the adjusted p-values for false discovery rate (FDR) control on replicability claims. Both vectors should contain only the features selected for follow-up.
radjust_pf(pv1, pv2, m, c2 = 0.5, l00 = 0,
           variant = c("none", "general_dependency", "use_threshold"),
           threshold = NULL, alpha = 0.05)
pv1: numeric vector of p-values from the primary study, corresponding element-wise to the p-values from the follow-up study (pv2).
pv2: numeric vector of p-values from the follow-up study.
m: the number of features examined in the primary study (> length(pv1)).
c2: the relative boost to the p-values from the follow-up study. The default, c2 = 0.5, is recommended: in simulations it yielded power similar to that of the procedure with the optimal value, which is unknown for real data.
l00: a lower bound on the fraction of features (out of m) with true null hypotheses in both studies. For example, for GWAS on the whole genome, a choice of 0.8 is conservative in typical applications.
variant: the variant of the procedure to use.
"none": the default.
"general_dependency": use \(m^*=m\sum_{i=1}^{m}\frac{1}{i}\) instead of m.
"use_threshold": c1 is computed given the threshold value.
Both non-default variants guarantee that the procedure that declares as replicated all features with r-values below alpha controls the FDR at level alpha, for any type of dependency of the p-values in the primary study.
threshold: the selection threshold for p-values from the primary study; must be supplied when variant "use_threshold" is selected, otherwise ignored.
alpha: the FDR level to control.
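To make the "general_dependency" correction concrete, the following base-R sketch computes the inflated feature count m* from the formula given for that variant. It does not call radjust_pf; the variable names `harmonic` and `m_star` are illustrative and not part of the package, and the value of m is borrowed from the usage example on this page.

```r
# Sketch only: the inflated feature count m* that the
# "general_dependency" variant uses in place of m.
m <- 635547                       # features examined in the primary study
harmonic <- sum(1 / seq_len(m))   # sum_{i=1}^{m} 1/i, approximately log(m) + 0.5772
m_star <- m * harmonic            # m* = m * sum_{i=1}^{m} 1/i
m_star / m                        # inflation factor relative to m
```

Because the harmonic sum grows only logarithmically in m, the inflation factor stays moderate even for genome-wide studies, although replacing m with the larger m* should be expected to make the resulting r-values more conservative.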
Value: a numeric vector of the same length as pv1 and pv2, containing the r-values.
When many hypotheses are simultaneously examined in a primary study, and a subset of them is then forwarded for follow-up in an independent study, it is of interest to know which findings are replicated across studies. As a measure of replicability of significance, we compute the r-value, i.e. the FDR-adjusted replicability p-value, for each hypothesis that was followed up. This measure differs from the FDR-adjusted p-value in a typical meta-analysis, where a p-value close to zero in one of the studies is enough to declare the finding highly significant. The FDR r-value for a feature is the smallest FDR level at which we can say that the finding is among the replicated ones.
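Since the r-value is the smallest FDR level at which a finding counts as replicated, replicability claims follow from comparing the r-values with the chosen level. A minimal sketch, assuming a made-up vector of r-values (the numbers below are purely illustrative and are not output of radjust_pf):

```r
# Hypothetical r-values for five followed-up features (illustrative only).
rv <- c(0.001, 0.20, 0.03, 0.60, 0.049)
alpha <- 0.05                      # FDR level for replicability claims

# Features whose r-value does not exceed alpha are declared replicated.
replicated <- which(rv <= alpha)
replicated
```

In practice `rv` would be the vector returned by radjust_pf, so the indices in `replicated` refer to positions in pv1 and pv2.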
Bogomolov, M. and Heller, R. (2013). Discovering findings that replicate from a primary study of high dimension to a follow-up study. Journal of the American Statistical Association, Vol. 108, No. 504, Pp. 1480-1492.
Heller, R., Bogomolov, M. and Benjamini, Y. (2014). Deciding whether follow-up studies have replicated findings in a preliminary large-scale omics study. Proceedings of the National Academy of Sciences of the United States of America, Vol. 111, No. 46, Pp. 16262-16267.
See also radjust_sym, for replicability analysis in two symmetric studies.
# NOT RUN {
data(crohn)
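# r-values with the default variant; pv2 holds the follow-up p-values
rv <- radjust_pf(pv1 = crohn$pv1, pv2 = crohn$pv2, m = 635547, l00 = 0.8)
# r-values when the primary-study selection threshold is known
rv2 <- radjust_pf(pv1 = crohn$pv1, pv2 = crohn$pv2, m = 635547, l00 = 0.8,
                  variant = "use_threshold", threshold = 1e-5)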
# }