twoby2Calibrate: Minimum Bayes factors and p-values from Fisher's exact test for 2x2 contingency tables

Description

Computes a sample-size adjusted lower bound on the Bayes factor (for the point null hypothesis against the alternative) for the given 2x2 contingency table. Also returns p-values from Fisher's exact test (different versions in the two-sided case) and less conservative alternatives such as a mid p-value (see Details for more information).

Usage

twoby2Calibrate(x, type="two.sided", alternative="normal", 
                direction=NULL, transform.bf="id")

Arguments

x

a 2x2 contingency table in matrix form

type

either "one.sided" or "two.sided". Defaults to "two.sided". Specifies if Fisher's exact test (and the corresponding p-value) is one-sided or two-sided.

alternative

either "simple" or "normal". Defaults to "normal". Specifies the alternative hypotheses for the (log) odds ratio to consider for two-sided tests. Is ignored if type="one.sided" (in this case only simple alternatives are available).

direction

either "greater", "less" or NULL. Defaults to NULL. Specifies the direction of the alternative for one-sided tests: "greater" corresponds to an odds ratio > 1 and "less" to an odds ratio < 1. Is ignored if type="two.sided".

transform.bf

either "id", "log", "log2" or "log10". Defaults to "id". Specifies how to transform the lower bound on the Bayes factor. "id" corresponds to no transformation. "log" refers to the natural logarithm, "log2" to the logarithm to the base 2 and "log10" to the logarithm to the base 10.
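
The four options can be illustrated with a small base R sketch. The helper transform_bound and the value 0.25 are hypothetical and serve only to show what each option does to a bound; they are not part of the package.

```r
# Hypothetical helper mirroring the four documented transform.bf options.
transform_bound <- function(bf, transform.bf = c("id", "log", "log2", "log10")) {
  transform.bf <- match.arg(transform.bf)
  switch(transform.bf,
         id    = bf,         # no transformation
         log   = log(bf),    # natural logarithm
         log2  = log2(bf),   # logarithm to the base 2
         log10 = log10(bf))  # logarithm to the base 10
}

bf <- 0.25  # made-up lower bound on the Bayes factor
sapply(c("id", "log", "log2", "log10"), function(tr) transform_bound(bf, tr))
```

A bound below 1 becomes negative on any of the log scales, with more negative values indicating stronger maximal evidence against the null hypothesis.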

Value

A list of the following two elements:

minBF

the lower bound on the Bayes factor

p.value

For one-sided tests, a vector of 3 one-sided p-values/significance measures: first the p-value p.fi from Fisher's exact test, second the corresponding mid p-value p.mid and third the Bayesian posterior probability p.lie (see Details for more information).

For two-sided tests, a vector of 5 two-sided p-values/significance measures: the first three, p.pb, p.ce and p.bl (see Details for the definitions), are two-sided p-values from Fisher's exact test. The fourth, p.mid, is a mid p-value, namely the mid-p modification of the second p-value p.ce. The last, p.lie, is a Bayesian significance measure (see Details for additional information).

Warning

For 2x2 tables with entries equal to 0, the minimum Bayes factor is either not defined (for alternative="normal") or the underlying numerical optimization is unstable (for alternative="simple"). A warning is displayed in such cases and minBF=NA is returned, but the different p-values/significance measures are still available.
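
A caller may want to detect such tables before relying on minBF. The following is a minimal base R sketch with made-up counts; it does not call twoby2Calibrate() and only shows the check itself, together with the fact that Fisher's exact test still yields a p-value for a table with an empty cell.

```r
# Made-up 2x2 table with one empty cell.
tab0 <- matrix(c(0, 12, 4, 9), nrow = 2, byrow = TRUE)

# TRUE if at least one of the four entries equals 0; per the warning above,
# minBF would then be NA.
has_zero_cell <- any(tab0 == 0)
has_zero_cell

# The two-sided p-value from Fisher's exact test (base R) is still defined.
fisher.test(tab0)$p.value
```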

Details

If type="two.sided", the point null hypothesis that the odds ratio is 1 is tested against specific two-sided alternatives: alternative="simple" considers all two-point distributions symmetric around 0 for the log odds ratio. alternative="normal" assumes a local normal prior distribution (a so-called g-prior) centered around 0 for the log odds ratio.

In the one-sided case (type="one.sided"), direction="less" tests the alternative that the odds ratio is less than 1 and considers simple point alternatives in that direction to compute the lower bound on the Bayes factor. direction="greater" does the same for the alternative that the odds ratio is greater than 1.

The calibration obtained with type="two.sided", alternative="normal" is based on the methodology proposed in Li & Clyde (2018) and yields an (approximate) lower bound on the Bayes factor in closed form. All the other lower bounds on the Bayes factor are computed by numerical optimization. For type="two.sided", the two calibrations are described in Ott & Held (2019).
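
To give an idea of what a lower bound over simple alternatives looks like, here is a rough base R sketch. It assumes that the Bayes factor is built from the conditional (noncentral hypergeometric) likelihood of the observed table given its margins, and it minimizes that Bayes factor over the log odds ratio with optimize(). This is an illustration of the idea only: the likelihood, the search intervals and the helper cond_lik are assumptions, and the package's actual computation may differ.

```r
# Example table from this page; x_obs is the top-left entry.
tab <- matrix(c(1, 15, 5, 10), nrow = 2, byrow = TRUE)
x_obs <- tab[1, 1]
m <- sum(tab[1, ])   # first row total
n <- sum(tab[2, ])   # second row total
k <- sum(tab[, 1])   # first column total
support <- max(0, k - n):min(k, m)      # possible values of the top-left entry
null_prob <- dhyper(support, m, n, k)   # their probabilities under odds ratio 1

# Conditional probability of the observed table at log odds ratio theta
# (noncentral hypergeometric distribution; theta = 0 is the null hypothesis).
cond_lik <- function(theta) {
  w <- null_prob * exp(theta * support)
  w[support == x_obs] / sum(w)
}

# One-sided bound: simple point alternatives with odds ratio < 1 (theta < 0).
# The interval c(-10, 0) is an arbitrary search range chosen for this sketch.
bf_less <- optimize(function(th) cond_lik(0) / cond_lik(th),
                    interval = c(-10, 0))$objective

# Two-sided bound: two-point priors with mass 1/2 at -theta and at +theta.
bf_simple <- optimize(function(th) {
  cond_lik(0) / (0.5 * cond_lik(th) + 0.5 * cond_lik(-th))
}, interval = c(0, 10))$objective

c(one.sided.less = bf_less, two.sided.simple = bf_simple)
```

In this sketch the two-sided bound cannot fall below the one-sided bound in the direction favoured by the data, because the symmetric two-point prior places only half of its mass on that side.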

For one-sided alternatives, the following p-value and 2 related quantities are computed:

p.fi is the one-sided p-value from Fisher's exact test.
p.mid is a "mid" p-value. It is obtained by subtracting half of the probability mass of the observed table from p.fi.
p.lie is a Bayesian posterior probability. If direction="greater", it is the posterior probability that the odds ratio exceeds 1 given the observed table under the assumption of uniform priors on the success probabilities for the two groups. If direction="less", it is the posterior probability that the odds ratio does not exceed 1 given the observed table under the same priors.
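
The three one-sided quantities can be sketched in base R as follows. The sketch makes assumptions that this page does not state: the tail probabilities refer to the top-left entry, as in fisher.test(); the rows are taken to be the two groups and the first column to count successes; and the odds ratio is the odds of row 1 divided by the odds of row 2. Which tail or posterior probability twoby2Calibrate() reports for a given direction should be taken from the definitions above, not from this sketch.

```r
tab <- matrix(c(1, 15, 5, 10), nrow = 2, byrow = TRUE)
n11 <- tab[1, 1]; n12 <- tab[1, 2]
n21 <- tab[2, 1]; n22 <- tab[2, 2]
m <- n11 + n12   # first row total
n <- n21 + n22   # second row total
k <- n11 + n21   # first column total

# One-sided p-values from Fisher's exact test for the top-left entry X.
p_fi_less    <- phyper(n11, m, n, k)                          # P(X <= n11)
p_fi_greater <- phyper(n11 - 1, m, n, k, lower.tail = FALSE)  # P(X >= n11)

# Mid p-values: subtract half of the probability of the observed table.
p_obs <- dhyper(n11, m, n, k)
p_mid_less    <- p_fi_less    - 0.5 * p_obs
p_mid_greater <- p_fi_greater - 0.5 * p_obs

# Posterior probabilities under independent uniform priors (assumed layout):
# p1 ~ Beta(n11 + 1, n12 + 1) for row 1, p2 ~ Beta(n21 + 1, n22 + 1) for row 2.
# The odds ratio is <= 1 exactly when p1 <= p2.
post_or_le1 <- integrate(function(t) {
  pbeta(t, n11 + 1, n12 + 1) * dbeta(t, n21 + 1, n22 + 1)
}, lower = 0, upper = 1)$value
post_or_gt1 <- 1 - post_or_le1

c(p.fi.less = p_fi_less, p.mid.less = p_mid_less, post.or.le1 = post_or_le1)
```

The two one-sided mid p-values always sum to 1, which is a convenient sanity check.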

For two-sided alternatives, the following 3 p-values and 2 related quantities are computed:

p.pb is the "probability-based" p-value (the classical choice), defined as the sum of the probabilities of all tables which are at most as likely as the observed table and have the same marginals.
p.ce is the "central" p-value, which is twice the minimum one-sided p-value (from Fisher's exact test), bounded by 1.
p.bl is "Blaker's" p-value, which is the minimum one-sided p-value (from Fisher's exact test) plus the largest tail probability from the other tail of the distribution that does not exceed that minimum.
p.mid is a "mid" p-value. It is the mid-p modification of the central p-value, i.e. it equals twice the minimum one-sided mid p-value.
p.lie is a two-sided version of the posterior probability for the one-sided test. Let p.lie.os be the one-sided posterior probability that the odds ratio does not exceed 1 given the observed table, as returned by the one-sided test with direction="less". Then p.lie = 2 min{p.lie.os, 1-p.lie.os}.
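
The five definitions above can be traced step by step in base R. The package itself relies on exact2x2 for the three Fisher p-values, so the code below is only an illustrative sketch. It assumes that the hypergeometric distribution refers to the top-left entry, uses a small relative tolerance for floating-point ties, and for p.lie assumes that the rows are the two groups with uniform priors on their success probabilities.

```r
tab <- matrix(c(1, 15, 5, 10), nrow = 2, byrow = TRUE)
x_obs <- tab[1, 1]
m <- sum(tab[1, ]); n <- sum(tab[2, ]); k <- sum(tab[, 1])
support <- max(0, k - n):min(k, m)
prob  <- dhyper(support, m, n, k)   # null probabilities of all tables
p_obs <- dhyper(x_obs, m, n, k)     # null probability of the observed table
tol <- 1 + 1e-7                     # relative tolerance for ties (assumption)

lower <- phyper(support, m, n, k)                          # P(X <= y)
upper <- phyper(support - 1, m, n, k, lower.tail = FALSE)  # P(X >= y)
p_less    <- lower[support == x_obs]
p_greater <- upper[support == x_obs]
p_min     <- min(p_less, p_greater)

# Probability-based: all tables at most as likely as the observed one.
p_pb <- sum(prob[prob <= p_obs * tol])

# Central: twice the smaller one-sided p-value, bounded by 1.
p_ce <- min(1, 2 * p_min)

# Blaker: add the largest opposite-tail probability not exceeding p_min.
opposite <- if (p_less <= p_greater) upper else lower
p_bl <- min(1, p_min + max(c(0, opposite[opposite <= p_min * tol])))

# Mid-p: twice the smaller one-sided mid p-value.
p_mid <- 2 * (p_min - 0.5 * p_obs)

# p.lie: q is the posterior probability that p1 <= p2 (odds ratio <= 1)
# with p1 ~ Beta(n11 + 1, n12 + 1) and p2 ~ Beta(n21 + 1, n22 + 1).
q <- integrate(function(t) {
  pbeta(t, tab[1, 1] + 1, tab[1, 2] + 1) * dbeta(t, tab[2, 1] + 1, tab[2, 2] + 1)
}, lower = 0, upper = 1)$value
p_lie <- 2 * min(q, 1 - q)

c(p.pb = p_pb, p.ce = p_ce, p.bl = p_bl, p.mid = p_mid, p.lie = p_lie)
```

Because p.lie uses min{q, 1-q}, swapping the two columns (which replaces q by 1-q) leaves it unchanged.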

For one-sided alternatives, the posterior probability p.lie was already studied in Liebermeister (1877) and its frequentist properties are investigated in Seneta & Phipps (2001).

For two-sided alternatives, the 3 p-values from Fisher's exact test are defined in equations (2.24)-(2.26) in Kateri (2014) and computed using the function exact2x2() in the package exact2x2. The "mid" p-value is described in Rothman & Greenland (1998, pp. 222-223). The Bayesian significance measure p.lie is proposed in Ott & Held (2019) as a modification of the corresponding one-sided significance measure.

References

Li, Y. and Clyde, M. A. (2018). Mixtures of g-priors in generalized linear models. Journal of the American Statistical Association, 113(524), 1828--1845. https://doi.org/10.1080/01621459.2018.1469992

Liebermeister, C. (1877). Ueber Wahrscheinlichkeitsrechnung in Anwendung auf therapeutische Statistik. Sammlung klinischer Vortraege. Innere Medizin, 110(31--64), 935--962.

Kateri, M. (2014). Contingency Table Analysis - Methods and Implementation using R. Statistics for Industry and Technology. Birkhaeuser.

Ott, M. and Held, L. (2019). Bayesian calibration of p-values from Fisher's exact test. International Statistical Review, 87(2), 285--305. https://doi.org/10.1111/insr.12307

Rothman, K. J. and Greenland, S. (1998). Modern Epidemiology. 2nd ed. Lippincott-Raven.

Seneta, E. and Phipps, M. C. (2001). On the comparison of two observed frequencies. Biometrical Journal, 43(1), 23--43.

Examples

# NOT RUN {
tab <- matrix(c(1,15,5,10), nrow=2, byrow=TRUE)
# different minimum Bayes factors
twoby2Calibrate(x=tab, type="one.sided", direction="greater")$minBF
twoby2Calibrate(x=tab, type="one.sided", direction="less")$minBF
twoby2Calibrate(x=tab, type="two.sided", alternative="simple")$minBF
twoby2Calibrate(x=tab)$minBF
# one-sided p-values
twoby2Calibrate(x=tab, type="one.sided", direction="greater")$p.value
twoby2Calibrate(x=tab, type="one.sided", direction="less")$p.value
# two-sided p-values
twoby2Calibrate(x=tab)$p.value
# }
