
Performs Fisher's exact test for testing the null of independence of rows and columns in a contingency table with fixed marginals.
fisher.test(x, y = NULL, workspace = 200000, hybrid = FALSE,
            hybridPars = c(expect = 5, percent = 80, Emin = 1),
            control = list(), or = 1, alternative = "two.sided",
            conf.int = TRUE, conf.level = 0.95,
            simulate.p.value = FALSE, B = 2000)
x: either a two-dimensional contingency table in matrix form, or a factor object.
y: a factor object; ignored if x is a matrix.
workspace: an integer specifying the size of the workspace used in the network algorithm, in units of 4 bytes. Only used for non-simulated p-values for tables larger than 2 by 2. When the exact computation would take very long, simulate.p.value = TRUE may be more reasonable.
hybrid: a logical. Only used for tables larger than 2 by 2, in which case it indicates whether the exact probabilities (the default) or a hybrid approximation should be computed; see ‘Details’.
hybridPars: a numeric vector of length 3, by default describing “Cochran's conditions” for the validity of the chi-squared approximation; see ‘Details’.
control: a list with named components for low level algorithm control. At present the only one used is "mult", a positive integer; see the file fexact.c in the sources of this package.
or: the hypothesized odds ratio. Only used in the 2 by 2 case.
alternative: indicates the alternative hypothesis and must be one of "two.sided", "greater" or "less". You can specify just the initial letter. Only used in the 2 by 2 case.
conf.int: logical indicating if a confidence interval for the odds ratio in a 2 by 2 table should be computed (and returned).
conf.level: confidence level for the returned confidence interval. Only used in the 2 by 2 case and if conf.int = TRUE.
simulate.p.value: a logical indicating whether to compute p-values by Monte Carlo simulation, in tables larger than 2 by 2.
B: an integer specifying the number of replicates used in the Monte Carlo test.
A list with class "htest" containing the following components:
p.value: the p-value of the test.
conf.int: a confidence interval for the odds ratio. Only present in the 2 by 2 case and if argument conf.int = TRUE.
estimate: an estimate of the odds ratio. Note that the conditional Maximum Likelihood Estimate (MLE) rather than the unconditional MLE (the sample odds ratio) is used. Only present in the 2 by 2 case.
null.value: the odds ratio under the null, or. Only present in the 2 by 2 case.
alternative: a character string describing the alternative hypothesis.
method: the character string "Fisher's Exact Test for Count Data".
data.name: a character string giving the names of the data.
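As a rough illustration of the difference between the conditional MLE and the sample odds ratio, the following sketch compares the two on the twin convictions data used in the examples. The variable names sample_or and cond_mle are chosen here for illustration only.

```r
## Twin convictions data (Fisher 1962, 1970), as in the examples
Convictions <- matrix(c(2, 10, 15, 3), nrow = 2,
                      dimnames = list(c("Dizygotic", "Monozygotic"),
                                      c("Convicted", "Not convicted")))

## Unconditional MLE: the sample odds ratio (a * d) / (b * c)
sample_or <- (Convictions[1, 1] * Convictions[2, 2]) /
             (Convictions[1, 2] * Convictions[2, 1])

## Conditional MLE, as returned in the 'estimate' component
cond_mle <- unname(fisher.test(Convictions)$estimate)

## The two estimates are close but not identical
c(sample = sample_or, conditional = cond_mle)
```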
If x is a matrix, it is taken as a two-dimensional contingency table, and hence its entries should be nonnegative integers. Otherwise, both x and y must be vectors of the same length. Incomplete cases are removed, the vectors are coerced into factor objects, and the contingency table is computed from these.
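The two input forms can be compared with a small sketch. The vectors guess and truth below are hypothetical per-cup records constructed to reproduce the tea tasting table from the examples; they are not part of the original data description.

```r
## Hypothetical raw data: one observation per cup of tea
guess <- c("Milk", "Milk", "Milk", "Tea", "Milk", "Tea", "Tea", "Tea")
truth <- c("Milk", "Milk", "Milk", "Milk", "Tea", "Tea", "Tea", "Tea")

## Two vectors of equal length: coerced to factors and cross-tabulated
p_vectors <- fisher.test(guess, truth)$p.value

## The same data passed as a contingency table
p_table <- fisher.test(table(guess, truth))$p.value

## Both calls should give the same p-value
all.equal(p_vectors, p_table)
```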
For one-sided tests in the 2 by 2 case, the alternative is based on the odds ratio, so alternative = "greater" is a test of the odds ratio being bigger than or.
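A minimal sketch of the one-sided p-value for a 2 by 2 table, assuming the default or = 1, so that the first cell follows an ordinary (central) hypergeometric distribution given the marginals. The names x11, m, n and k are illustrative.

```r
## Tea tasting data, as in the examples
TeaTasting <- matrix(c(3, 1, 1, 3), nrow = 2,
                     dimnames = list(Guess = c("Milk", "Tea"),
                                     Truth = c("Milk", "Tea")))

x11 <- TeaTasting[1, 1]       # observed count in the first cell
m   <- sum(TeaTasting[, 1])   # first column total
n   <- sum(TeaTasting[, 2])   # second column total
k   <- sum(TeaTasting[1, ])   # first row total

## P(X >= x11) under the null of odds ratio 1
p_greater <- phyper(x11 - 1, m, n, k, lower.tail = FALSE)

## Should agree with the one-sided test
all.equal(p_greater,
          fisher.test(TeaTasting, alternative = "greater")$p.value)
```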
Two-sided tests are based on the probabilities of the tables, and take as ‘more extreme’ all tables with probabilities less than or equal to that of the observed table, the p-value being the sum of such probabilities.
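The two-sided rule can be sketched for a 2 by 2 table with or = 1 by enumerating every table compatible with the marginals. The small relative tolerance tol is an assumption added here to guard against floating-point ties; it is not taken from the text above.

```r
## Tea tasting data, as in the examples
TeaTasting <- matrix(c(3, 1, 1, 3), nrow = 2)

x11 <- TeaTasting[1, 1]
m   <- sum(TeaTasting[, 1])   # first column total
n   <- sum(TeaTasting[, 2])   # second column total
k   <- sum(TeaTasting[1, ])   # first row total

## All values the first cell can take given the fixed marginals
support <- max(0, k - n):min(k, m)
probs   <- dhyper(support, m, n, k)   # probability of each table
p_obs   <- dhyper(x11, m, n, k)       # probability of the observed table

## Sum the probabilities of tables no more likely than the observed one
tol <- 1e-7                           # assumed tolerance for ties
p_two_sided <- sum(probs[probs <= p_obs * (1 + tol)])

all.equal(p_two_sided, fisher.test(TeaTasting)$p.value)
```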
For tables larger than 2 by 2 and hybrid = TRUE, asymptotic chi-squared probabilities are only used if the ‘Cochran conditions’ (or a modified version thereof) specified by hybridPars = c(expect = 5, percent = 80, Emin = 1) are satisfied, that is, if no cell has expected counts less than 1 (= Emin) and more than 80% (= percent) of the cells have expected counts at least 5 (= expect); otherwise the exact calculation is used. A corresponding if() decision is made for all sub-tables considered.
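The conditions can be checked by hand for a whole table with a sketch like the following. The helper cochran_ok is hypothetical and only looks at the full table, whereas the hybrid algorithm makes the decision for each sub-table, so this is an approximation of the idea rather than the internal code.

```r
## Hypothetical helper: check the Cochran conditions on the full table
cochran_ok <- function(tab,
                       hybridPars = c(expect = 5, percent = 80, Emin = 1)) {
  ## Expected counts under independence
  expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
  ## No expected count below Emin ...
  all(expected >= hybridPars[["Emin"]]) &&
    ## ... and more than 'percent' % of cells with expected count >= expect
    100 * mean(expected >= hybridPars[["expect"]]) > hybridPars[["percent"]]
}

## Job satisfaction data, as in the examples
Job <- matrix(c(1,2,1,0, 3,3,6,1, 10,10,14,9, 6,7,12,11), 4, 4)
cochran_ok(Job)   # FALSE: the smallest expected count is 20 * 4 / 96 < 1
```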
Accidentally, R used 180 instead of 80 as percent, i.e., hybridPars[2], in R versions between 3.0.0 and 3.4.1 (inclusive); all of the hybridPars values were hard-coded prior to R 3.5.0. Consequently, in these versions of R, hybrid = TRUE never made a difference.
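For tables larger than 2 by 2, the exact computation can need more memory than the default workspace provides. Apart from increasing workspace sufficiently, which may then lead to very long running times, using simulate.p.value = TRUE is often sufficient and hence advisable.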
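One way to act on this advice is sketched below, assuming that the exact computation signals an R error when the workspace is too small. The wrapper fisher_with_fallback is a hypothetical helper, not part of the package.

```r
## Hypothetical wrapper: try the exact test, fall back to simulation
fisher_with_fallback <- function(tab, workspace = 200000, B = 2000) {
  tryCatch(
    fisher.test(tab, workspace = workspace),
    error = function(e) {
      ## Exact computation failed; use a Monte Carlo p-value instead
      fisher.test(tab, simulate.p.value = TRUE, B = B)
    }
  )
}

## Job satisfaction data, as in the examples (the exact test succeeds here)
Job <- matrix(c(1,2,1,0, 3,3,6,1, 10,10,14,9, 6,7,12,11), 4, 4)
res <- fisher_with_fallback(Job)
res$p.value
```

Whether to raise workspace first or go straight to simulation depends on how long a run is acceptable; the sketch simply tries the exact test once.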
Simulation is done conditional on the row and column marginals, and works only if the marginals are strictly positive. (A C translation of the algorithm of Patefield (1981) is used.)
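The idea can be sketched with r2dtable(), which generates random two-way tables with given marginals using Patefield's algorithm. The statistic -sum(lfactorial(tab)) is used here because, for fixed marginals, it orders tables by their probability; the tolerance and the variable names are assumptions of this sketch and it is not claimed to reproduce the internal code exactly.

```r
set.seed(1)

## Job satisfaction data, as in the examples
Job <- matrix(c(1,2,1,0, 3,3,6,1, 10,10,14,9, 6,7,12,11), 4, 4)
B <- 2000

## For fixed marginals, table probability increases with this statistic
obs_stat <- -sum(lfactorial(Job))

## Random tables with the same row and column totals
sims <- r2dtable(B, rowSums(Job), colSums(Job))
sim_stats <- vapply(sims, function(tab) -sum(lfactorial(tab)), numeric(1))

## Proportion of tables at most as probable as the observed one
p_sim <- (1 + sum(sim_stats <= obs_stat + 1e-7)) / (B + 1)
p_sim
```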
Agresti, A. (1990). Categorical data analysis. New York: Wiley. Pages 59--66.
Agresti, A. (2002). Categorical data analysis. Second edition. New York: Wiley. Pages 91--101.
Fisher, R. A. (1935). The logic of inductive inference. Journal of the Royal Statistical Society Series A, 98, 39--54. 10.2307/2342435.
Fisher, R. A. (1962). Confidence limits for a cross-product ratio. Australian Journal of Statistics, 4, 41. 10.1111/j.1467-842X.1962.tb00285.x.
Fisher, R. A. (1970). Statistical Methods for Research Workers. Oliver & Boyd.
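Mehta, C. R. and Patel, N. R. (1983). A network algorithm for performing Fisher's exact test in r x c contingency tables. Journal of the American Statistical Association, 78, 427--434.
Mehta, C. R. and Patel, N. R. (1986). Algorithm 643: FEXACT, a FORTRAN subroutine for Fisher's exact test on unordered r x c contingency tables. ACM Transactions on Mathematical Software, 12, 154--161.
Clarkson, D. B., Fan, Y. and Joe, H. (1993). A Remark on Algorithm 643: FEXACT: An Algorithm for Performing Fisher's Exact Test in r x c Contingency Tables. ACM Transactions on Mathematical Software, 19, 484--488.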
Patefield, W. M. (1981). Algorithm AS 159: An efficient method of generating r x c tables with given row and column totals. Applied Statistics, 30, 91--97. 10.2307/2346669.
fisher.exact in package exact2x2 for alternative interpretations of two-sided tests and confidence intervals for 2 by 2 tables.
# NOT RUN {
## Agresti (1990, p. 61f; 2002, p. 91) Fisher's Tea Drinker
## A British woman claimed to be able to distinguish whether milk or
## tea was added to the cup first. To test, she was given 8 cups of
## tea, in four of which milk was added first. The null hypothesis
## is that there is no association between the true order of pouring
## and the woman's guess, the alternative that there is a positive
## association (that the odds ratio is greater than 1).
TeaTasting <-
    matrix(c(3, 1, 1, 3),
           nrow = 2,
           dimnames = list(Guess = c("Milk", "Tea"),
                           Truth = c("Milk", "Tea")))
fisher.test(TeaTasting, alternative = "greater")
## => p = 0.2429, association could not be established
## Fisher (1962, 1970), Criminal convictions of like-sex twins
Convictions <- matrix(c(2, 10, 15, 3), nrow = 2,
                      dimnames =
                          list(c("Dizygotic", "Monozygotic"),
                               c("Convicted", "Not convicted")))
Convictions
fisher.test(Convictions, alternative = "less")
fisher.test(Convictions, conf.int = FALSE)
fisher.test(Convictions, conf.level = 0.95)$conf.int
fisher.test(Convictions, conf.level = 0.99)$conf.int
## An r x c table, Agresti (2002, p. 57), Job Satisfaction
Job <- matrix(c(1,2,1,0, 3,3,6,1, 10,10,14,9, 6,7,12,11), 4, 4,
              dimnames = list(income = c("< 15k", "15-25k", "25-40k", "> 40k"),
                              satisfaction = c("VeryD", "LittleD", "ModerateS", "VeryS")))
fisher.test(Job) # 0.7827
fisher.test(Job, simulate.p.value = TRUE, B = 1e5) # also close to 0.78
## 6th example in Mehta & Patel's JASA paper
MP6 <- rbind(
    c(1,2,2,1,1,0,1),
    c(2,0,0,2,3,0,0),
    c(0,1,1,1,2,7,3),
    c(1,1,2,0,0,0,1),
    c(0,1,1,1,1,0,0))
fisher.test(MP6)
# Exactly the same p-value, as Cochran's conditions are never met:
fisher.test(MP6, hybrid=TRUE)
# }