scidb (version 1.2-0)

scidb_fisher.test: scidb_fisher.test

Description

Performs Fisher's exact test of the null hypothesis that rows and columns are independent in a contingency table with fixed marginals.

Usage

scidb_fisher.test(a,x="x",m="m",n="n",k="k",alternative="two.sided", `eval`=FALSE)

Arguments

a
a scidb array or scidbdf data frame object.
x
The x value attribute name (see details below).
m
The m marginal value attribute name (see details below).
n
The n marginal value attribute name (see details below).
k
The k marginal value attribute name (see details below).
alternative
indicates the alternative hypothesis and must be one of '"two.sided"', '"greater"' or '"less"'.
eval
(Optional) If TRUE, execute the query and store the result array. Otherwise defer evaluation.

Value

A new SciDB array with two new attributes is returned (note that the returned attribute names may be adjusted to account for naming conflicts with existing array attributes):
pvalue
the p-value of the test.
estimate
an estimate of the odds ratio. Note that the conditional Maximum Likelihood Estimate (MLE) rather than the unconditional MLE (the sample odds ratio) is used.
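
The difference between the two estimators can be illustrated in base R, without SciDB. The sketch below is not the package's implementation. It assumes the conditional MLE is the odds ratio at which the mean of the non-central hypergeometric distribution equals the observed x, which is the definition used by stats::fisher.test. The values x = 2, m = 12, n = 18, k = 17 come from the Examples section; the helper name mean_nhyper and the search interval are illustrative choices.

```r
# Margins from the example on this page
x <- 2; m <- 12; n <- 18; k <- 17
support <- max(0, k - n):min(k, m)
logdc <- dhyper(support, m, n, k, log = TRUE)

# Mean of the non-central hypergeometric distribution with odds ratio `ncp`
# (hypothetical helper, written for this illustration only)
mean_nhyper <- function(ncp) {
  d <- logdc + log(ncp) * support
  d <- exp(d - max(d))          # rescale before normalising to avoid underflow
  sum(support * d) / sum(d)
}

# Conditional MLE: the odds ratio at which the expected count equals x.
# The interval (0, 1) only works because x lies below the null mean here.
stopifnot(x > min(support), x < mean_nhyper(1))
cond_mle <- uniroot(function(t) mean_nhyper(t) - x,
                    interval = c(1e-8, 1), tol = 1e-10)$root

# Unconditional MLE (sample odds ratio) for comparison
a <- k - x; b <- m - x; cc <- n - a
sample_or <- (x * cc) / (a * b)

print(c(conditional = cond_mle, unconditional = sample_or))
```

For these margins the two estimates differ noticeably, so the returned estimate attribute should not be compared against a hand-computed sample odds ratio.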

Details

For 2 by 2 tables, the null of conditional independence is equivalent to the hypothesis that the odds ratio equals one. "Exact" inference can be based on observing that in general, given all marginal totals fixed, the first element of the contingency table has a non-central hypergeometric distribution with non-centrality parameter given by the odds ratio (Fisher, 1935).
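
Under the null hypothesis the odds ratio is one, so the first cell follows the ordinary (central) hypergeometric distribution and the p-values can be written with dhyper and phyper. The following base R sketch uses the margins from the Examples section. The two-sided rule shown (sum the probabilities of all tables no more likely than the observed one, with a small relative tolerance) is the convention of stats::fisher.test; that scidb_fisher.test follows the same convention is an assumption.

```r
# Margins from the example on this page
x <- 2; m <- 12; n <- 18; k <- 17

# Possible values of the first cell given the fixed margins
support <- max(0, k - n):min(k, m)
d <- dhyper(support, m, n, k)

# Two-sided: total probability of outcomes no more likely than the observed x
p_two_sided <- sum(d[d <= d[support == x] * (1 + 1e-7)])

# One-sided alternatives
p_less    <- phyper(x, m, n, k)                          # P(X <= x)
p_greater <- phyper(x - 1, m, n, k, lower.tail = FALSE)  # P(X >= x)

print(c(two.sided = p_two_sided, less = p_less, greater = p_greater))
```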

Consider the following 2x2 contingency table:

               Class I YES   Class I NO   SUM
Class II YES   x             a            k = x + a
Class II NO    b             c
SUM            m = x + b     n = a + c

The x input value specifies the name of the SciDB array attribute that indicates the number of 'yes' events in both classifications. The m input value specifies the name of the SciDB array attribute that indicates the marginal sum of the first column. The n input value specifies the name of the SciDB array attribute that indicates the marginal sum of the second column. The k input value specifies the name of the SciDB array attribute that indicates the marginal sum of the first row.
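
To check results against base R, the four attribute values can be turned back into the full 2x2 table and passed to stats::fisher.test. This is a sketch using the values from the Examples section (x = 2, m = 12, n = 18, k = 17); the variable names a, b and cc mirror the cells a, b and c of the table, with cc used only to avoid masking R's c function.

```r
# Attribute values as they would appear in one cell of the SciDB array
x <- 2; m <- 12; n <- 18; k <- 17

# Recover the remaining cells from the margins
a  <- k - x   # Class II YES, Class I NO
b  <- m - x   # Class II NO,  Class I YES
cc <- n - a   # Class II NO,  Class I NO

# matrix() fills column by column: first column is (x, b), second is (a, cc)
tab <- matrix(c(x, b, a, cc), nrow = 2,
              dimnames = list(c("Class II YES", "Class II NO"),
                              c("Class I YES", "Class I NO")))

res <- fisher.test(tab, alternative = "two.sided")
print(tab)
print(res$p.value)
print(res$estimate)
```

The p-value and odds ratio printed here can be compared with the pval and estimate columns shown in the Examples section.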

See Also

scidb phyper qhyper dhyper

Examples

## Not run: 
library(scidb)
scidbconnect()  # assumes a SciDB server reachable with the default settings

# Create a test array:
a <- scidb("apply(build(<x:int64>[i=0:0,1,0],2),m,12,n,18,k,17)")
scidb_fisher.test(a)[]

# Output looks like:
#   x  m  n  k         pval   estimate
# 0 2 12 18 17 0.0005367241 0.04693664

## End(Not run)