scidb_fisher.test: scidb_fisher.test

Description

Performs Fisher's exact test of the null hypothesis that rows and columns are independent in a contingency table with fixed marginals.

Usage

scidb_fisher.test(a, x="x", m="m", n="n", k="k", alternative="two.sided", `eval`=FALSE)

Arguments

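a

A scidb array or scidbdf data frame object.

x

The x value attribute name (see details below).

m

The m marginal value attribute name (see details below).

n

The n marginal value attribute name (see details below).

k

The k marginal value attribute name (see details below).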

alternative

Indicates the alternative hypothesis; must be one of "two.sided", "greater" or "less".

eval

(Optional) If TRUE, execute the query and store the result array. Otherwise, defer evaluation.

Value

pvalue: the p-value of the test.
estimate: an estimate of the odds ratio. Note that the conditional Maximum Likelihood Estimate (MLE) rather than the unconditional MLE (the sample odds ratio) is used.
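
To make the difference between the two estimators concrete, here is a minimal sketch in base R. It does not call SciDB: it assumes the 2x2 table implied by the example values on this page (x = 2, m = 12, n = 18, k = 17) and uses base R's fisher.test() as a stand-in for the conditional MLE. The variable names are illustrative only.

```r
# 2x2 table implied by the example values x = 2, m = 12, n = 18, k = 17
# (rows: Class II YES/NO, columns: Class I YES/NO; filled column by column).
tab <- matrix(c(2, 10, 15, 3), nrow = 2)

# Unconditional MLE: the sample odds ratio (x * c) / (a * b).
sample_or <- (tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1])

# Conditional MLE, as reported by base R's fisher.test().
conditional_or <- unname(fisher.test(tab)$estimate)

print(c(sample = sample_or, conditional = conditional_or))
```

The two values are close but not identical, which is why the returned estimate should not be compared directly with a hand-computed sample odds ratio.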

Details

For 2 by 2 tables, the null of conditional independence is equivalent to the hypothesis that the odds ratio equals one. "Exact" inference can be based on observing that in general, given all marginal totals fixed, the first element of the contingency table has a non-central hypergeometric distribution with non-centrality parameter given by the odds ratio (Fisher, 1935).
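
Under the null (odds ratio equal to one), the distribution of the first cell reduces to the ordinary hypergeometric distribution, so the p-values can be sketched with base R's dhyper(). The snippet below is an illustration under that assumption, not the package's implementation; it uses the example values from this page, and the two-sided rule (summing outcomes no more likely than the observed one) follows the convention of base R's fisher.test().

```r
# Example values: first cell and the fixed marginals.
x <- 2; m <- 12; n <- 18; k <- 17

# Possible values of the first cell once the marginals are fixed.
lo <- max(0, k - n)
hi <- min(k, m)
support <- lo:hi

# Null distribution of the first cell: central hypergeometric probabilities.
d <- dhyper(support, m, n, k)
d_obs <- dhyper(x, m, n, k)

# One-sided p-values: tail probabilities at the observed value.
p_less <- sum(d[support <= x])
p_greater <- sum(d[support >= x])

# Two-sided p-value: total probability of outcomes no more likely than the
# observed one. The small relative tolerance guards against rounding ties.
p_two_sided <- sum(d[d <= d_obs * (1 + 1e-7)])

print(c(less = p_less, greater = p_greater, two.sided = p_two_sided))
```

With these inputs the two-sided value agrees with the pval column shown in the example output.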

Consider the following 2x2 contingency table:

                Class I YES    Class I NO    SUM
Class II YES    x              a             k = x + a
Class II NO     b              c
SUM             m = x + b      n = a + c

The x input value specifies the name of the SciDB array attribute that indicates the number of 'yes' events in both classifications. The m input value specifies the name of the SciDB array attribute that indicates the marginal sum of the first column. The n input value specifies the name of the SciDB array attribute that indicates the marginal sum of the second column. The k input value specifies the name of the SciDB array attribute that indicates the marginal sum of the first row.
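
The mapping from a 2x2 table of counts to the four attributes can be sketched as follows. The helper table_to_attributes is hypothetical (it is not part of the scidb package), and the query string simply reuses the form of the build/apply expression from the example on this page; nothing here connects to SciDB.

```r
# Hypothetical helper, not part of the scidb package: derive x, m, n, k
# from a 2x2 count matrix with rows = Class II and columns = Class I.
table_to_attributes <- function(tab) {
  stopifnot(is.matrix(tab), all(dim(tab) == c(2, 2)))
  list(
    x = tab[1, 1],      # 'yes' in both classifications
    m = sum(tab[, 1]),  # marginal sum of the first column
    n = sum(tab[, 2]),  # marginal sum of the second column
    k = sum(tab[1, ])   # marginal sum of the first row
  )
}

tab <- matrix(c(2, 10, 15, 3), nrow = 2,
              dimnames = list(ClassII = c("YES", "NO"),
                              ClassI = c("YES", "NO")))
attrs <- table_to_attributes(tab)

# Query string in the same form as the example test array on this page.
query <- sprintf("apply(build(<x:int64>[i=0:0,1,0],%d),m,%d,n,%d,k,%d)",
                 attrs$x, attrs$m, attrs$n, attrs$k)
print(query)
```

Passing the resulting string to scidb() would yield an array whose default attribute names x, m, n and k match the defaults in the usage line, so no attribute names would need to be supplied.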

Examples

## Not run: 
# # Create a test array:
# a <- scidb("apply(build(<x:int64>[i=0:0,1,0],2),m,12,n,18,k,17)")
# scidb_fisher.test(a)[]
# 
# # output looks like:
# #   x  m  n  k         pval   estimate
# # 0 2 12 18 17 0.0005367241 0.04693664
# 
# ## End(Not run)
