pcalg (version 2.5-0)

disCItest: G square Test for (Conditional) Independence of Discrete Variables

Description

\(G^2\) test for (conditional) independence of discrete (each with a finite number of “levels”) variables \(X\) and \(Y\) given the (possibly empty) set of discrete variables \(S\).

disCItest() is a wrapper around gSquareDis() with the (x, y, S, suffStat) signature, so that it can be passed directly as the independence test to skeleton(), pc() and fci().

Usage

gSquareDis(x, y, S, dm, nlev, adaptDF = FALSE, n.min = 10*df, verbose = FALSE)
disCItest (x, y, S, suffStat)

Arguments

x,y

(integer) positions of the variables \(X\) and \(Y\), respectively, in the adjacency matrix.

S

(integer) positions of zero or more conditioning variables in the adjacency matrix.

dm

data matrix (rows: samples, columns: variables) with integer entries; the k levels of a given column must be coded by the integers 0, 1, ..., k-1 (see Examples).
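
The 0-based coding is easy to get wrong when the raw data are factors or strings. The following is a minimal sketch of one way to recode such data in base R; the data frame `raw` and its column names are hypothetical and only serve as illustration.

```r
## Hypothetical raw data: two categorical columns stored as factors
raw <- data.frame(
  colour = c("red", "green", "blue", "green", "red"),
  size   = c("S", "L", "M", "M", "S"),
  stringsAsFactors = TRUE
)

## as.integer() on a factor yields codes 1..k; subtracting 1 gives 0..k-1.
## factor() is re-applied so that unused levels are dropped first.
dm <- sapply(raw, function(col) as.integer(factor(col)) - 1L)

## Number of levels per column, usable as the 'nlev' argument
nlev <- apply(dm, 2, function(col) length(unique(col)))

dm
nlev
```

Levels are numbered in the order given by levels(), which is alphabetical by default for character data.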

nlev

optional vector giving the number of levels of each variable (column) in dm.

adaptDF

logical specifying if the degrees of freedom should be lowered by one for each zero count. The value for the degrees of freedom cannot go below 1.

n.min

the smallest \(n\) (number of observations, nrow(dm)) for which the \(G^2\) test is computed; for smaller \(n\), independence is assumed and a p-value of 1 is returned, with a warning. The default is \(10 m\), where \(m\) is the degrees of freedom assuming no structural zeros (df in the Usage section), i.e., \(m\) = (nlev[x]-1) * (nlev[y]-1) * prod(nlev[S]).
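
As a worked instance of the formula above, the following sketch computes the degrees of freedom and the default n.min for the variable layout used in the Examples section (three columns with 3, 4 and 2 levels, testing column 1 against column 3 given column 2). It uses base R only and does not call pcalg.

```r
nlev <- c(3, 4, 2)   # levels per column, as in the Examples section
x <- 1; y <- 3; S <- 2

## Degrees of freedom assuming no structural zeros
df <- (nlev[x] - 1) * (nlev[y] - 1) * prod(nlev[S])

## Default threshold below which the test is not computed
n.min <- 10 * df

c(df = df, n.min = n.min)   # df = 8, n.min = 80
```

With these values a data matrix of 100 rows is large enough, whereas one of 60 rows falls below the threshold. For an empty conditioning set, prod(nlev[S]) evaluates to 1.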

verbose

logical or integer indicating that increased diagnostic output is to be provided.

suffStat

a list with three elements, "dm", "nlev", "adaptDF"; each corresponding to the above arguments of gSquareDis().
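
To illustrate how such a list is typically handed to a structure learning function, here is a hedged sketch that assumes the pcalg package is installed. The simulated chain a -> b -> c3 and all variable names are hypothetical; the call passes disCItest as the indepTest argument of skeleton().

```r
library(pcalg)

set.seed(123)
n <- 500
## Hypothetical binary chain a -> b -> c3, coded 0/1
a  <- sample(0:1, n, TRUE)
b  <- ifelse(runif(n) < 0.9, a, 1L - a)   # b copies a with probability 0.9
c3 <- ifelse(runif(n) < 0.9, b, 1L - b)   # c3 copies b with probability 0.9
dat <- cbind(a, b, c3)

## The three elements correspond to the gSquareDis() arguments
suffStat <- list(dm = dat, nlev = c(2, 2, 2), adaptDF = FALSE)

## Estimate the skeleton using the discrete G^2 test
skel <- skeleton(suffStat, indepTest = disCItest,
                 alpha = 0.01, labels = colnames(dat))
skel
```

The same suffStat and indepTest arguments can be passed to pc() or fci().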

Value

The p-value of the test.

Details

The \(G^2\) statistic is used to test for (conditional) independence of \(X\) and \(Y\) given a set \(S\), which may be empty (NULL). If only binary variables are involved, gSquareBin() is a specialized, slightly more efficient alternative to gSquareDis().
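
For readers who want to see what such a test involves, the following base R sketch computes a textbook \(G^2\) statistic, summing 2 * observed * log(observed / expected) over all cells within each configuration of \(S\), and compares it with a chi-squared distribution. The function g2_sketch() is hypothetical and is not pcalg's implementation; it omits the n.min check and the adaptDF correction.

```r
## Sketch of a G^2 test for X independent of Y given S.
## Assumes columns of 'dm' are coded 0, 1, ..., k-1.
g2_sketch <- function(x, y, S, dm) {
  nlev <- apply(dm, 2, function(col) length(unique(col)))
  ## One stratum per observed configuration of the conditioning variables
  strata <- if (length(S) == 0) {
    rep(1L, nrow(dm))
  } else {
    interaction(as.data.frame(dm[, S, drop = FALSE]), drop = TRUE)
  }
  G2 <- 0
  for (idx in split(seq_len(nrow(dm)), strata)) {
    obs <- table(factor(dm[idx, x], levels = 0:(nlev[x] - 1)),
                 factor(dm[idx, y], levels = 0:(nlev[y] - 1)))
    ## Expected counts under independence within this stratum
    expd <- outer(rowSums(obs), colSums(obs)) / sum(obs)
    pos <- obs > 0                      # zero cells contribute nothing
    G2 <- G2 + 2 * sum(obs[pos] * log(obs[pos] / expd[pos]))
  }
  df <- (nlev[x] - 1) * (nlev[y] - 1) * prod(nlev[S])
  unname(pchisq(G2, df, lower.tail = FALSE))   # p-value
}

## Usage on data simulated as in the Examples section
set.seed(123)
n <- 100
dat <- cbind(x = sample(0:2, n, TRUE),
             y = sample(0:3, n, TRUE),
             z = sample(0:1, n, TRUE))
g2_sketch(1, 3, 2, dat)
```

Skipping empty cells follows the usual convention that 0 * log(0) is taken to be 0.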

References

R.E. Neapolitan (2004). Learning Bayesian Networks. Prentice Hall Series in Artificial Intelligence. Chapter 10.3.1

See Also

gSquareBin for a (conditional) independence test for binary variables.

dsepTest, gaussCItest and binCItest for similar functions for a d-separation oracle, a conditional independence test for Gaussian variables and a conditional independence test for binary variables, respectively.

Examples

# NOT RUN {
## Simulate data
n <- 100
set.seed(123)
x <- sample(0:2, n, TRUE) ## three levels
y <- sample(0:3, n, TRUE) ## four levels
z <- sample(0:1, n, TRUE) ## two levels
dat <- cbind(x,y,z)

## Analyze data
gSquareDis(1,3, S=2, dat, nlev = c(3,4,2)) # but nlev is optional:
gSquareDis(1,3, S=2, dat, verbose=TRUE, adaptDF=TRUE)
## with too little data, gives a warning (and p-value 1):
gSquareDis(1,3, S=2, dat[1:60,], nlev = c(3,4,2))

suffStat <- list(dm = dat, nlev = c(3,4,2), adaptDF = FALSE)
disCItest(1,3,2,suffStat)
# }