condIndFisherZ: Conditional Independence by Fisher's Z-Transformation

Description

Using Fisher's z-transformation of the partial correlation, test for zero partial correlation for sets of normally distributed random variables.

Usage

condIndFisherZ(x, y, S, C, n, cutoff,
               verbose= isTRUE(getOption("verbose.pcalg.condIFz")))
zStat(x, y, S, C, n)

Arguments

x,y,S

Tests whether x and y are conditionally independent given the subset S of the remaining nodes. x and y are integers and S is an integer vector, all corresponding to variable or node numbers.

C

correlation matrix of nodes.

n

integer specifying the number of observations (samples) used to estimate the correlation matrix C.

cutoff

numeric cutoff for significance level of individual partial correlation tests. Must be set to qnorm(1 - alpha/2) for a test significance level of alpha.

verbose

logical indicating whether some intermediate output should be shown (WARNING: This decreases the performance dramatically!)

Value

zStat() gives a number $$Z = \sqrt{n - \left|S\right| - 3} \cdot \log((1+r)/(1-r))/2$$ where r is the partial correlation of x and y given S. Z is asymptotically normally distributed under the null hypothesis of correlation 0.

condIndFisherZ() returns a logical indicating whether the null hypothesis "the partial correlation of x and y given S is zero" could not be rejected at the given significance level. More intuitively, and for multivariate normal data, this means: if TRUE, it seems plausible that x and y are conditionally independent given S. If FALSE, strong evidence was found against this conditional independence statement.
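The formula for Z can be reproduced by hand in base R. The following is a minimal sketch, not the package's implementation: the helper names partial_cor() and fisher_z() are hypothetical, and the partial correlation is obtained from the inverse of the correlation submatrix of x, y and S, which is one standard way to compute it.

```r
## Hypothetical helpers for illustration only (not part of pcalg).

## Partial correlation of variables x and y given the set S, computed from
## the inverse of the correlation submatrix over c(x, y, S).
partial_cor <- function(x, y, S, C) {
  idx <- c(x, y, S)
  P <- solve(C[idx, idx, drop = FALSE])
  -P[1, 2] / sqrt(P[1, 1] * P[2, 2])
}

## Fisher z statistic as given in the Value section.
fisher_z <- function(x, y, S, C, n) {
  r <- partial_cor(x, y, S, C)
  sqrt(n - length(S) - 3) * 0.5 * log((1 + r) / (1 - r))
}

set.seed(1)
n <- 50
dat <- matrix(rnorm(n * 4), n, 4)   # four independent normal variables
C <- cor(dat)
z <- fisher_z(1, 2, c(3, 4), C, n)
z
```

With an empty conditioning set, partial_cor() reduces to the ordinary correlation C[x, y], which is a quick sanity check on the sketch.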

Details

For Gaussian random variables and after performing Fisher's z-transformation of the partial correlation, the test statistic zStat() is (asymptotically for large enough n) standard normally distributed. Partial correlation is tested in a two-sided hypothesis test: the null hypothesis of zero partial correlation is rejected when abs(zStat(*)) > qnorm(1 - alpha/2), so, basically, condIndFisherZ(*) == (abs(zStat(*)) <= qnorm(1 - alpha/2)). In a multivariate normal distribution, zero partial correlation is equivalent to conditional independence.
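The two-sided decision rule can be sketched in base R as follows. This is an illustration under stated assumptions, not the package code: not_rejected() is a hypothetical helper, the z values are made up rather than computed from data, and TRUE is taken to mean "independence not rejected", matching the Value section.

```r
alpha <- 0.05
cutoff <- qnorm(1 - alpha / 2)      # about 1.96 for alpha = 0.05

## Hypothetical helper (not part of pcalg):
## TRUE means the null hypothesis of zero partial correlation is not rejected.
not_rejected <- function(z, cutoff) abs(z) <= cutoff

z_values <- c(0.4, -1.2, 2.5)       # illustrative values, not from real data
not_rejected(z_values, cutoff)      # TRUE TRUE FALSE

## Equivalent view via two-sided p-values under the standard normal
p_values <- 2 * pnorm(-abs(z_values))
p_values >= alpha                   # same decisions as above
```

Comparing abs(z) with the cutoff and comparing the two-sided p-value with alpha give the same decision; the cutoff form avoids recomputing pnorm() for every test.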

References

Markus Kalisch and Peter Bühlmann (2005) Estimating high-dimensional directed acyclic graphs with the PC-algorithm; Research Report Nr. 130, ETH Zurich; http://stat.ethz.ch/research/research_reports/2005

Examples

set.seed(42)
## Generate four independent normal random variables
n <- 20
data <- matrix(rnorm(n*4),n,4)
## Compute corresponding correlation matrix
corMatrix <- cor(data)
## Test, whether variable 1 (col 1) and variable 2 (col 2) are
## independent given variable 3 (col 3) and variable 4 (col 4) on 0.05
## significance level
x <- 1
y <- 2
S <- c(3,4)
n <- 20
alpha <- 0.05
## Two-sided cutoff for significance level alpha (see argument 'cutoff')
cutoff <- qnorm(1 - alpha/2)
(b1 <- condIndFisherZ(x,y,S,corMatrix,n,cutoff))
   # -> if TRUE, 1 and 2 seem to be conditionally independent given 3,4

## Now an example with conditional dependence
data <- matrix(rnorm(n*3),n,3)
data[,3] <- 2*data[,1]
corMatrix <- cor(data)
(b2 <- condIndFisherZ(1,3,2,corMatrix,n,cutoff))
   # -> 1 and 3 seem to be conditionally dependent given 2
