np.cor.test: Nonparametric Tests of Correlation Coefficients

Description

Denoting the Pearson product-moment correlation coefficient as $$\rho = Cov(X, Y) / \sqrt{Var(X) Var(Y)}$$ this function implements permutation tests of $H_0: \rho = \rho_0$, where $\rho_0$ is the user-specified null value. It can also test partial correlations, semi-partial (or part) correlations, and independence.

Usage

np.cor.test(x, y, z = NULL,
            alternative = c("two.sided", "less", "greater"),
            rho = 0, independent = FALSE, partial = TRUE,
            R = 9999, parallel = FALSE, cl = NULL,
            perm.dist = TRUE, na.rm = TRUE)

Value

statistic: Test statistic value.
p.value: p-value for testing $H_0: \rho = \rho_0$ or $H_0: F_{XY}(x,y) = F_X(x) F_Y(y)$.
perm.dist: Permutation distribution of statistic.
alternative: Alternative hypothesis.
null.value: Null hypothesis value for $\rho$.
independent: Independence test?
R: Number of resamples.
exact: Exact permutation test? See Note.
estimate: Sample estimate of correlation coefficient $\rho$.

Arguments

x: $X$ vector (n by 1).
y: $Y$ vector (n by 1).
z: Optional $Z$ matrix (n by q). If provided, the partial (or semi-partial if partial = FALSE) correlation is calculated between x and y controlling for z.
alternative: Alternative hypothesis. Must be either "two.sided" ($H_1: \rho \neq \rho_0$), "less" ($H_1: \rho < \rho_0$), or "greater" ($H_1: \rho > \rho_0$).
rho: Null hypothesis value $\rho_0$. Defaults to zero.
independent: If FALSE (default), the null hypothesis is $H_0: \rho = \rho_0$. Otherwise, the null hypothesis is that $X$ and $Y$ are independent, i.e., $H_0: F_{XY}(x,y) = F_X(x) F_Y(y)$.
partial: Only applicable if z is provided. If TRUE (default), the partial correlation between x and y controlling for z is tested. Otherwise the semi-partial correlation is tested. See Details.
R: Number of resamples for the permutation test (positive integer).
parallel: Logical indicating if the parallel package should be used for parallel computing (of the permutation distribution). Defaults to FALSE, which implements sequential computing.
cl: Cluster for parallel computing, which is used when parallel = TRUE. Note that if parallel = TRUE and cl = NULL, then the cluster is defined as makeCluster(2L) to use two cores. To make use of all available cores, use the code cl = makeCluster(detectCores()).
perm.dist: Logical indicating if the permutation distribution should be returned.
na.rm: If TRUE (default), the arguments x and y (and z if provided) are passed to the na.omit function to remove cases with missing data.
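
The parallel and cl arguments can be combined as in the following minimal sketch. It assumes the function is loaded from the nptest package and uses simulated data invented for illustration; creating the cluster explicitly makes it easy to stop it once the test has finished.

```r
library(nptest)    # assumed to be the package that provides np.cor.test
library(parallel)  # makeCluster, stopCluster

# illustrative data (not from the documentation)
set.seed(1)
x <- rnorm(20)
y <- x + rnorm(20)

# create a two-worker cluster, run the test, then release the workers
cl <- makeCluster(2L)
res <- np.cor.test(x, y, R = 999, parallel = TRUE, cl = cl)
stopCluster(cl)

res$p.value
```

Passing cl = makeCluster(detectCores()) instead would use every available core, as described for the cl argument.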

Author

Nathaniel E. Helwig <helwig@umn.edu>

Details

Default use of this function tests the Pearson correlation between $X$ and $Y$ using the studentized test statistic proposed by DiCiccio and Romano (2017). If independent = TRUE, the classic (unstudentized) test statistic is used to test the null hypothesis of independence.
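
The idea behind the studentized test can be illustrated with a short base-R sketch for the special case $\rho_0 = 0$. It assumes the statistic has the form $\sqrt{n} \, r / \hat{\tau}$ with $\hat{\tau}^2 = \hat{\mu}_{22} / (\hat{\mu}_{20} \hat{\mu}_{02})$, following DiCiccio and Romano (2017). The helper stud.stat and the simulated data are hypothetical, and the package's internal implementation may differ.

```r
# Studentized correlation statistic for H0: rho = 0 (illustrative sketch only)
stud.stat <- function(x, y) {
  n  <- length(x)
  xc <- x - mean(x)
  yc <- y - mean(y)
  r  <- sum(xc * yc) / sqrt(sum(xc^2) * sum(yc^2))  # sample correlation
  # tau2 estimates the asymptotic variance of sqrt(n) * r when rho = 0
  tau2 <- mean(xc^2 * yc^2) / (mean(xc^2) * mean(yc^2))
  sqrt(n) * r / sqrt(tau2)
}

# illustrative data
set.seed(1)
n <- 30
x <- rnorm(n)
y <- 0.5 * x + rnorm(n)

R <- 999
t.obs <- stud.stat(x, y)
# permute y relative to x to build the reference distribution
t.perm <- replicate(R, stud.stat(x, sample(y)))
# two-sided Monte Carlo p-value; the observed statistic counts as one resample
p.two.sided <- (1 + sum(abs(t.perm) >= abs(t.obs))) / (R + 1)
p.two.sided
```

Dropping the division by sqrt(tau2) gives an unstudentized statistic of the kind used for the independence test.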

If $Z$ is provided, the partial or semi-partial correlation between $X$ and $Y$ controlling for $Z$ is tested. For the semi-partial correlation, the effect of $Z$ is partialled out of $X$.
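
The difference between the two coefficients can be seen in a base-R sketch that computes them from least-squares residuals. This assumes the usual residual-based definitions with an intercept in each regression; the data and variable names are made up for illustration, and the code is not taken from the package.

```r
# illustrative data: z is an n by q covariate matrix (q = 2)
set.seed(1)
n <- 50
z <- cbind(rnorm(n), rnorm(n))
x <- drop(z %*% c(1, -1)) + rnorm(n)
y <- drop(z %*% c(0.5, 0.5)) + 0.5 * x + rnorm(n)

# residuals after removing the linear effect of z
x.res <- resid(lm(x ~ z))
y.res <- resid(lm(y ~ z))

r.partial <- cor(x.res, y.res)  # z partialled out of both x and y
r.semi    <- cor(x.res, y)      # z partialled out of x only
c(partial = r.partial, semipartial = r.semi)
```

Because the residuals of y can never vary more than y itself, the semi-partial correlation is never larger in magnitude than the partial correlation.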

References

DiCiccio, C. J., & Romano, J. P. (2017). Robust permutation tests for correlation and regression coefficients. Journal of the American Statistical Association, 112(519), 1211-1220. doi: 10.1080/01621459.2016.1202117

Helwig, N. E. (2019). Statistical nonparametric mapping: Multivariate permutation tests for location, correlation, and regression problems in neuroimaging. WIREs Computational Statistics, 11(2), e1457. doi: 10.1002/wics.1457

Pitman, E. J. G. (1937b). Significance tests which may be applied to samples from any populations. II. The correlation coefficient test. Supplement to the Journal of the Royal Statistical Society, 4(2), 225-232. doi: 10.2307/2983647

Examples


# load the package (np.cor.test is provided by nptest)
library(nptest)

# generate data
rho <- 0.5
val <- c(sqrt(1 + rho), sqrt(1 - rho))
corsqrt <- matrix(c(val[1], -val[2], val), 2, 2) / sqrt(2)
set.seed(1)
n <- 10
z <- cbind(rnorm(n), rnorm(n)) %*% corsqrt
x <- z[,1]
y <- z[,2]

# test H0: rho = 0
set.seed(0)
np.cor.test(x, y)

# test H0: X and Y are independent
set.seed(0)
np.cor.test(x, y, independent = TRUE)
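
The z, partial, and alternative arguments can be exercised in the same way. The sketch below is not part of the original examples: it assumes the nptest package is installed and uses simulated data with a hypothetical covariate matrix w.

```r
library(nptest)  # assumed to be the package that provides np.cor.test

# illustrative data: w is an n by 1 covariate matrix
set.seed(1)
n <- 25
w <- cbind(rnorm(n))
x <- w[, 1] + rnorm(n)
y <- w[, 1] + 0.5 * x + rnorm(n)

# test H0: partial correlation = 0 against H1: partial correlation > 0
set.seed(0)
res.partial <- np.cor.test(x, y, z = w, alternative = "greater", R = 999)

# same test for the semi-partial correlation (w partialled out of x only)
set.seed(0)
res.semi <- np.cor.test(x, y, z = w, partial = FALSE,
                        alternative = "greater", R = 999)

c(partial = res.partial$p.value, semipartial = res.semi$p.value)
```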
