np.lm.test: Nonparametric Tests of Linear Model Terms

Description

Performs type III sums-of-squares tests of linear model terms or coefficients.

Usage

np.lm.test(formula, data, ..., anova.test = TRUE,
           method = "perm", homosced = FALSE, lambda = 0, 
           R = 9999, parallel = FALSE, cl = NULL,
           perm.dist = TRUE, na.rm = TRUE)

Value

statistic: Test statistic values (one for each term or coefficient).
p.value: p-values for testing \(H_0: \beta_j = 0\).
perm.dist: Permutation distribution of statistic.
method: Method used for permutation test. See Details.
homosced: Homoscedastic errors?
lambda: Ridge parameters.
R: Number of resamples.
exact: Exact permutation test? See Note.
coefficients: Least squares estimates of intercept and slope coefficients.
se.coef: Standard errors of estimated coefficients.
signif.table: Data frame with type III tests of model terms or coefficients.
anova.test: Testing terms (TRUE) or coefficients (FALSE).

Arguments

formula: Model formula as used by the lm function.
data: Optional data frame containing variables used in formula.
...: Additional arguments passed to the lm function, e.g., weights, offset, contrasts, etc.
anova.test: If TRUE (default), returns tests of model terms (like anova.lm but using type III SS). Otherwise, returns type III SS tests of individual coefficients (like summary.lm).
method: Permutation method: perm, flip, or both. See np.reg.test for further details.
homosced: Are the \(\epsilon\) terms homoscedastic? If FALSE (default), a robust Wald test statistic is used. Otherwise the classic \(F\) test statistic is used.
lambda: Scalar or vector of ridge parameter(s). Defaults to vector of zeros.
R: Number of resamples for the permutation test (positive integer).
parallel: Logical indicating if the parallel package should be used for parallel computing (of the permutation distribution). Defaults to FALSE, which implements sequential computing.
cl: Cluster for parallel computing, which is used when parallel = TRUE. Note that if parallel = TRUE and cl = NULL, then the cluster is defined as makeCluster(2L) to use two cores. To make use of all available cores, use the code cl = makeCluster(detectCores()).
perm.dist: Logical indicating if the permutation distribution should be returned.
na.rm: If TRUE (default), the model variables (response and predictors) are passed to the na.omit function to remove cases with missing data.
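
The parallel and cl arguments described above can be combined as in the following minimal sketch. The data frame name dat and the reduced R = 999 are illustrative choices, not package defaults; the cluster is created explicitly and stopped afterwards so that no worker processes are left running.

```r
library(nptest)    # provides np.lm.test
library(parallel)  # provides makeCluster, detectCores, stopCluster

# toy data with the same design as Example 1 ('dat' is an illustrative name)
set.seed(0)
n <- 90
dat <- data.frame(x = seq(-1, 1, length.out = n),
                  z = factor(rep(LETTERS[1:3], times = 30)))
dat$y <- c(-1, 0, 1)[dat$z] + 2 * dat$x + rnorm(n)

# two workers; use makeCluster(detectCores()) to use all available cores
cl <- makeCluster(2L)

# run the test on the cluster and always release the workers afterwards
fit <- tryCatch(
  np.lm.test(y ~ x + z, data = dat, R = 999, parallel = TRUE, cl = cl),
  finally = stopCluster(cl)
)

fit$signif.table
```

Passing cl explicitly, rather than relying on the two-core default, keeps control of the cluster's size and lifetime in the calling code.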

Author

Nathaniel E. Helwig <helwig@umn.edu>

Details

The recommended default of method = "perm" is equivalent to using Manly's (1986) permutation method separately for each of the model terms. Assuming that the random seed is set the same for each variable's test, equivalent results could be obtained from repeated calls to np.reg.test where a different term/coefficient is tested each time (see Example 2). This implementation is more efficient than repeated calls to np.reg.test because this function computes all of the type III SS tests simultaneously for each permutation.
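
The idea of computing every coefficient's statistic from a single permutation can be illustrated with base R alone. The sketch below is not the nptest implementation: it assumes that Manly's method amounts to permuting the response and refitting the full model, and it uses the classic squared t statistic (the homoscedastic case) for simplicity. The helper coef.stats and the choice R = 499 are hypothetical and exist only for this illustration.

```r
# Illustrative sketch only; not the code used by np.lm.test.
set.seed(0)
n <- 90
z <- factor(rep(LETTERS[1:3], times = 30))
x <- seq(-1, 1, length.out = n)
y <- c(-1, 0, 1)[z] + 2 * x + rnorm(n)
dat <- data.frame(x = x, y = y, z = z)

# hypothetical helper: squared t statistics (1-df F statistics)
# for all non-intercept coefficients from one model fit
coef.stats <- function(d) {
  fit <- lm(y ~ x + z, data = d)
  summary(fit)$coefficients[-1, "t value"]^2
}

# observed statistics
obs <- coef.stats(dat)

# permute the response, refit once, and keep all statistics together
R <- 499
set.seed(1)
perm <- replicate(R, {
  d <- dat
  d$y <- sample(d$y)
  coef.stats(d)
})

# perm has one row per coefficient and one column per permutation;
# the observed statistic counts as one member of the reference set
p.value <- (1 + rowSums(perm >= obs)) / (R + 1)
p.value
```

Each permutation costs one model fit regardless of how many coefficients are tested, whereas testing the coefficients one at a time would repeat the whole resampling loop for each of them.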

References

DiCiccio, C. J., & Romano, J. P. (2017). Robust permutation tests for correlation and regression coefficients. Journal of the American Statistical Association, 112(519), 1211-1220. doi: 10.1080/01621459.2016.1202117

Helwig, N. E. (2019a). Statistical nonparametric mapping: Multivariate permutation tests for location, correlation, and regression problems in neuroimaging. WIREs Computational Statistics, 11(2), e1457. doi: 10.1002/wics.1457

Helwig, N. E. (2019b). Robust nonparametric tests of general linear model coefficients: A comparison of permutation methods and test statistics. NeuroImage, 201, 116030. doi: 10.1016/j.neuroimage.2019.116030

Manly, B. (1986). Randomization and regression methods for testing for associations with geographical, environmental and biological distances between populations. Researches on Population Ecology, 28(2), 201-218. doi: 10.1007/BF02515450

Examples

### Example 1:  anova.test and homosced options

# data generation design
n <- 90
z <- factor(rep(LETTERS[1:3], times = 30))
x <- seq(-1, 1, length.out = n)
tau <- c(-1, 0, 1)

# generate data
set.seed(0)
y <- tau[z] + 2 * x + rnorm(n)
data <- data.frame(x = x, y = y, z = z)

# test of model terms (heteroscedastic)
set.seed(1)
np.lm.test(y ~ x + z, data = data)

# test of coefficients (heteroscedastic)
set.seed(1)
np.lm.test(y ~ x + z, data = data, anova.test = FALSE)

# test of model terms (homoscedastic)
set.seed(1)
np.lm.test(y ~ x + z, data = data, homosced = TRUE)

# test of coefficients (homoscedastic)
set.seed(1)
np.lm.test(y ~ x + z, data = data, homosced = TRUE, anova.test = FALSE)


### Example 2:  equivalence with np.reg.test()

# type III tests of all coefficients
set.seed(1)
mod.lm <- np.lm.test(y ~ x + z, data = data, anova.test = FALSE)

# make design matrix
xmat <- model.matrix(y ~ x + z, data = data)[,-1]

# test effect of x given zB and zC
set.seed(1)
mod.x <- np.reg.test(x = xmat[,1], y = y, z = xmat[,2:3], method = "MA")

# test effect of zB given x and zC
set.seed(1)
mod.zB <- np.reg.test(x = xmat[,2], y = y, z = xmat[,c(1,3)], method = "MA")

# test effect of zC given x and zB
set.seed(1)
mod.zC <- np.reg.test(x = xmat[,3], y = y, z = xmat[,1:2], method = "MA")

# compare np.lm.test() and np.reg.test() results --- identical!
# (full component names are used instead of relying on partial matching with $)
mod.reg <- data.frame(terms = colnames(xmat), df = rep(1, 3),
                      statistic = c(mod.x$statistic, mod.zB$statistic, mod.zC$statistic),
                      p.value = c(mod.x$p.value, mod.zB$p.value, mod.zC$p.value))
mod.lm$signif.table
mod.reg
