mi.wilcox.test: Multiple Imputation Wilcoxon Rank Sum and Signed Rank Tests

Description

Performs one- and two-sample Wilcoxon tests on multiply imputed datasets.

Usage

mi.wilcox.test(miData, ...)
# S3 method for default
mi.wilcox.test(miData, x, y = NULL,
        alternative = c("two.sided", "less", "greater"), mu = 0,
        paired = FALSE, exact = NULL, conf.int = TRUE,
        conf.level = 0.95, subset = NULL, ...)
# S3 method for amelia
mi.wilcox.test(miData, x, y = NULL,
        alternative = c("two.sided", "less", "greater"), mu = 0,
        paired = FALSE, exact = NULL, conf.int = TRUE,
        conf.level = 0.95, subset = NULL, ...)
# S3 method for mids
mi.wilcox.test(miData, x, y = NULL,
        alternative = c("two.sided", "less", "greater"), mu = 0,
        paired = FALSE, exact = NULL, conf.int = TRUE,
        conf.level = 0.95, subset = NULL, ...)

Value

A list with class "htest" containing the following components:

statistic: the value of the test statistic with a name describing it.
p.value: the p-value for the test.
pointprob: the probability of observing the test statistic itself (called point-prob).
null.value: the location parameter mu.
alternative: a character string describing the alternative hypothesis.
method: a character string indicating what type of test was performed.
data.name: a character string giving the name(s) of the data.
conf.int: a confidence interval for the location parameter. (Only present if argument conf.int = TRUE.)
estimate: Hodges-Lehmann estimate of the location parameter. (Only present if argument conf.int = TRUE.)

Arguments

miData: list of multiply imputed datasets, or an object of class "amelia" or "mids".
x: name of a variable that shall be tested.
y: an optional name of a variable that shall be tested (paired test) or a variable that shall be used to split into groups (unpaired test).
alternative: a character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater" or "less". You can specify just the initial letter.
mu: a number specifying the value of the location parameter under the null hypothesis (or of the location shift if you are performing a two-sample test).
paired: a logical indicating whether you want a paired test (Wilcoxon signed rank test).
exact: a logical indicating whether an exact p-value should be computed.
conf.int: a logical indicating whether a confidence interval should be computed.
conf.level: confidence level of the interval.
subset: an optional vector specifying a subset of observations to be used.
...: further arguments to be passed to or from methods.

Author

Matthias Kohl Matthias.Kohl@stamats.de

Details

For details about the tests, see wilcox.exact in package exactRankTests.

We use the median p rule (MPR) for the computation of the p value of the test; see Section 5.3.2 of van Buuren (2018) or Section 13.3 in Heymans and Eekhout (2019). The approach seems to work well in many situations such as logistic regression (Eekhout et al. (2017)) or GAM (Bolt et al. (2022)). However, we are not aware of any work that has investigated the MPR approach for Wilcoxon tests. Hence, this function should be regarded as experimental.

We recommend using an odd number of imputations; the median p-value is then one of the p-values actually computed.
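
The median p rule can be illustrated with a minimal sketch in base R. This is not the internal code of mi.wilcox.test: the list `completed` is a hypothetical stand-in for m completed datasets (simulated here, with no real imputation step), and stats::wilcox.test is used in place of wilcox.exact so that no additional packages are needed.

```r
## Illustrative sketch of the median p rule (MPR); assumptions are marked below
set.seed(1)
m <- 9  # odd number of imputations

## Assumption: stand-in for m completed datasets, simulated instead of imputed
completed <- lapply(seq_len(m), function(i) {
  data.frame(response = c(rnorm(25, mean = 1), rnorm(20, mean = -1)),
             group = factor(rep(c("yes", "no"), times = c(25, 20))))
})

## One Wilcoxon rank sum test per completed dataset
p.values <- vapply(completed, function(d) {
  wilcox.test(response ~ group, data = d)$p.value
}, numeric(1))

## MPR: the pooled p-value is the median of the m p-values
p.pooled <- median(p.values)
p.pooled
```

Because m is odd, `p.pooled` equals one of the entries of `p.values`; with an even m, median() would return the mean of the two middle p-values.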

References

van Buuren, S. (2018). Flexible Imputation of Missing Data. Chapman & Hall/CRC. https://stefvanbuuren.name/fimd/.

Heymans, M.W. and Eekhout, I. (2019). Applied Missing Data Analysis With SPSS and (R)Studio. Self-publishing. https://bookdown.org/mwheymans/bookmi/.

Eekhout, I, van de Wiel, MA, Heymans, MW (2017). Methods for significance testing of categorical covariates in logistic regression models after multiple imputation: power and applicability analysis. BMC Med Res Methodol, 17, 1:129. doi:10.1186/s12874-017-0404-7.

Bolt, MA, MaWhinney, S, Pattee, JW, Erlandson, KM, Badesch, DB, Peterson, RA (2022). Inference following multiple imputation for generalized additive models: an investigation of the median p-value rule with applications to the Pulmonary Hypertension Association Registry and Colorado COVID-19 hospitalization data. BMC Med Res Methodol, 22, 1:148. doi:10.1186/s12874-022-01613-w.

Examples

## Generate some data
set.seed(123)
x <- rnorm(25, mean = 1)
x[sample(1:25, 5)] <- NA
y <- rnorm(20, mean = -1)
y[sample(1:20, 4)] <- NA
pair <- c(rnorm(25, mean = 1), rnorm(20, mean = -1))
g <- factor(c(rep("yes", 25), rep("no", 20)))
D <- data.frame(ID = 1:45, response = c(x, y), pair = pair, group = g)

## Use Amelia to impute missing values
library(Amelia)
res <- amelia(D, m = 9, p2s = 0, idvars = "ID", noms = "group")

## Per protocol analysis (Exact Wilcoxon rank sum test)
library(exactRankTests)
wilcox.exact(response ~ group, data = D, conf.int = TRUE)
## Intention to treat analysis (Multiple Imputation Exact Wilcoxon rank sum test)
mi.wilcox.test(res, x = "response", y = "group")

## Specifying alternatives
mi.wilcox.test(res, x = "response", y = "group", alternative = "less")
mi.wilcox.test(res, x = "response", y = "group", alternative = "greater")

## One sample test
wilcox.exact(D$response[D$group == "yes"], conf.int = TRUE)
mi.wilcox.test(res, x = "response", subset = D$group == "yes")
mi.wilcox.test(res, x = "response", mu = -1, subset = D$group == "yes",
               alternative = "less")
mi.wilcox.test(res, x = "response", mu = -1, subset = D$group == "yes",
               alternative = "greater")

## Paired test
wilcox.exact(D$response, D$pair, paired = TRUE, conf.int = TRUE)
mi.wilcox.test(res, x = "response", y = "pair", paired = TRUE)

## Use mice to impute missing values
library(mice)
res.mice <- mice(D, m = 9, print = FALSE)
mi.wilcox.test(res.mice, x = "response", y = "group")
