mann_whitney_test_pv() performs an exact or approximate
Wilcoxon-Mann-Whitney U test about the location shift between two
independent groups when the data is not necessarily normally distributed. In
contrast to stats::wilcox.test(), it is vectorised and only calculates
p-values. Furthermore, it is capable of returning the discrete p-value
supports, i.e. all observable p-values under a null hypothesis. Multiple
tests can be evaluated simultaneously.
mann_whitney_test_pv(
x,
y,
mu = 0,
alternative = "two.sided",
exact = NULL,
correct = TRUE,
digits_rank = Inf,
simple_output = FALSE
)
If simple_output = TRUE, a vector of computed p-values is returned.
Otherwise, the output is a DiscreteTestResults R6 class object, which
also includes the p-value supports and testing parameters. These have to be
accessed by public methods, e.g. $get_pvalues().
x, y: numerical vectors forming the samples to be tested or lists of numerical vectors for multiple tests.
mu: numerical vector of hypothesised location shift(s).
alternative: character vector that indicates the alternative hypotheses; each value must be one of "two.sided" (the default), "less" or "greater".
exact: logical value that indicates whether p-values are to be calculated by exact computation (TRUE) or by a continuous approximation (FALSE); if NULL (the default), the method is chosen automatically from the sample sizes and the presence of ties.
correct: logical value that indicates if a continuity correction is to be applied (TRUE; the default) or not (FALSE). Ignored if exact = TRUE.
digits_rank: single number giving the significant digits used to compute ranks for the test statistics.
simple_output: logical value that indicates whether only a vector of p-values is to be returned (TRUE) or an R6 class object that also includes the tests' parameters and support sets, i.e. all observable p-values under each null hypothesis (FALSE; the default).
We use a test statistic called the Wilcoxon Rank Sum Statistic, defined by
$$U = \sum_{i = 1}^{n_X}{rank(X_i)} - \frac{n_X(n_X + 1)}{2},$$
where \(rank(X_i)\) is the rank of \(X_i\) in the concatenated sample
of \(X\) and \(Y\), and \(n_X\) and \(n_Y\) are the respective
sizes of the samples \(X\) and \(Y\). Note that \(U\)
can range from \(0\) to \(n_X \cdot n_Y\).
This is the same statistic that stats::wilcox.test() uses and whose
distribution is accessible via pwilcox().
This is also the statistic defined by the two given references.
Note, however, that it is not what is called the Mann-Whitney U Statistic
in the (English-language) Wikipedia article (as of February 12, 2026). The
latter is defined as, using our notation, \(\min(U, n_X \cdot n_Y - U)\).
Using the Wikipedia notation, the Wilcoxon Rank Sum Statistic is \(U_2\).
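The definition of \(U\) can be checked with base R alone. The following is a minimal sketch on made-up data (the vectors x and y are purely illustrative); it computes \(U\) from the ranks of the concatenated sample, compares it with the statistic reported by stats::wilcox.test(), and derives an exact two-sided p-value from pwilcox() in the way stats::wilcox.test() does. Whether mann_whitney_test_pv() proceeds identically internally is an assumption.

```r
# Illustrative data without ties (not taken from the package)
x <- c(1.1, 2.3, 0.7, 3.8)
y <- c(2.9, 4.1, 5.6, 3.3, 1.9)
n_x <- length(x)
n_y <- length(y)

# Ranks in the concatenated sample; the first n_x entries belong to x
ranks <- rank(c(x, y))

# Wilcoxon rank sum statistic as defined above
u <- sum(ranks[seq_len(n_x)]) - n_x * (n_x + 1) / 2

# Statistic reported by stats::wilcox.test() (called W there)
u_wilcox <- unname(stats::wilcox.test(x, y)$statistic)

# Exact two-sided p-value from the null distribution of U
p_one_tail <- if (u > n_x * n_y / 2) {
  stats::pwilcox(u - 1, n_x, n_y, lower.tail = FALSE)
} else {
  stats::pwilcox(u, n_x, n_y)
}
p_exact <- min(2 * p_one_tail, 1)

# The variant described in the Wikipedia article
u_wikipedia <- min(u, n_x * n_y - u)
```

For these data the ranks of x sum to 14, so \(U = 14 - 10 = 4\), which lies within the range from \(0\) to \(n_X \cdot n_Y = 20\).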
The parameters x, y, mu and alternative are vectorised. If x and
y are lists, they are replicated automatically to have the same lengths. If
x or y is not a list, it is wrapped in a new list, which is then
replicated to the appropriate length. This allows multiple hypotheses to be
tested simultaneously.
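The replication rule can be imitated in base R. The sketch below is not the package's implementation: as_sample_list() is a hypothetical helper, the data are made up, and stats::wilcox.test() stands in for the actual p-value computation. It only illustrates how a single y vector is wrapped in a list and recycled so that it is tested against every element of x.

```r
# Hypothetical helper: wrap a bare vector in a list, leave lists untouched
as_sample_list <- function(s) if (is.list(s)) s else list(s)

# Two x samples (a list) and one y sample (a plain vector); illustrative data
x <- list(c(1.2, 3.4, 2.2), c(0.5, 4.8, 2.7, 3.9))
y <- c(2.0, 5.1, 6.3, 4.4)
alternative <- c("less", "greater")

x_list <- as_sample_list(x)
y_list <- as_sample_list(y)

# Assumption: the number of tests is the longest of the vectorised inputs
n_tests <- max(length(x_list), length(y_list), length(alternative))
x_list <- rep_len(x_list, n_tests)
y_list <- rep_len(y_list, n_tests)
alternative <- rep_len(alternative, n_tests)

# One p-value per test setting
p_values <- mapply(
  function(a, b, alt) stats::wilcox.test(a, b, alternative = alt)$p.value,
  x_list, y_list, alternative
)
```

Here the first x sample is tested against y with alternative "less" and the second with alternative "greater".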
In the presence of ties, computation of exact p-values is not possible.
Therefore, exact is ignored in these cases and p-values of the
respective test settings are calculated by a normal approximation.
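As an illustration of such a normal approximation, the following base R sketch reproduces the tie-adjusted, continuity-corrected two-sided p-value that stats::wilcox.test() reports for exact = FALSE. The data are made up, and it is an assumption that mann_whitney_test_pv() uses exactly the same variance adjustment and correction.

```r
# Illustrative data with ties (the value 2 occurs in both samples)
x <- c(1, 2, 2, 3, 5)
y <- c(2, 4, 4, 6, 7, 8)
n_x <- length(x)
n_y <- length(y)
n <- n_x + n_y

# Mid-ranks in the concatenated sample and the resulting statistic
ranks <- rank(c(x, y))
u <- sum(ranks[seq_len(n_x)]) - n_x * (n_x + 1) / 2

# Standard deviation of U under the null, adjusted for tied groups
tie_sizes <- as.vector(table(ranks))
sigma <- sqrt(
  n_x * n_y / 12 *
    ((n + 1) - sum(tie_sizes^3 - tie_sizes) / (n * (n - 1)))
)

# Centre U and apply the continuity correction towards the mean
centred <- u - n_x * n_y / 2
z <- (centred - sign(centred) * 0.5) / sigma

# Two-sided p-value from the standard normal distribution
p_approx <- 2 * min(stats::pnorm(z), stats::pnorm(z, lower.tail = FALSE))

# Reference value from base R
p_ref <- stats::wilcox.test(x, y, exact = FALSE, correct = TRUE)$p.value
```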
By setting exact = NULL, exact computation is performed if both samples in
a test setting do not have any ties and if both sample sizes are lower than
or equal to 200.
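The decision rule described above can be summarised in a few lines. This is a sketch only: decide_exact() is a hypothetical helper, not a function of the package, and it assumes that ties are detected in the concatenated sample after subtracting mu from x.

```r
# Hypothetical helper mirroring the documented rule for one test setting
decide_exact <- function(x, y, exact = NULL, mu = 0) {
  # Assumption: ties are checked in the pooled, shift-adjusted sample
  has_ties <- anyDuplicated(c(x - mu, y)) > 0

  # With ties, exact computation is not possible; 'exact' is ignored
  if (has_ties) return(FALSE)

  # exact = NULL: exact computation only for sample sizes up to 200
  if (is.null(exact)) return(length(x) <= 200 && length(y) <= 200)

  isTRUE(exact)
}

small_no_ties <- decide_exact(c(1, 2, 3), c(4, 5, 6))
ties_override <- decide_exact(c(1, 2, 2), c(4, 5, 6), exact = TRUE)
large_sample  <- decide_exact(seq_len(201) + 0.5, seq_len(10))
forced_approx <- decide_exact(c(1, 2, 3), c(4, 5, 6), exact = FALSE)
```

In this sketch only small_no_ties is TRUE; the other three settings fall back to the approximation because of ties, a sample size above 200, or an explicit exact = FALSE.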
If digits_rank = Inf (the default), rank() is used to
compute ranks for the test statistics instead of
rank(signif(., digits_rank)).
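The effect of rounding before ranking can be seen on a small made-up vector: values that differ only beyond the chosen number of significant digits become ties and receive mid-ranks. The sketch uses base R only; v and the choice of three digits are illustrative.

```r
# Two values that differ only in the 8th significant digit
v <- c(1.0000001, 1.0000002, 2)

# Corresponds to digits_rank = Inf: all values are distinct
ranks_full <- rank(v)

# Corresponds to digits_rank = 3: the first two values round to 1 and tie
ranks_rounded <- rank(signif(v, 3))
```

Here ranks_full is 1, 2, 3, whereas ranks_rounded is 1.5, 1.5, 3. Since ties rule out exact computation, a finite digits_rank can therefore change which method is used.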
Mann, H. B. & Whitney, D. R. (1947). On a Test of Whether one of Two Random Variables is Stochastically Larger than the Other. Ann. Math. Statist., 18(1), pp. 50-60. doi:10.1214/aoms/1177730491
Hollander, M. & Wolfe, D. (1973). Nonparametric Statistical Methods. Third Edition. New York: Wiley. pp. 115-135. doi:10.1002/9781119196037
stats::wilcox.test(), pwilcox, wilcox_test_pv()
# Constructing two samples
set.seed(1)
r1 <- rnorm(100)
r2 <- rnorm(100, 1)
# Exact two-sided p-values and their supports
results_ex <- mann_whitney_test_pv(r1, r2)
print(results_ex)
results_ex$get_pvalues()
results_ex$get_pvalue_supports()
# Normal-approximated one-sided p-values ("less") and their supports
results_ap <- mann_whitney_test_pv(r1, r2, alternative = "less", exact = FALSE)
print(results_ap)
results_ap$get_pvalues()
results_ap$get_pvalue_supports()