rstatix (version 0.7.2)

wilcox_effsize: Wilcoxon Effect Size


Compute Wilcoxon effect size (r) for:

  • one-sample test (Wilcoxon one-sample signed-rank test);

  • paired two-samples test (Wilcoxon two-sample paired signed-rank test) and

  • independent two-samples test (Mann-Whitney, two-sample rank-sum test).

It can also return confidence intervals by bootstrap.

The effect size r is calculated as the Z statistic divided by the square root of the sample size N (\(r = Z/\sqrt{N}\)). The Z value is extracted from either coin::wilcoxsign_test() (one-sample or paired-samples test) or coin::wilcox_test() (independent two-samples test).

Note that N corresponds to the total sample size for the independent-samples test and to the total number of pairs for the paired-samples test.
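To make the formula concrete, here is a minimal base R sketch for the independent two-samples case. The helper name rank_sum_r is hypothetical and not part of rstatix. It assumes the usual normal approximation to the rank-sum statistic with a tie correction and takes the absolute value of Z, so the result should be close to, but is not guaranteed to match, the value derived from coin::wilcox_test().

```r
# Hypothetical helper (not part of rstatix): r = |Z| / sqrt(N) for two
# independent samples, using the normal approximation to the rank-sum test.
rank_sum_r <- function(x, y) {
  n1 <- length(x)
  n2 <- length(y)
  n  <- n1 + n2                       # N = total sample size
  ranks <- rank(c(x, y))              # mid-ranks are used for ties
  # Mann-Whitney statistic for the first sample
  w <- sum(ranks[seq_len(n1)]) - n1 * (n1 + 1) / 2
  ties <- table(ranks)
  # Tie-corrected variance of the statistic under the null hypothesis
  v <- n1 * n2 / 12 * ((n + 1) - sum(ties^3 - ties) / (n * (n - 1)))
  z <- (w - n1 * n2 / 2) / sqrt(v)
  abs(z) / sqrt(n)
}

oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
vc <- ToothGrowth$len[ToothGrowth$supp == "VC"]
r <- rank_sum_r(oj, vc)
r
```

For a paired design the same division applies, but N would be the number of pairs rather than the total number of observations.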

The r value varies from 0 to close to 1. The interpretation values for r commonly used in the published literature and on the internet are: 0.10 - < 0.3 (small effect), 0.30 - < 0.5 (moderate effect) and >= 0.5 (large effect).
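The thresholds above can be sketched as a small lookup. The function name label_magnitude is hypothetical and only illustrates the cut-offs quoted in the text; it is not claimed to be how rstatix fills its magnitude column. Values below 0.10 are given no label in the text, so this sketch returns NA for them.

```r
# Hypothetical helper: map r values to the magnitude labels quoted above.
label_magnitude <- function(r) {
  cut(abs(r),
      breaks = c(0.1, 0.3, 0.5, Inf),
      labels = c("small", "moderate", "large"),
      right = FALSE)   # intervals are [0.1, 0.3), [0.3, 0.5), [0.5, Inf)
}

labels <- label_magnitude(c(0.05, 0.12, 0.30, 0.62))
as.character(labels)
```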


wilcox_effsize(
  data,
  formula,
  comparisons = NULL,
  ref.group = NULL,
  paired = FALSE,
  alternative = "two.sided",
  mu = 0,
  ci = FALSE,
  conf.level = 0.95,
  ci.type = "perc",
  nboot = 1000,
  ...
)


Returns a data frame with some of the following columns:

  • .y.: the y variable used in the test.

  • group1,group2: the compared groups in the pairwise tests.

  • n,n1,n2: Sample counts.

  • effsize: estimate of the effect size (r value).

  • magnitude: magnitude of effect size.

  • conf.low,conf.high: lower and upper bound of the effect size confidence interval.
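As a quick way to see which of these columns appear, the following sketch requests bootstrap confidence intervals on the built-in ToothGrowth data. It assumes the rstatix, coin and boot packages are installed. The seed and the reduced nboot are arbitrary choices made here to keep the run short and repeatable.

```r
library(rstatix)

set.seed(123)   # bootstrap resampling is random; fix the seed for repeatability
res <- wilcox_effsize(ToothGrowth, len ~ supp, ci = TRUE, nboot = 200)

# With ci = TRUE the confidence bounds are returned next to the estimate
names(res)
```

Without ci = TRUE, the conf.low and conf.high columns are not expected in the result.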



The function takes the following arguments:

  • data: a data.frame containing the variables in the formula.

  • formula: a formula of the form x ~ group, where x is a numeric variable giving the data values and group is a factor with one or multiple levels giving the corresponding groups. For example, formula = TP53 ~ cancer_group.

  • comparisons: a list of length-2 vectors specifying the groups of interest to be compared. For example, to compare groups "A" vs "B" and "B" vs "C", the argument is as follows: comparisons = list(c("A", "B"), c("B", "C")).

  • ref.group: a character string specifying the reference group. If specified, for a given grouping variable, each of the group levels will be compared to the reference group (i.e. control group). If ref.group = "all", pairwise two-sample tests are performed comparing each level of the grouping variable against all (i.e. basemean).

  • paired: a logical indicating whether you want a paired test.

  • alternative: a character string specifying the alternative hypothesis; must be one of "two.sided" (default), "greater" or "less". You can specify just the initial letter.

  • mu: a number specifying an optional parameter used to form the null hypothesis.

  • ci: if TRUE, returns confidence intervals by bootstrap. May be slow.

  • conf.level: the level for the confidence interval.

  • ci.type: the type of confidence interval to use. Can be any of "norm", "basic", "perc", or "bca". Passed to boot::boot.ci().

  • nboot: the number of replications to use for bootstrap.

  • ...: additional arguments passed to coin::wilcoxsign_test() (one-sample or paired-samples test) or coin::wilcox_test() (independent two-samples test).
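The comparisons and reference-group arguments described above can be illustrated with a short sketch on the built-in ToothGrowth data. It assumes rstatix and coin are installed. Converting dose to a factor is a choice made here so that the group labels "0.5", "1" and "2" are explicit.

```r
library(rstatix)

# dose is numeric in ToothGrowth; a factor makes the group labels explicit
df <- ToothGrowth
df$dose <- as.factor(df$dose)

# Restrict the output to the two listed pairs of dose levels
res_pairs <- wilcox_effsize(
  df, len ~ dose,
  comparisons = list(c("0.5", "1"), c("1", "2"))
)

# Compare every other dose level against the reference level "0.5"
res_ref <- wilcox_effsize(df, len ~ dose, ref.group = "0.5")

res_pairs
res_ref
```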


Maciej Tomczak and Ewa Tomczak. The need to report effect size estimates revisited. An overview of some recommended measures of effect size. Trends in Sport Sciences. 2014; 1(21):19-25.



# The examples assume that rstatix is attached
library(rstatix)

# One-sample Wilcoxon test effect size
ToothGrowth %>% wilcox_effsize(len ~ 1, mu = 0)

# Independent two-samples Wilcoxon effect size
ToothGrowth %>% wilcox_effsize(len ~ supp)

# Paired-samples Wilcoxon effect size
ToothGrowth %>% wilcox_effsize(len ~ supp, paired = TRUE)

# Pairwise comparisons
ToothGrowth %>% wilcox_effsize(len ~ dose)

# Grouped data
ToothGrowth %>%
  group_by(supp) %>%
  wilcox_effsize(len ~ dose)

