srt: Signed-rank test

Description

Performs Wilcoxon's signed-rank test.

Usage

srt(
  data,
  formula,
  conf_level = 0,
  conf_method = "inversion",
  n_resamples = 1000L,
  alternative = "two.sided",
  mu = 0,
  distribution = "auto",
  correct = TRUE,
  zero_method = "wilcoxon",
  agg_fun = "error",
  digits_rank = Inf,
  tol_root = 1e-04
)
srt2(
  x,
  y = NULL,
  conf_level = 0,
  conf_method = "inversion",
  n_resamples = 1000L,
  alternative = "two.sided",
  mu = 0,
  distribution = "auto",
  correct = TRUE,
  zero_method = "wilcoxon",
  digits_rank = Inf,
  tol_root = 1e-04
)

Value

A list with the following elements:

Slot	Subslot	Name	Description
1		`p_value`	p-value.
2		`statistic`	Test statistic. $W^+$ for the exact Wilcoxon signed-rank distribution. $Z$ for the asymptotic normal approximation.
3		`pseudomedian`	Measure of centrality.
4		`lower`	Lower bound of confidence interval for the pseudomedian. `NULL` if no CI requested.
5		`upper`	Upper bound of confidence interval for the pseudomedian. `NULL` if no CI requested.
6		`method`	Test method.
7		`info`	Additional test information.
7	1	`p_value_method`	Method used to calculate the p-value.
7	2	`pseudomedian_method`	Method used to calculate the pseudomedian.
7	3	`conf_method`	Method used to calculate the confidence interval.
7	4	`conf_level_achieved`	Achieved confidence level.
7	5	`n_sample`	Number of observations in the original data.
7	6	`n_analytic`	Number of observations after removing non-finite values from the original data.
7	7	`n_zeros`	Number of zeros among differences in the analytic data set.
7	8	`n_signed`	Number of nonzero differences in the analytic data set.
7	9	`n_ties`	Number of tied ranks after ranking the absolute differences.
7	10	`data_type`	Data type.
7	11	`focal_name`	Name of the focal variable (differences are focal - reference).
7	12	`reference_name`	Name of the reference variable (differences are focal - reference).
8		`call`	A named list of the function's arguments (use `as.call()` to convert to a call; `call$distribution` may be updated from `"exact"` to `"asymptotic"`).

Arguments

data

(data.frame)
The data frame of interest.

formula

(formula)
A formula of form:

y ~ group | block: Use when data is in tall format. y is the numeric outcome, group is the binary grouping variable, and block is the subject/item-level variable indicating pairs of observations. group will be converted to a factor and the first level will be the reference value. For example, when levels(data$group) <- c("pre", "post"), the focal level is 'post', so differences are post - pre. Pairs with non-finite values (infinite or missing) are silently dropped. See agg_fun for handling duplicate cases of grouping/blocking combinations.

y ~ x

Use when data is in wide format. y and x must be numeric vectors. Differences are calculated as data$y - data$x. Pairs with non-finite values (infinite or missing) are silently dropped.

~ x

Use when data$x represents pre-calculated differences or for the one-sample case. Values with non-finite values (infinite or missing) are silently dropped.

conf_level

(Scalar numeric: [0, 1))
The confidence level. Typically 0.95. If 0 (default), no confidence interval is calculated.

conf_method

(Scalar character: c("inversion", "percentile", "bca"))
The type of confidence interval. If "inversion" (default), the bounds are computed by inverting the hypothesis test. If "percentile", the bounds are computed using a percentile bootstrap. If "bca", the bounds are computed using a bias-corrected and accelerated (BCa) bootstrap.

n_resamples

(Scalar integer: 1000L; [10, Inf))
The number of bootstrap resamples. Only used if "percentile" or "bca" confidence intervals are chosen.

alternative

(Scalar character: c("two.sided", "greater", "less"))
The alternative hypothesis. Must be one of "two.sided" (default), "greater", or "less".

(Scalar numeric: 0; (-Inf, Inf))
Under the null hypothesis, x or x - y is assumed to be symmetric around mu.

distribution

(Scalar character: c("auto", "exact", "asymptotic"))
The method used to calculate the p-value. If "auto" (default), an appropriate method will automatically be chosen (distribution = "exact" when n < 50 or distribution = "asymptotic" otherwise). If "exact", the exact Wilcoxon signed-rank distribution is used. If "asymptotic", the asymptotic normal approximation is used.

correct

(Scalar logical: c(TRUE, FALSE))
Whether or not to apply a continuity correction to the Z-statistic for the asymptotic approximation of the p-value.

zero_method

(Scalar character: c("wilcoxon", "pratt"))
The method used to handle differences equal to zero. If "wilcoxon" (default), zeros are removed prior to ranking (classic Wilcoxon convention). If "pratt", zeros are retained for ranking but excluded from the signed-rank sum.

agg_fun

(Scalar character or function: "error")
Used for aggregating duplicate cases of grouping/blocking combinations when data is in tall format and formula has structure y ~ group | block. "error" (default) will return an error if duplicate grouping/blocking combinations are encountered. Select one of "first", "last", "sum", "mean", "median", "min", or "max" for built in aggregation handling (each applies na.rm = TRUE). Or define your own function. For example, myfun <- function(x) {as.numeric(quantile(x, 0.75, na.rm = TRUE))}.

digits_rank

(Scalar integer: Inf; (0, Inf])
Controls ranking precision. If finite, ranks are computed from base::signif(abs(diffs), digits_rank). If Inf (default), ranks are computed from abs(diffs). Smaller values may introduce ties (because they no longer depend on extremely small numeric differences) and thus change averaged ranks and tie counts.

tol_root

(Scalar numeric: 1e-4; (0, Inf))
For stats::uniroot(tol=tol_root) calls when conf_level > 0 and distribution = "asymptotic".

(numeric)
Numeric vector of data. Values with non-finite values (infinite or missing) are silently dropped.

(numeric: NULL)
Numeric vector of data or NULL. If NULL (default), a one-sample test is performed using x. If numeric, differences are calculated as x - y. Pairs with non-finite values (infinite or missing) are silently dropped.

Details

The procedure for Wilcoxon's signed-rank test is as follows:

For one-sample data x or paired samples x and y, where mu is the measure of center under the null hypothesis, define the values used for analysis as (x - mu) or (x - y - mu).
Define 'zero' values as (x - mu == 0) or (x - y - mu == 0).
- zero_method = "wilcoxon": Remove values equal to zero.
- zero_method = "pratt": Keep values equal to zero.
Order the absolute values from smallest to largest.
Assign ranks $1, 2, \dots, n$ to the ordered absolute values, using mean rank for ties.
- zero_method = "pratt": remove values equal to zero and their corresponding ranks.
Calculate the test statistic.

Calculate $W^+$ as the sum of the ranks for positive values. Let $r_i$ denote the absolute-value rank of the $i$-th observation after applying the chosen zero handling and ranking precision and $v_i$ denote the values used for analysis. Then $$W^+ = \sum_{i : v_i > 0} r_i.$$ $W^+ + W^- = n(n+1)/2$, thus either can be calculated from the other. If the null hypothesis is true, $W^+$ and $W^-$ are expected to be equal.
- distribution = "exact": Use $W^+$ which takes values between $0$ and $n(n+1)/2$.
- distribution = "asymptotic": Use the standardized test statistic $Z=\frac{W^+ - E_0(W^+) - cc}{Var_0(W^+)^{1/2}}$ where $Z \sim N(0, 1)$ asymptotically.
  - zero_method = "wilcoxon": Under the null hypothesis, the expected mean and variance are $$ \begin{aligned} E_0(W^+) &= \frac{n(n+1)}{4} \\ Var_0(W^+) &= \frac{n(n+1)(2n+1)}{24} - \frac{\sum(t^3-t)}{48}, \end{aligned} $$ where $t$ is the number of ties for each unique ranked absolute value.
  - zero_method = "pratt": Under the null hypothesis, the expected mean and variance are $$ \begin{aligned} E_0(W^+) &= \frac{n(n+1)}{4} - \frac{n_{zeros}(n_{zeros}+1)}{4} \\ Var_0(W^+) &= \frac{n(n+1)(2n+1) - n_{zeros}(n_{zeros}+1)(2n_{zeros}+1)}{24} - \frac{\sum(t^3-t)}{48}, \end{aligned} $$ where $t$ is the number of ties for each unique ranked absolute value.
  - correct = TRUE: The continuity correction is defined by: $$ cc = \begin{cases} 0.5 \times \text{sign}(W^+ - E_0(W^+)) & \text{two-sided alternative} \\ 0.5 & \text{greater than alternative} \\ -0.5 & \text{less than alternative.} \end{cases} $$
  - correct = FALSE: Set $cc = 0$.
Alternatively, $E_0(W^+)$ and $Var_0(W^+)$ can be calculated without need for closed form expressions that include zero correction and/or tie correction. Consider each rank $r_i$ (with averaged rank for ties) as a random variable $X_i$ that contributes to $W^+$. Under the null hypothesis, each rank has 50% chance of being positive or negative. So $X_i$ can take values $$ X_i = \begin{cases} r_i & \text{with probability } p = 0.5 \\ 0 & \text{with probability } 1 - p = 0.5. \end{cases} $$ Using $Var(X_i) = E(X_i^2) - E(X_i)^2$ where $$ \begin{aligned} E(X_i) &= p \cdot r_i + (1 - p) \cdot 0 \\ &= 0.5r_i \\ E(X_i^2) &= p \cdot r_i^2 + (1 - p) \cdot 0^2 \\ &= 0.5r_i^2 \\ E(X_i)^2 &= (0.5r_i)^2, \end{aligned} $$ it follows that $Var(X_i) = 0.5r_i^2 - (0.5r_i)^2 = 0.25r_i^2.$ Hence, $E_0(W^+) = \frac{\sum r_i}{2}$ and $Var_0(W^+) = \frac{\sum r_i^2}{4}.$
Calculate the p-value.
- distribution = "exact"
  - No zeros among the differences and ranks are tie free
    - Use the Wilcoxon signed-rank distribution as implemented in stats::psignrank() to calculate the probability of being as or more extreme than $W^+.$
  - Zeros present and/or ties present
    - Use the Shift-Algorithm from streitberg1984;textualrankdifferencetest as implemented in exactRankTests::pperm().
- distribution = "asymptotic"
  - Use the standard normal distribution as implemented in stats::pnorm() to calculate the probability of being as or more extreme than $Z.$

Hypotheses and test assumptions

The signed-rank test hypotheses are stated as:

Null: (x) or (x - y) is centered at mu.
Two-sided alternative: (x) or (x - y) is not centered at mu.
Greater than alternative: (x) or (x - y) is centered at a value greater than mu.
Less than alternative: (x) or (x - y) is centered at a value less than mu.

The signed-rank test traditionally assumes the differences are independent with identical, continuous, and symmetric distribution. However, not all of these assumptions may be required pratt1981rankdifferencetest. The 'identically distributed' assumption is not required, keeping the level of test as expected for the hypotheses as stated above. The symmetry assumption is not required when using one-sided alternative hypotheses. For example:

Null: (x) or (x - y) is symmetric and centered at mu.
Greater than alternative: (x) or (x - y) is stochastically larger than mu.
Less than alternative: (x) or (x - y) is stochastically smaller than mu.

Zero handling

zero_method = "pratt" uses the method by pratt1959;textualrankdifferencetest, which first rank-transforms the absolute values, including zeros, and then removes the ranks corresponding to the zeros. zero_method = "wilcoxon" uses the method by wilcoxon1950;textualrankdifferencetest, which first removes the zeros and then rank-transforms the remaining absolute values. conover1973;textualrankdifferencetest found that when comparing a discrete uniform distribution to a distribution where probabilities linearly increase from left to right, Pratt's method outperforms Wilcoxon's. When testing a binomial distribution centered at zero to see whether the parameter of each Bernoulli trial is $\frac{1}{2}$, Wilcoxon's method outperforms Pratt's.

Pseudomedians

Hodges-Lehmann estimator

The Hodges-Lehmann estimator is the median of all pairwise averages of the sample values. $$\mathrm{HL} = \mathrm{median} \left\{ \frac{x_i + x_j}{2} \right\}_{i \le j}$$ This pseudomedian is a robust, distribution-free estimate of central tendency for a single sample, or a location-shift estimator for paired data. It's resistant to outliers and compatible with rank-based inference. This statistic is returned when conf_level = 0 (for all test methods) or when confidence intervals are requested for an exact Wilcoxon test with zero-free data and tie-free ranks.

Midpoint around expected signed-rank statistic (exact CI-centered estimator)

The exact test for data which contains zeros or whose ranks contain ties uses the Streitberg-Rohmel Shift-algorithm. When a confidence interval is requested under this scenario, the estimated pseudomedian is a midpoint around the expected rank sum. This midpoint is the average of the largest shift value $d$ whose signed-rank statistic does not exceed the null expectation and the smallest $d$ whose statistic exceeds it.

In detail, let $W^+(d)$ be the Wilcoxon signed-rank sum at shift $d$, and let $E_0$ denote the null expectation (e.g., $\sum r_i / 2$ when ranks are $r_i$). Then $$\hat{d}_{\mathrm{mid}} = \frac{1}{2} \left( \min\{ d : W^+(d) \le \lceil E_0 \rceil \} + \max\{ d : W^+(d) > E_0 \} \right)$$

This pseudomedian is a discrete-compatible point estimate that centers the exact confidence interval obtained by inverting the exact signed-rank distribution. It may differ from the Hodges-Lehmann estimator when data are tied or rounded.

Root of standardized signed-rank statistic (asymptotic CI-centered estimator)

A similar algorithm is used to estimate the pseudomedian when a confidence interval is requested under the asymptotic test scenario. This pseudomedian is the value of the shift $d$ for which the standardized signed-rank statistic equals zero under the asymptotic normal approximation.

In detail, let $W^+(d)$ be the signed-rank sum, with null mean $E_0(d)$ and null variance $\mathrm{Var}_0(d)$ (with possible tie and continuity corrections). Define $$Z(d) = \frac{W^+(d) - E_0(d)}{\sqrt{\mathrm{Var}_0(d)}}.$$ The pseudomedian is the root $$\hat{d}_{\mathrm{root}} = \{ d : Z(d) = 0 \}.$$

This pseudomedian is a continuous-compatible point estimate that centers the asymptotic confidence interval. It's the solution to the test-inversion equation under a normal approximation. It's sensitive to tie/zero patterns through $\mathrm{Var}_0(d)$, may include a continuity correction, and is not guaranteed to equal the Hodges-Lehmann estimator or the exact midpoint.

Confidence intervals

Exact Wilcoxon confidence interval

The exact Wilcoxon confidence interval is obtained by inverting the exact distribution of the signed-rank statistic. It uses the permutation distribution of the Wilcoxon statistic and finds bounds where cumulative probabilities cross $\alpha/2$ and $1-\alpha/2$. Endpoints correspond to quantiles from stats::qsignrank(). This interval guarantees nominal coverage under the null hypothesis without relying on asymptotic approximations. It respects discreteness of the data and may produce conservative intervals when the requested confidence level is not achievable (with warnings).

Exact Wilcoxon confidence interval using the Shift-algorithm

The exact Wilcoxon confidence interval using the Shift-algorithm is obtained by enumerating all candidate shifts and inverting the exact signed-rank distribution. In detail, it generates all pairwise averages $\frac{x_i + x_j}{2}$, evaluates the signed-rank statistic for each candidate shift, and determines bounds using exactRankTests::pperm() and exactRankTests::qperm().

Asymptotic Wilcoxon confidence interval

The asymptotic Wilcoxon confidence interval is obtained by inverting the asymptotic normal approximation of the signed-rank statistic. In detail, Define a standardized statistic: $$Z(d) = \frac{W^+(d) - E_0(d) - cc}{\sqrt{\mathrm{Var}_0(d)}}$$ where $W^+(d)$ is the signed-rank sum at shift $d$. Then solve for $d$ such that $Z(d)$ equals the normal quantiles using stats::uniroot(). This interval may not be reliable for small samples or heavily tied data.

Bootstrap confidence intervals

The percentile and BCa bootstrap confidence interval methods are described in chapter 5.3 of davison1997;textualrankdifferencetest.

History

stats::wilcox.test() is the canonical function for the Wilcoxon signed-rank test. exactRankTests::wilcox.exact() implemented the Streitberg-Rohmel Shift-algorithm for exact inference when zeros and/or ties are present. coin::coin superseded exactRankTests and includes additional methods. srt() reimplements these functions so the best features of each is available in a fast and easy to use format.

References

wilcoxon1945rankdifferencetest

wilcoxon1950rankdifferencetest

pratt1981rankdifferencetest

pratt1959rankdifferencetest

cureton1967rankdifferencetest

conover1973rankdifferencetest

hollander2014rankdifferencetest

bauer1972rankdifferencetest

streitberg1984rankdifferencetest

hothorn2001rankdifferencetest

hothorn2002rankdifferencetest

hothorn2008rankdifferencetest

davison1997rankdifferencetest

exactRankTestsrankdifferencetest

Rrankdifferencetest

Examples

Run this code

#----------------------------------------------------------------------------
# srt() example
#----------------------------------------------------------------------------
library(rankdifferencetest)

# Use example data from Kornbrot (1990)
data <- kornbrot_table1

# Create tall-format data for demonstration purposes
data_tall <- reshape(
  data = kornbrot_table1,
  direction = "long",
  varying = c("placebo", "drug"),
  v.names = c("time"),
  idvar = "subject",
  times = c("placebo", "drug"),
  timevar = "treatment",
  new.row.names = seq_len(prod(length(c("placebo", "drug")), nrow(kornbrot_table1)))
)

# Subject and treatment should be factors. The ordering of the treatment factor
# will determine the difference (placebo - drug).
data_tall$subject <- factor(data_tall$subject)
data_tall$treatment <- factor(data_tall$treatment, levels = c("drug", "placebo"))

# Rate transformation inverts the rank ordering.
data$placebo_rate <- 60 / data$placebo
data$drug_rate <- 60 / data$drug
data_tall$rate <- 60 / data_tall$time

# In contrast to the rank difference test, the Wilcoxon signed-rank test
# produces differing results. See table 1 and table 2 (page 245) in
# Kornbrot (1990).
## Divide p-value by 2 for one-tailed probability.
srt(
  data = data,
  formula = placebo ~ drug,
  alternative = "two.sided",
  distribution = "asymptotic",
  zero_method = "wilcoxon",
  correct = TRUE,
  conf_level = 0.95
)

srt(
  data = data_tall,
  formula = rate ~ treatment | subject,
  alternative = "two.sided",
  distribution = "asymptotic",
  zero_method = "wilcoxon",
  correct = TRUE,
  conf_level = 0.95
)

Run the code above in your browser using DataLab