Biology, images, analysis, design...
"It has long been an axiom of mine that the little things are infinitely the most important" (Sherlock Holmes)


The Wilcoxon signed-ranks test is a non-parametric equivalent of the paired t-test. It is most commonly used to test for a difference in the mean (or median) of paired observations - whether measurements on pairs of units or before and after measurements on the same unit. It can also be used as a one-sample test to test whether a particular sample came from a population with a specified median.

Unlike the t-test, the signed-ranks test does not require the paired differences to follow a normal distribution. But if you wish to test the median (= mean) difference, the distribution on each side of the median must have a similar shape. In other words, the distribution of the differences must be symmetrical. If the distribution of the differences is not symmetrical, you can only test the null hypothesis that the Hodges-Lehmann estimate of the median difference is zero. Unlike most rank tests, the outcome of this test is affected by transforming the data before ranking, because the differences are ranked in order of their absolute size. It may thus be worth plotting the distribution of the differences after an appropriate transformation (for example logarithmic) to see if it makes the distribution appear more symmetrical.
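To illustrate why a transformation can change the outcome, here is a minimal Python sketch. The data values and the helper name `s_plus` are hypothetical, and the helper assumes there are no tied or zero differences. It ranks the absolute differences and sums the ranks of the positive ones, first on the raw scale and then after a logarithmic transformation of the raw data.

```python
import math

def s_plus(before, after):
    """Sum of positive signed ranks of (after - before); assumes no ties or zeros."""
    diffs = [a - b for b, a in zip(before, after)]
    # Rank the differences by absolute size (rank 1 = smallest).
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = {i: rank for rank, i in enumerate(order, start=1)}
    return sum(ranks[i] for i, d in enumerate(diffs) if d > 0)

# Hypothetical before/after measurements spanning several orders of magnitude.
before = [10, 110, 1000]
after = [12, 100, 1005]

raw = s_plus(before, after)
logged = s_plus([math.log(v) for v in before], [math.log(v) for v in after])
print(raw, logged)  # 3 4
```

The sum of positive ranks differs between the two scales because the logarithm changes the relative sizes of the differences, and hence their ranks. Three pairs are far too few for a real test; the example only shows the mechanism.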

A signed-ranks test on paired samples is slightly less powerful than the paired t-test (relative efficiency is about 95%) provided the differences are normally distributed. If they are not, and cannot be transformed so that they are, a paired t-test is not appropriate and the non-parametric test should be used.




  1. Calculate the difference (Di) between each pair of observations and note its sign.
  2. Examine the distribution of the differences. If it is skewed, a logarithmic transformation of the (raw) data may make the distribution of the differences more symmetrical.
  3. Rank the differences in order of absolute size with a rank of 1 assigned to the smallest difference.
    Differences of zero are (usually) dropped from the analysis. Tied differences are each assigned the mean of the ranks they would have been given had they not been tied.
  4. Reassign the signs of the differences to their respective ranks (Ri).
  5. Then calculate the appropriate test statistic.
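The ranking steps above can be sketched in Python as follows. This is only an illustration: the function name `signed_ranks` and the before/after values are hypothetical, zero differences are dropped (the usual convention), and tied absolute differences receive the mean of the ranks they span.

```python
def signed_ranks(x, y):
    """Return the signed ranks Ri of the non-zero paired differences x_i - y_i."""
    # Paired differences; zero differences are dropped.
    diffs = [a - b for a, b in zip(x, y) if a != b]
    # Order the differences by absolute size (smallest first).
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    start = 0
    while start < len(order):
        # Find the run of tied absolute differences beginning at 'start'.
        end = start
        while (end + 1 < len(order)
               and abs(diffs[order[end + 1]]) == abs(diffs[order[start]])):
            end += 1
        # Tied values share the mean of the ranks they occupy (ranks are 1-based).
        mean_rank = (start + 1 + end + 1) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    # Reattach the sign of each difference to its rank.
    return [r if d > 0 else -r for d, r in zip(diffs, ranks)]

# Hypothetical before/after measurements on ten units.
before = [125, 115, 130, 140, 140, 115, 140, 125, 140, 135]
after = [110, 122, 125, 120, 140, 124, 123, 137, 135, 145]
print(signed_ranks(before, after))
# [7.0, -3.0, 1.5, 9.0, -4.0, 8.0, -6.0, 1.5, -5.0]
```

The fifth pair has a zero difference and is dropped, and the two differences of 5 share rank 1.5.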

Sum of ranks statistic (small samples with no ties):

  1. Calculate S+ and S−, which are the sums of the positive and negative ranks respectively.

    Algebraically speaking -

    S+    =    ΣRi where Di is positive
    S−    =    ΣRi where Di is negative

  2. The choice of test statistic depends on how you are obtaining the critical values.

    • If you are using R, then S+ is the test statistic (denoted in R as V).
    • If you are using tables (e.g. Table in Siegel or Table B12 in Zar), then for a two-tailed test reject the null hypothesis if the absolute value of either S+ or S− is less than or equal to the critical value given in the table. For a one-tailed test with H1 that the median of population one is greater than that of population two (with differences taken as population one minus population two), reject H0 if the absolute value of S− is less than or equal to the tabulated one-tailed value. For a one-tailed test with H1 that the median of population two is greater than that of population one, reject H0 if S+ is less than or equal to the tabulated one-tailed value.
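As a rough illustration of where such critical values come from, the following Python sketch computes the two rank sums and an exact two-tailed P-value by enumerating every possible assignment of signs to the ranks. The function names and the signed ranks are hypothetical, and the brute-force enumeration is an assumption chosen for clarity (it is not necessarily how R or the published tables obtain their values), so it is only practical for small n.

```python
from itertools import product

def rank_sums(signed):
    """Return (S+, |S-|): sums of the positive and of the negative ranks."""
    s_plus = sum(r for r in signed if r > 0)
    s_minus = sum(-r for r in signed if r < 0)
    return s_plus, s_minus

def exact_two_tailed_p(signed):
    """Exact P under H0, where each rank is equally likely to be + or -."""
    ranks = [abs(r) for r in signed]
    total = sum(ranks)
    observed = min(rank_sums(signed))
    count = 0
    # Enumerate all 2^n sign assignments (1 = positive, 0 = negative).
    for signs in product((0, 1), repeat=len(ranks)):
        s_plus = sum(r for r, s in zip(ranks, signs) if s)
        # Count assignments at least as extreme as the observed one.
        if min(s_plus, total - s_plus) <= observed:
            count += 1
    return count / 2 ** len(ranks)

# Hypothetical signed ranks for six pairs (no ties, no zero differences).
signed = [-1, 2, 3, 4, 5, 6]
print(rank_sums(signed))           # (20, 1)
print(exact_two_tailed_p(signed))  # 0.0625
```

Because the function works on the observed ranks themselves, it also accepts mean ranks arising from tied differences.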

Large sample normal approximation

The large sample approximation is only appropriate for n > 20. However, it is still commonly used for smaller sample sizes if there are ties in the data. This is because table values for exact tests are only valid for untied data. Nowadays software (including R) is available to carry out exact tests even when there are tied data, so there is no longer any justification for using the normal approximation on small samples simply because ties are present.

Algebraically speaking -

For no ties -

$$z = \frac{\sum R_i}{\sqrt{n(n + 1)(2n + 1)/6}}$$

For when ties are present -

$$z = \frac{\sum R_i}{\sqrt{\sum R_i^2}}$$

where:
  • z is tested against the standard normal deviate Z
  • n is the number of pairs
  • Ri is the (signed) rank of the absolute difference between the ith pair of values - where its sign is the same as that of the ith difference.
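A minimal Python sketch of the normal approximation follows, assuming the signed ranks have already been obtained. The function names and the 24 signed ranks are hypothetical. No continuity correction is applied, so results may differ slightly from software that applies one by default (as R's `wilcox.test` does).

```python
import math

def z_no_ties(signed):
    """z = sum(Ri) / sqrt(n(n + 1)(2n + 1)/6), valid when there are no ties."""
    n = len(signed)
    return sum(signed) / math.sqrt(n * (n + 1) * (2 * n + 1) / 6)

def z_with_ties(signed):
    """z = sum(Ri) / sqrt(sum(Ri^2)), which allows for mean ranks from ties."""
    return sum(signed) / math.sqrt(sum(r * r for r in signed))

def two_tailed_p(z):
    """Two-tailed P from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical signed ranks for n = 24 pairs: every fourth rank is negative.
signed = [-r if r % 4 == 0 else r for r in range(1, 25)]
z = z_no_ties(signed)
print(round(z, 4), round(two_tailed_p(z), 4))
```

With untied ranks 1 to n, the sum of squared ranks equals n(n + 1)(2n + 1)/6, so the two functions give the same z.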

Where data are tied such that there is no difference within a pair then, although its signed rank (Ri) is zero, its presence increases the rank of all other pairs by one. In that case, whilst E(Ri) remains zero, ΣRi² will give a biased estimate of the variance of ΣRi, so this method must assume that the members of a pair are not tied. Since this approximation also assumes there are few ties between differences, it may be unreliable when applied to strongly discrete data - and most especially if those data are highly skewed.


Confidence interval to the median difference

The differences between pairs of observations are first arranged in rank order. A triangular matrix of the Walsh averages (the means of all possible pairs of differences, including each difference paired with itself) is then constructed. The Hodges-Lehmann estimate of the median difference is given by the median of these values.

The upper and lower 95% confidence limits to this median are obtained by counting in a specified number of Walsh averages from each end of the array. The required number of averages is given by the quantile of the Wilcoxon matched-pairs signed-ranks statistic for n observations at P = 0.025.
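The procedure can be sketched in Python as below. The function names and the differences are hypothetical, and the sketch assumes there are no tied or zero differences, so that the exact null distribution of the signed-ranks statistic for ranks 1 to n applies. The number of Walsh averages to count in from each end is taken as the smallest k for which P(S+ ≤ k) is at least 0.025.

```python
from statistics import median

def walsh_averages(diffs):
    """Sorted means of all pairs (Di, Dj) with i <= j, including i == j."""
    d = sorted(diffs)
    return sorted((d[i] + d[j]) / 2
                  for i in range(len(d)) for j in range(i, len(d)))

def signed_rank_counts(n):
    """counts[s] = number of sign assignments of ranks 1..n with S+ equal to s."""
    counts = [1] + [0] * (n * (n + 1) // 2)
    for r in range(1, n + 1):
        for s in range(len(counts) - 1, r - 1, -1):
            counts[s] += counts[s - r]
    return counts

def hodges_lehmann_ci(diffs, alpha=0.05):
    """Return (estimate, lower limit, upper limit) for the median difference."""
    n = len(diffs)
    walsh = walsh_averages(diffs)
    # Smallest k with P(S+ <= k) >= alpha/2 under the exact null distribution.
    target = alpha / 2 * 2 ** n
    cumulative, k = 0, 0
    for k, c in enumerate(signed_rank_counts(n)):
        cumulative += c
        if cumulative >= target:
            break
    # For very small n the nominal level cannot be reached; use the full range.
    k = max(k, 1)
    # Count k Walsh averages in from each end of the sorted array.
    return median(walsh), walsh[k - 1], walsh[len(walsh) - k]

# Hypothetical paired differences (no ties, no zeros).
diffs = [15, -7, 5, 20, -9, 17, -12, 4, -10, 3]
print(len(walsh_averages(diffs)))  # 55
print(hodges_lehmann_ci(diffs))
```

For n = 10 the interval runs from the 9th smallest to the 9th largest of the 55 Walsh averages.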




Assumptions

  1. The paired differences are independent. Note that it is not assumed that the two samples are independent of each other - indeed they should be related such as with matched pairs in a case control study, or before and after measurements on the same unit. But the pairs must be independent - so beware if your data are obtained in a time series or using cluster sampling!
  2. The measurement scale is such that the paired differences can be ranked. It might therefore appear that measurement needs to be on the interval scale of measurement, and some authorities take this to be the case. However, strictly speaking one only needs to know that a difference between a score of 10 and 50 is greater than the difference between a score of 10 and 20 - not that the difference is four times greater (40 rather than 10). Such a scale is intermediate between an ordinal scale and an interval scale and is known as an ordered metric scale.
  3. If you are testing the null hypothesis that the mean (= median) of the paired differences is zero, then the paired differences must all come from a continuous symmetrical distribution. Note that we do not have to assume that the distributions of the original populations are symmetrical - two very positively skewed distributions that differ only by location will produce a set of paired differences that are symmetrical. We also assume that the paired differences all have the same mean (= median). If you are testing the null hypothesis that the Hodges-Lehmann estimate of the median difference is zero, then the assumption of symmetry is not required.
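  4. There must be at least 5 pairs of observations for a one-tailed test at P = 0.05, and at least 6 for a two-tailed test - otherwise the test cannot give a significant result irrespective of the difference between the two populations. This is because the smallest attainable one-tailed P-value with n untied, non-zero differences is 1/2^n.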
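The minimum sample size in the last assumption can be checked with a short Python sketch. The function name `smallest_p` is hypothetical, and the sketch assumes there are no tied or zero differences. The most extreme outcome is that all n differences share the same sign, which under the null hypothesis has probability 1/2^n in one tail.

```python
def smallest_p(n, two_tailed=True):
    """Smallest attainable exact P-value with n untied, non-zero differences."""
    p = 1 / 2 ** n
    return 2 * p if two_tailed else p

for n in range(3, 8):
    print(n, smallest_p(n, two_tailed=False), smallest_p(n))
```

With 5 pairs the smallest one-tailed P-value is 0.03125 but the smallest two-tailed value is 0.0625, so a two-tailed test at P = 0.05 needs at least 6 pairs.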