"It has long been an axiom of mine that the little things are infinitely the most important"
Wilcoxon matched-pairs signed-ranks test
Worked example I
The infection was successfully eradicated using combination therapy in 18 of these patients. The impact of eradication was studied in detail in nine of them: the data below show the number of episodes in these nine patients over a median of 10 months before treatment and 10 months after eradication.
We first examine the distribution of differences to check whether it is symmetrical. This is difficult to judge with such a small sample, but a clearly skewed distribution would give us rather less confidence in our result.
The distribution is clearly not symmetrical, although one could argue that it is not strongly skewed, and that the irregularities simply result from small sample size. We will continue with this test, but bear in mind that conclusions based on a borderline P-value would be unsafe.
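As a rough numerical companion to the plot, one could also compute a sample skewness coefficient for the nine differences (the values listed under "Exact confidence limits"). The sketch below is in Python and is not the method used in this example, which relies on inspecting the distribution by eye. The function name `sample_skewness` is our own.

```python
from statistics import fmean


def sample_skewness(values):
    """Moment coefficient of skewness, g1 = m3 / m2**1.5.

    A value of 0 indicates a symmetric sample; negative values indicate
    a longer left tail, positive values a longer right tail.
    """
    n = len(values)
    mean = fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# The nine pre-minus-post differences from worked example I.
differences = [3, 5, 6, 9, 10, 11, 12, 12, 13]
g1 = sample_skewness(differences)
print(round(g1, 2))
```

For these data the coefficient is about -0.54, a moderate left skew. With only nine observations such a figure is very imprecise, so it supports rather than replaces the cautious reading given above.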
Given we have a small number of observations, we should use an exact method rather than the normal approximation. S+ and S− are 45 and 0 respectively.
Using tables we find that the critical value for n = 9 at P = 0.005 is 2. Since S− is less than this we might be tempted to accept the difference as significant at P < 0.005. However, this result is unreliable because the table values are only accurate if there are no ties, and here two of the differences are tied at 12.
An exact Wilcoxon test that allows for ties is available in a separate package in R, and with it we can still obtain the correct P-value. This is 0.0039, as quoted by the author of the paper. The estimate of the median difference is 8.75 (95% CI: 5.5 - 12.0). We may conclude that there was a significant decline in the number of oedematous episodes post treatment.
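The exact P-value can be checked by brute force, since with nine pairs there are only 2^9 = 512 possible sign assignments. The following Python sketch ranks the absolute differences (using midranks for ties), computes S+ and S−, and enumerates every sign assignment to get the exact two-tailed P-value. It is an illustration under those assumptions, not the R package's implementation, and the function names are ours.

```python
from itertools import product


def signed_rank_sums(differences):
    """Return (S_plus, S_minus, ranks); ties get midranks, zeros are dropped."""
    d = [x for x in differences if x != 0]
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0.0] * len(d)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied absolute differences.
        while j + 1 < len(order) and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        midrank = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = midrank
        i = j + 1
    s_plus = sum(r for x, r in zip(d, ranks) if x > 0)
    s_minus = sum(r for x, r in zip(d, ranks) if x < 0)
    return s_plus, s_minus, ranks


def exact_two_tailed_p(differences):
    """Exact P-value from the full permutation distribution of the rank sum."""
    s_plus, s_minus, ranks = signed_rank_sums(differences)
    observed = min(s_plus, s_minus)
    hits = 0
    # Each rank is equally likely to carry either sign under the null.
    for signs in product((0, 1), repeat=len(ranks)):
        total = sum(r for r, flag in zip(ranks, signs) if flag)
        if total <= observed:
            hits += 1
    # Double the lower tail; the null distribution is symmetric.
    return min(1.0, 2 * hits / 2 ** len(ranks))


differences = [3, 5, 6, 9, 10, 11, 12, 12, 13]
s_plus, s_minus, _ = signed_rank_sums(differences)
p_exact = exact_two_tailed_p(differences)
print(s_plus, s_minus, round(p_exact, 4))
```

Because all nine differences are positive, only one of the 512 sign assignments gives a rank sum as small as S− = 0, so the two-tailed P-value is 2/512, or about 0.0039. Full enumeration is only practical for small n.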
Normal approximation method
This gives a two-tailed P-value of 0.007633.
Using R with the Wilcoxon test in the standard statistics package gives the same value. It may seem surprising that the normal approximation is more conservative than the exact test here (P = 0.0076 against P = 0.0039). This is most likely because the sample size is so small that the normal approximation is unreliable.
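The normal approximation can be sketched in a few lines of Python. We assume here a tie-corrected variance and no continuity correction, since that combination reproduces the P-value quoted above; this is our inference rather than something stated in the text. The function name and the `continuity` flag are ours.

```python
from collections import Counter
from math import erfc, sqrt


def normal_approx_p(differences, continuity=False):
    """Two-tailed P-value for the signed-ranks test via the normal approximation."""
    d = [x for x in differences if x != 0]
    n = len(d)
    abs_d = [abs(x) for x in d]

    def midrank(v):
        # Rank of v among the absolute differences, averaging over ties.
        less = sum(a < v for a in abs_d)
        equal = sum(a == v for a in abs_d)
        return less + (equal + 1) / 2

    s_plus = sum(midrank(abs(x)) for x in d if x > 0)
    mean = n * (n + 1) / 4
    # Variance of S+ under the null, reduced by (t^3 - t)/48 per tie group.
    tie_term = sum(t ** 3 - t for t in Counter(abs_d).values()) / 48
    variance = n * (n + 1) * (2 * n + 1) / 24 - tie_term
    deviation = abs(s_plus - mean)
    if continuity:
        deviation = max(deviation - 0.5, 0.0)
    z = deviation / sqrt(variance)
    return erfc(z / sqrt(2))  # equals 2 * (1 - Phi(z))


differences = [3, 5, 6, 9, 10, 11, 12, 12, 13]
p_normal = normal_approx_p(differences)
p_corrected = normal_approx_p(differences, continuity=True)
print(round(p_normal, 6), round(p_corrected, 6))
```

With S+ = 45, an expected value of 22.5 and a tie-corrected variance of 71.125, z is about 2.67 and the two-tailed P-value is about 0.0076. Applying a continuity correction gives a somewhat larger P-value.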
Exact confidence limits
To obtain the Hodges-Lehmann estimate of the median difference and its 95% confidence interval, we first arrange the differences (d) from the table above in order:
3 5 6 9 10 11 12 12 13
We then construct a triangular matrix of the Walsh averages, a task most easily achieved in
The 5 highest values in this array are shown in turquoise cells - the 6th is in a red cell and is the upper 95% confidence limit. The 5 lowest values in the array are shown in green cells - the 6th is in a red cell and is the lower 95% confidence limit. We conclude that the 95% confidence interval for the difference is 6 to 12. This interval does not include zero, in agreement with our earlier significant P-value of 0.0076.
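The same construction can be sketched in Python. The code forms all n(n+1)/2 Walsh averages (the mean of every pair of differences, including each difference with itself), sorts them, and discards the 5 lowest and 5 highest to obtain the limits, as described above. The function names and the `critical_value` parameter are ours. The point estimate is taken as the plain median of the Walsh averages, which can differ slightly from values reported by software that uses an approximation when ties are present.

```python
from statistics import median


def walsh_averages(differences):
    """All pairwise means (d_i + d_j) / 2 for i <= j, in ascending order."""
    d = sorted(differences)
    return sorted(
        (d[i] + d[j]) / 2
        for i in range(len(d))
        for j in range(i, len(d))
    )


def hodges_lehmann_ci(differences, critical_value):
    """Return (estimate, lower, upper).

    critical_value is the number of Walsh averages excluded from each end,
    so the limits are the (critical_value + 1)th lowest and highest values.
    """
    w = walsh_averages(differences)
    k = critical_value
    return median(w), w[k], w[-k - 1]


differences = [3, 5, 6, 9, 10, 11, 12, 12, 13]
w = walsh_averages(differences)
estimate, lower, upper = hodges_lehmann_ci(differences, critical_value=5)
print(len(w), estimate, lower, upper)
```

For the nine differences this produces 45 Walsh averages, with the 6th lowest equal to 6 and the 6th highest equal to 12, matching the interval read from the array.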
Whilst this study gave us a manageable worked example, the sample size is really too small for the normal approximation.
Worked example II
We take our second example from Ogata & Takeuchi et al. (2001).
We first examine the distribution of differences to check whether the distribution is symmetrical. Distributions on the left are of the untransformed differences - they are strongly right skewed.
A log transformation reduces skew a little, but the effect is disappointingly slight (note we used a log(x+1) transformation here as several of the post-treatment readings were zero). For now we will proceed with the untransformed data - but bear in mind that the conditions for the test are not met, and we may get a misleading result.
This gives a two-tailed P-value of 0.000007. This indicates a highly significant treatment effect.
But note that the Hodges-Lehmann estimate of the median difference given by R is only 4.5 (95% confidence interval: 2.5 - 10.0). In other words, although there was a dramatic effect of treatment for a few cats, for most cats the effect was rather slight.
As we pointed out above, the key assumption for the Wilcoxon matched pairs test (symmetrical distribution of differences) is not met. Hence we would actually do better to resort to the sign test.
This gives a P-value of 0.000015, not as highly significant as with Wilcoxon's matched pairs test, but a much more justifiable procedure.
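For reference, an exact two-tailed sign test needs only the counts of positive and negative differences and the binomial distribution with p = 0.5. The Python sketch below is a minimal illustration; the function name is ours. Because the data for this second example are not reproduced here, it is run on the nine differences from worked example I instead.

```python
from math import comb


def sign_test_p(differences):
    """Exact two-tailed sign test; zero differences are ignored."""
    positives = sum(x > 0 for x in differences)
    negatives = sum(x < 0 for x in differences)
    n = positives + negatives
    k = min(positives, negatives)
    # Lower-tail probability of k or fewer of the rarer sign under Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Differences from worked example I (all nine are positive).
example_one = [3, 5, 6, 9, 10, 11, 12, 12, 13]
p_sign = sign_test_p(example_one)
print(round(p_sign, 4))
```

The sign test uses only the direction of each difference, not its size, which is why it does not depend on the differences being symmetrically distributed.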