Non-parametric multiple comparison tests
Provided that the distributions are of similar shape, the Kruskal-Wallis K statistic can be used to test the general hypothesis that all population medians are equal. If this null hypothesis is rejected, the next step is to compare the individual groups using non-parametric multiple comparison tests. The available methods can be grouped in the same way as for parametric multiple comparison procedures.
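As a rough illustration of the overall test that precedes the multiple comparisons, the statistic can be sketched in a few lines of Python. This is a minimal sketch only: the function names `midranks` and `kruskal_wallis_statistic` are invented for illustration, no correction for ties is applied to the statistic, and the result would still need to be compared with a chi-square distribution (k - 1 degrees of freedom) or exact tables.

```python
from itertools import chain


def midranks(values):
    """Rank values from 1 upwards; tied values share the average of their positions."""
    ordered = sorted(values)
    n = len(ordered)
    # First and last 1-based position of each value in the sorted list, averaged.
    return [(ordered.index(v) + 1 + n - ordered[::-1].index(v)) / 2 for v in values]


def kruskal_wallis_statistic(groups):
    """Kruskal-Wallis statistic (no tie correction) from jointly ranked groups."""
    pooled = list(chain.from_iterable(groups))
    n_total = len(pooled)
    ranks = midranks(pooled)
    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])  # rank sum for this group
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)


# Hypothetical data: three completely separated groups of three observations.
stat = kruskal_wallis_statistic([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

The same joint ranks (and the rank sums derived from them) are what the joint rank multiple comparison tests described later operate on.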
Planned orthogonal comparisons
A small number of planned orthogonal pairwise comparisons can be made using the Wilcoxon-Mann-Whitney test. For larger numbers of such comparisons, the Dunn-Sidak correction should be applied. If variances are not homogeneous, a 'robust' version of the Wilcoxon-Mann-Whitney test, known as the Fligner-Policello test, is available. However, it assumes that the distributions are symmetrical, which rather limits its usefulness.
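The Dunn-Sidak correction replaces the nominal significance level with a stricter per-comparison level, chosen so that the overall (familywise) error rate across m independent comparisons stays at the nominal value. A minimal sketch, with an illustrative function name:

```python
def dunn_sidak_alpha(alpha, n_comparisons):
    """Per-comparison significance level giving a familywise level of alpha."""
    return 1.0 - (1.0 - alpha) ** (1.0 / n_comparisons)


# Hypothetical example: three planned comparisons at an overall level of 0.05.
per_test_alpha = dunn_sidak_alpha(0.05, 3)

# Check: the chance of at least one false positive over 3 tests is back at 0.05.
familywise = 1.0 - (1.0 - per_test_alpha) ** 3
```

Each Wilcoxon-Mann-Whitney P-value would then be compared with `per_test_alpha` rather than with 0.05.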
All pairwise comparisons
Joint or pairwise ranking
In joint rank tests, the mean ranks (or rank sums) used in the Kruskal-Wallis test are compared. These tests are therefore different in nature from parametric multiple comparison tests, because the significance of a comparison between a pair of treatments depends upon observations from treatments not involved in the comparison. Hence results may change depending on the number of treatments being considered.
In pairwise ranking, ranks are assigned afresh to just the two treatments being compared. This has the disadvantage that cycling can arise, where group A is greater than group B and group C is greater than group A, but group C is not significantly greater than group B. Such inconsistencies are difficult to explain logically!
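The difference between the two ranking schemes can be seen with a small made-up data set. In the sketch below (function names and data are purely illustrative), the difference in mean rank between groups A and B changes once a third group C is ranked jointly with them, even though the observations in A and B are untouched.

```python
from itertools import chain


def midranks(values):
    """Rank values from 1 upwards; tied values share the average of their positions."""
    ordered = sorted(values)
    n = len(ordered)
    return [(ordered.index(v) + 1 + n - ordered[::-1].index(v)) / 2 for v in values]


def mean_ranks(groups):
    """Rank all supplied groups together and return the mean rank of each group."""
    ranks = midranks(list(chain.from_iterable(groups)))
    means, start = [], 0
    for g in groups:
        means.append(sum(ranks[start:start + len(g)]) / len(g))
        start += len(g)
    return means


a, b, c = [1, 2], [3, 6], [4, 5]  # hypothetical observations

pairwise = mean_ranks([a, b])   # ranks assigned afresh to A and B only
joint = mean_ranks([a, b, c])   # ranks assigned across all three groups

pairwise_diff = pairwise[1] - pairwise[0]  # B minus A under pairwise ranking
joint_diff = joint[1] - joint[0]           # B minus A under joint ranking
```

Here the B minus A difference in mean rank is 2 under pairwise ranking but 3 under joint ranking, because two of group C's observations fall between the observations of group B.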
Joint rank tests
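The simplest of these uses a test analogous to Tukey's test and is known as the Nemenyi joint rank test. It requires equal sample sizes. Differences between the rank sums of each pair of groups are compared to a single honestly significant difference, calculated from the critical value of the studentized range statistic and the standard error of a rank sum.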
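One common textbook form of this calculation, for k groups each of size n, takes the standard error of a rank sum as sqrt(n(nk)(nk + 1)/12) and multiplies it by the critical value q of the studentized range for k groups and infinite degrees of freedom. The sketch below assumes that form; the function names are invented, and the critical value `q_crit` is deliberately left as an input to be read from a table rather than computed.

```python
import math
from itertools import combinations


def nemenyi_critical_difference(q_crit, n_per_group, k_groups):
    """Smallest rank-sum difference judged significant (equal sample sizes).

    q_crit is the critical studentized range value for k groups and
    infinite degrees of freedom, supplied by the user from a table.
    """
    n_total = n_per_group * k_groups
    se = math.sqrt(n_per_group * n_total * (n_total + 1) / 12.0)
    return q_crit * se


def significant_pairs(rank_sums, critical_difference):
    """Index pairs of groups whose rank sums differ by more than the critical value."""
    return [(i, j) for i, j in combinations(range(len(rank_sums)), 2)
            if abs(rank_sums[i] - rank_sums[j]) > critical_difference]


# Hypothetical example: 3 groups of 3, joint rank sums 6, 15 and 24.
# 3.314 is a tabulated 5% value for 3 groups; check it against your own table.
crit = nemenyi_critical_difference(3.314, n_per_group=3, k_groups=3)
pairs = significant_pairs([6, 15, 24], crit)
```

Because a single critical difference is used for every pair, the test is simple to apply by hand once the rank sums from the Kruskal-Wallis test are available.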
For unequal sample sizes one can use the Dunn test. In this test one compares mean ranks, not sums of ranks. Consequently a different range statistic is used for the test.
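A minimal sketch of a Dunn-type comparison of mean ranks is given below. It makes several assumptions that should be checked against the reference being followed: the standard error omits any correction for ties, each difference is referred to the standard normal distribution, and the P-values are adjusted with a simple Bonferroni multiplication. Function names and data are illustrative only.

```python
import math
from itertools import chain, combinations
from statistics import NormalDist


def midranks(values):
    """Rank values from 1 upwards; tied values share the average of their positions."""
    ordered = sorted(values)
    n = len(ordered)
    return [(ordered.index(v) + 1 + n - ordered[::-1].index(v)) / 2 for v in values]


def dunn_comparisons(groups):
    """Return (i, j, z, adjusted_p) for every pair of groups, using joint ranks."""
    ranks = midranks(list(chain.from_iterable(groups)))
    n_total = len(ranks)
    means, start = [], 0
    for g in groups:
        means.append(sum(ranks[start:start + len(g)]) / len(g))
        start += len(g)
    pairs = list(combinations(range(len(groups)), 2))
    results = []
    for i, j in pairs:
        # Standard error of a difference in mean ranks (no tie correction).
        se = math.sqrt(n_total * (n_total + 1) / 12.0
                       * (1.0 / len(groups[i]) + 1.0 / len(groups[j])))
        z = (means[i] - means[j]) / se
        p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))        # two-sided P-value
        results.append((i, j, z, min(1.0, p * len(pairs))))  # Bonferroni adjustment
    return results


# Hypothetical data with unequal sample sizes (3, 4 and 5 observations).
results = dunn_comparisons([[1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11, 12]])
```

Each tuple in `results` gives the two group indices, the z statistic for their difference in mean rank, and the adjusted P-value.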
The Steel-Dwass test is a frequently recommended pairwise ranking test. Each pair of treatments is compared with the Wilcoxon-Mann-Whitney test. For small samples (n = 2-6) and only k = 3 groups, convert the calculated U-statistic to the minimum rank sum and compare it with the exact critical values given in Steel