"It has long been an axiom of mine that the little things are infinitely the most important" (Sherlock Holmes)



Multiple comparison tests after parametric ANOVA

Worked example 1

Our first worked example returns to the study by Johnston et al. (2001) on serum Vitamin E concentrations in German Shepherd (GS) dogs with and without a degenerative nerve disorder compared to other breeds of dogs without the disorder.

Table of means
Group                     Mean      SE
GS with disease (A)       3.83429
GS without disease (B)    3.59132
Other breeds (C)          3.50114
Original data are given in - for this worked example we have just taken the first twenty observations for each 'treatment'. This produces the table of means given here, and the ANOVA table given below:


ANOVA table
Source of variation   df   SS       MS       F-ratio   P
Treatments             2   1.1877   0.5938   5.5391    0.006
Error                 57   6.1109   0.1072
Total                 59   7.2986

Meaningful contrasts (we assume here they are preplanned) would be to first compare the two control (unaffected) groups (B & C) and then compare the affected group (A) with the mean of the two unaffected groups.

Contrast   Null hypothesis          Coefficients
C1         H0 : μB = μC             0A + 1B − 1C
C2         H0 : μA = (μB + μC)/2    +1A − 1/2B − 1/2C

Sums of squares for each of the two contrasts are given by:
SSC1  =  20 × ((0×3.83429) + (1×3.59132) + (−1×3.50114))² / (0² + 1² + (−1)²)
      =  0.08132

SSC2  =  20 × ((1×3.83429) + (−0.5×3.59132) + (−0.5×3.50114))² / (1² + (−0.5)² + (−0.5)²)
      =  1.10638

Contrast   SS        MS        Error MS   F         P
C1         0.08132   0.08132   0.1072     0.7586    0.387
C2         1.10638   1.10638   0.1072     10.3207   0.002
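The contrast arithmetic above can be checked with a short script. This is a minimal sketch, assuming equal replication (n = 20 per group) and taking the group means and error mean square from the tables in this example; the function name `contrast_ss` is ours, not from the original analysis.

```python
def contrast_ss(means, coeffs, n):
    """Sum of squares for a contrast among equally replicated group means."""
    estimate = sum(c * m for c, m in zip(coeffs, means))
    return n * estimate ** 2 / sum(c * c for c in coeffs)

means = [3.83429, 3.59132, 3.50114]  # groups A, B, C (from the worked example)
n = 20                               # observations per group
error_ms = 0.1072                    # error mean square from the ANOVA table

c1 = [0, 1, -1]        # B versus C
c2 = [1, -0.5, -0.5]   # A versus the mean of B and C

ss_c1 = contrast_ss(means, c1, n)
ss_c2 = contrast_ss(means, c2, n)
f_c1 = ss_c1 / error_ms  # each contrast has 1 df, so MS equals SS
f_c2 = ss_c2 / error_ms

# Two contrasts are orthogonal (with equal n) when their coefficient products sum to zero.
orthogonal = sum(a * b for a, b in zip(c1, c2)) == 0

print(f"SS(C1) = {ss_c1:.5f}, F = {f_c1:.4f}")
print(f"SS(C2) = {ss_c2:.5f}, F = {f_c2:.4f}")
print(f"Orthogonal: {orthogonal}; SS(C1) + SS(C2) = {ss_c1 + ss_c2:.4f}")
```

Because the two contrasts are orthogonal, their sums of squares add up to the treatment sum of squares in the ANOVA table.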


From this we may conclude that dogs with the nerve disorder had a significantly higher concentration (P = 0.002) of serum Vitamin E than dogs without the disorder. We cannot, however, tell whether the higher concentration caused the disorder, resulted from the disorder, or was related to some third factor. With an observational design the strength of inference is weak.

Worked example 2

Mean number of eggs laid per leaf
Variety   Species      Mean   n
A         Lemon        3.7    5
B         Lemon        2.9    5
C         Orange       1.88   5
D         Orange       2.22   5
E         Grapefruit   0.56   5
F         Grapefruit   0.92   5

We use hypothetical data for our second worked example, but it is based on the study by Vercher et al. (2008) on the influence of citrus species on the number of eggs laid by the citrus leafminer. Treatments comprise six different citrus varieties, two each of lemon, orange and grapefruit.

A single replicate consists of the mean number of eggs per leaf laid by five adult female moths in a cage with a young shoot after 24 h exposure. Treatment means are shown in the table above.

  1. Check assumptions for ANOVA

    Box plots are examined to assess how appropriate (parametric) ANOVA is for the set of data.

    {Fig. 1}

    There is no evidence of non-normality or heteroscedasticity from the plots, but we check homogeneity of variances using Bartlett's test.

    This gives a P-value of 0.9793 so again there is no evidence that variances are not homogeneous.
  2. Perform analysis of variance

    Analysis of variance is carried out, which indicates a highly significant treatment effect (F5,24 = 25.85, P < 0.001). The mean square error is 0.271 with 24 df.
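As a rough check on the reported F-ratio, the treatment mean square can be rebuilt from the table of means. This is a sketch assuming equal replication (n = 5) and the rounded error mean square of 0.271 quoted above, so the result only approximately matches the reported 25.85.

```python
means = {"A": 3.7, "B": 2.9, "C": 1.88, "D": 2.22, "E": 0.56, "F": 0.92}
n = 5         # replicates per variety
mse = 0.271   # error mean square quoted in the text (rounded)

grand_mean = sum(means.values()) / len(means)

# Treatment SS for equal n: n times the squared deviations of group means from the grand mean.
ss_treat = n * sum((m - grand_mean) ** 2 for m in means.values())
df_treat = len(means) - 1
ms_treat = ss_treat / df_treat
f_ratio = ms_treat / mse

print(f"Treatment SS = {ss_treat:.3f}, MS = {ms_treat:.4f}, F = {f_ratio:.2f}")
```

The small difference from the reported value comes from rounding of the error mean square.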

  3. Assess effect sizes

    Although orthogonal contrasts should always be the method of choice, there are times when they are inappropriate (usually when an experiment has been designed without thinking in advance about which alternative hypotheses one wishes to test). In this example, orthogonal contrasts have little to offer, so we will instead make all pairwise comparisons, which is what Vercher et al. (2008) did in their analysis using Tukey's honestly significant difference test.

    1. Compute the honestly significant difference
      Tukey HSD    =   q(0.05; 6, 24) × √(MSE/n)
                   =   4.3727 × √(0.271/5)
                   =   4.3727 × 0.2328
                   =   1.018
    2. Rank the means and tabulate their differences. Those marked below with * are significant (P < 0.05). Precise P-values for each difference are given in the R output.

      Ranked means

    3. Use horizontal lines to underline treatments that do not differ.


    4. Conclusions: There is a general tendency for lemon varieties to have more eggs than orange varieties, and for orange varieties to have more than grapefruit varieties - but in each case there is no significant difference between the 'best' (largest number of eggs) variety of one species and the 'worst' of the species with the next largest number of eggs.
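The pairwise comparisons in steps 1 to 3 can be reproduced from the table of means. This is a minimal sketch: the critical studentized range value (4.3727) is taken from the worked example rather than computed, and the variable names are ours.

```python
from itertools import combinations
from math import sqrt

means = {"A": 3.7, "B": 2.9, "C": 1.88, "D": 2.22, "E": 0.56, "F": 0.92}
n = 5            # replicates per variety
mse = 0.271      # error mean square, 24 df
q_crit = 4.3727  # studentized range critical value used in the worked example

hsd = q_crit * sqrt(mse / n)

# Rank the means from largest to smallest, then compare every pair.
ranked = sorted(means.items(), key=lambda item: item[1], reverse=True)
results = {}
for (v1, m1), (v2, m2) in combinations(ranked, 2):
    diff = m1 - m2
    results[(v1, v2)] = (diff, diff > hsd)

not_significant = [pair for pair, (_, sig) in results.items() if not sig]

print(f"HSD = {hsd:.3f}")
for (v1, v2), (diff, sig) in results.items():
    print(f"{v1} - {v2}: {diff:.2f} {'*' if sig else ''}")
```

With these means only neighbouring varieties in the ranking fail to differ, which matches the pattern described in the conclusions.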


    We may decide that Tukey's test is too conservative, especially since other researchers have shown that (for example) lemons tend to have more eggs than other citrus species. Hence we feel we can justify use of a more liberal test, and opt for Ryan's Q test.

    1. Compute corrected α levels (b) for each of the five values of m (number of means spanned).

      For m=6, b = 1−(1−0.05)^(6/6) = 0.0500    For m=3, b = 1−(1−0.05)^(3/6) = 0.0253
      For m=5, b = 1−(1−0.05)^(5/6) = 0.0417    For m=2, b = 1−(1−0.05)^(2/6) = 0.0169
      For m=4, b = 1−(1−0.05)^(4/6) = 0.0338

    2. Compute Ryan's SSR for each of the five values of m (number of means spanned).
      Ryan's SSR (m=6)   =  4.3727 × 0.2328   =  1.018
      Ryan's SSR (m=5)   =  4.2850 × 0.2328   =  0.9975
      Ryan's SSR (m=4)   =  4.1563 × 0.2328   =  0.9676
      Ryan's SSR (m=3)   =  3.9749 × 0.2328   =  0.9254
      Ryan's SSR (m=2)   =  3.6309 × 0.2328   =  0.8453

    3. Reference to the table of differences of means above shows that the only change is that varieties C and F are now significantly different with a difference of 0.96, which is greater than the critical difference of 0.8453. If we again use horizontal lines to underline treatments that do not differ:
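The corrected α levels and critical ranges for Ryan's test can be sketched as follows. The critical studentized range values are copied from the worked example, not computed here; recomputing the α levels from the formula gives values that agree with those listed to about three decimal places. The sketch only checks the C versus F comparison and does not implement the full stepwise procedure.

```python
from math import sqrt

alpha = 0.05
k = 6        # total number of means
n = 5        # replicates per variety
mse = 0.271  # error mean square, 24 df

# Critical studentized range values for each span m, as used in the worked example.
q_crit = {6: 4.3727, 5: 4.2850, 4: 4.1563, 3: 3.9749, 2: 3.6309}

def ryan_alpha(m):
    """Corrected significance level for a range spanning m of the k means."""
    return 1 - (1 - alpha) ** (m / k)

se = sqrt(mse / n)
ssr = {m: q * se for m, q in q_crit.items()}

for m in sorted(q_crit, reverse=True):
    print(f"m={m}: b={ryan_alpha(m):.4f}, SSR={ssr[m]:.4f}")

# C and F are adjacent in the ranking (span m=2).
diff_c_f = 1.88 - 0.92
c_f_significant = diff_c_f > ssr[2]
print(f"C - F = {diff_c_f:.2f}, significant: {c_f_significant}")
```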



    A third approach (and possibly the best) would be to carry out three combined mean contrasts as detailed below (mean lemons vs mean oranges, mean lemons vs mean grapefruits, mean oranges vs mean grapefruits). Note that these contrasts are not orthogonal, so we must use Scheffé's method to test them:

    1. Specify the contrasts of interest

      Contrast   Null hypothesis                    Coefficients
      C1         H0 : (μA + μB)/2 = (μC + μD)/2     + 1/2A + 1/2B − 1/2C − 1/2D
      C2         H0 : (μA + μB)/2 = (μE + μF)/2     + 1/2A + 1/2B − 1/2E − 1/2F
      C3         H0 : (μC + μD)/2 = (μE + μF)/2     + 1/2C + 1/2D − 1/2E − 1/2F

    2. Estimate each of the contrasts

      C1   =  1.85+1.45−0.94−1.11 = 1.25
      C2   =  1.85+1.45−0.28−0.46 = 2.56
      C3   =  0.94+1.11−0.28−0.46 = 1.31

    3. Calculate 95% confidence limits for each of the contrasts
      CL1    =   1.25 ± √[5 × 2.62] × √[0.271 × (0.05 + 0.05 + 0.05 + 0.05)]
             =   1.25 ± 0.8426    (0.4074 to 2.0926)
      CL2    =   2.56 ± 0.8426    (1.7174 to 3.4026)
      CL3    =   1.31 ± 0.8426    (0.4674 to 2.1526)

    4. Conclusions: In all three cases the limits exclude zero, so the contrasts are all significant. We may conclude that lemons (of the two selected varieties) have significantly more eggs laid on them than oranges, which in turn have significantly more than grapefruit.
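The Scheffé limits in steps 2 and 3 can be reproduced with a short script. This is a minimal sketch: the critical F value (2.62 for 5 and 24 df) is taken from the worked example rather than computed, and the contrast labels and function name are ours.

```python
from math import sqrt

means = {"A": 3.7, "B": 2.9, "C": 1.88, "D": 2.22, "E": 0.56, "F": 0.92}
n = 5          # replicates per variety
k = 6          # number of treatments
mse = 0.271    # error mean square, 24 df
f_crit = 2.62  # F(0.05; 5, 24) as used in the worked example

contrasts = {
    "C1": {"A": 0.5, "B": 0.5, "C": -0.5, "D": -0.5},  # lemons vs oranges
    "C2": {"A": 0.5, "B": 0.5, "E": -0.5, "F": -0.5},  # lemons vs grapefruits
    "C3": {"C": 0.5, "D": 0.5, "E": -0.5, "F": -0.5},  # oranges vs grapefruits
}

def scheffe_interval(coeffs):
    """Return (estimate, lower, upper) 95% Scheffé limits for one contrast."""
    estimate = sum(c * means[v] for v, c in coeffs.items())
    half_width = sqrt((k - 1) * f_crit) * sqrt(mse * sum(c * c / n for c in coeffs.values()))
    return estimate, estimate - half_width, estimate + half_width

intervals = {name: scheffe_interval(coeffs) for name, coeffs in contrasts.items()}

for name, (est, lower, upper) in intervals.items():
    print(f"{name}: {est:.2f} ({lower:.4f} to {upper:.4f})")
```

A contrast is judged significant when its interval excludes zero, which is the rule applied in the conclusions above.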