InfluentialPoints.com
Biology, images, analysis, design...
"It has long been an axiom of mine that the little things are infinitely the most important" (Sherlock Holmes)

 

 

Split-plot and repeated measures ANOVA

Worked example 1

Our first worked example looks at a long-term experiment to assess the effects of nitrogen application and thatch accumulation on the chlorophyll content of grass (you can find it analyzed using Minitab by Stephen Arnold, Penn State University).

The experiment was laid out as a split-plot design with two blocks (replications). Each block contained four main plots, each of which contained three subplots. The four levels of nitrogen application were randomly allocated to the main plots within each block. The three levels of thatch accumulation (2, 5 and 8 years) were randomly allocated to the subplots within each main plot. It is unclear why a split-plot design was used for this; Arnold suggests it may have been to avoid the fertilizer blowing over onto other plots. There is one small problem with the design: because thatch accumulation depends on time, this factor is essentially unreplicated, since its effect depends on the particular conditions during those periods. In theory it would be better to use a staggered start, but then of course the experiment would take even longer to complete!

Effect of nitrogen application and thatch
accumulation on chlorophyll content of grass

Nitrogen   Date      Blocks
                     B1     B2
N1         D1        3.8    3.9
           D2        5.3    5.4
           D3        5.9    4.3
N2         D1        5.2    6.0
           D2        5.6    6.1
           D3        5.4    6.2
N3         D1        6.0    7.0
           D2        5.6    6.4
           D3        7.8    7.8
N4         D1        6.8    7.9
           D2        8.6    8.6
           D3        8.5    8.4
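
For readers following along in R, the table can be entered as shown here. This is only a minimal sketch: the variable names chlo, nitr, date and blck and the lower-case level labels follow the R output shown further down this page, while the data frame name grass is our own (hypothetical) choice.

```r
# Chlorophyll content, entered block by block in the order of the table
# (within each block: N1 D1-D3, N2 D1-D3, N3 D1-D3, N4 D1-D3).
chlo <- c(3.8, 5.3, 5.9,  5.2, 5.6, 5.4,  6.0, 5.6, 7.8,  6.8, 8.6, 8.5,  # block B1
          3.9, 5.4, 4.3,  6.0, 6.1, 6.2,  7.0, 6.4, 7.8,  7.9, 8.6, 8.4)  # block B2

# Factors generated to match the order in which chlo was typed
blck <- gl(2, 12, labels = c("b1", "b2"))                         # 12 values per block
nitr <- gl(4, 3, length = 24, labels = c("n1", "n2", "n3", "n4")) # 3 subplots per main plot
date <- gl(3, 1, length = 24, labels = c("d1", "d2", "d3"))       # dates cycle within main plot

# 'grass' is a hypothetical name for the assembled data frame
grass <- data.frame(blck, nitr, date, chlo)

# Each block x nitrogen x date combination should occur exactly once
table(grass$blck, grass$nitr, grass$date)
```

As a quick check on the data entry, tapply(grass$chlo, grass$nitr, sum) should return the nitrogen totals 28.6, 34.5, 40.6 and 48.8 that appear in the manual calculation of sums of squares.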

  1. Draw boxplots and assess normality

    Plot out data to get a visual assessment of the treatment and block effects, and assess how appropriate (parametric) ANOVA is for the set of data.

    Figme1.gif

  2. Draw interaction plots

    If there were no interaction between nitrogen treatment and date, the three lines for the different dates should follow the same trends and be roughly parallel.

    Figme2.gif
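
    The plots described in steps 1 and 2 could be drawn along the following lines. This is a sketch only: the panel layout, axis labels and the object name cellmeans are our own assumptions, and the figures on this page may have been produced differently.

```r
# Data entered block by block, as in the table above
chlo <- c(3.8, 5.3, 5.9,  5.2, 5.6, 5.4,  6.0, 5.6, 7.8,  6.8, 8.6, 8.5,
          3.9, 5.4, 4.3,  6.0, 6.1, 6.2,  7.0, 6.4, 7.8,  7.9, 8.6, 8.4)
blck <- gl(2, 12, labels = c("b1", "b2"))
nitr <- gl(4, 3, length = 24, labels = c("n1", "n2", "n3", "n4"))
date <- gl(3, 1, length = 24, labels = c("d1", "d2", "d3"))

# Step 1: boxplots of the response against each factor in turn
op <- par(mfrow = c(1, 3))
boxplot(chlo ~ nitr, xlab = "Nitrogen", ylab = "Chlorophyll content")
boxplot(chlo ~ date, xlab = "Date",     ylab = "Chlorophyll content")
boxplot(chlo ~ blck, xlab = "Block",    ylab = "Chlorophyll content")
par(op)

# Step 2: interaction plot, one line per date across the nitrogen levels
interaction.plot(nitr, date, chlo,
                 xlab = "Nitrogen", ylab = "Mean chlorophyll content",
                 trace.label = "Date")

# The points joined in the interaction plot are the nitrogen x date cell means
cellmeans <- tapply(chlo, list(nitr, date), mean)
round(cellmeans, 2)
```

    With only two observations per nitrogen × date cell, the lines in the interaction plot are quite noisy, so modest departures from parallelism should not be over-interpreted.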

  3. Get table of means

    Using R
    > tapply(chlo,nitr,mean)
          n1       n2       n3       n4
    4.766667 5.750000 6.766667 8.133333
    > tapply(chlo,date,mean)
        d1     d2     d3
    5.8250 6.4500 6.7875
    > tapply(chlo,blck,mean)
          b1       b2
    6.208333 6.500000 

  4. Carry out analysis of variance

    Sums of squares can be calculated manually as follows, where 1017.79 is the sum of the squared observations, 152.5 is the grand total and 24 is the number of observations:

    SStotal  =  1017.79 − 152.5²/24  =  48.7796

    SSblocks (S)  =  74.5²/12 + 78²/12 − 152.5²/24  =  0.5104

    SSnitr (A)  =  28.6²/6 + 34.5²/6 + 40.6²/6 + 48.8²/6 − 152.5²/24  =  37.3246

    SS Subgrps (AS)  =  15.0²/3 + 16.2²/3 + ... + 24.9²/3 − 152.5²/24  =  39.0929

    SS Main plot error  =  39.0929 − 37.3246 − 0.5104  =  1.2579

    SSdate (B)  =  46.6²/8 + 51.6²/8 + 54.3²/8 − 152.5²/24  =  3.8158

    SS Subgrps (AB)  =  7.7²/2 + 11.2²/2 + ... + 16.9²/2 − 152.5²/24  =  45.2946

    SS A × B  =  45.2946 − 37.3246 − 3.8158  =  4.1542

    SS Subplot error  =  48.7796 − 0.5104 − 37.3246 − 1.2579 − 3.8158 − 4.1542  =  1.7167
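
    The manual calculation can be checked in R with a short sketch like the one below. The helper function ss() and the object names are our own (hypothetical) choices; the arithmetic is simply the "sum of squared totals divided by group size, minus the correction factor" used above.

```r
# Data entered block by block, as in the table above
chlo <- c(3.8, 5.3, 5.9,  5.2, 5.6, 5.4,  6.0, 5.6, 7.8,  6.8, 8.6, 8.5,
          3.9, 5.4, 4.3,  6.0, 6.1, 6.2,  7.0, 6.4, 7.8,  7.9, 8.6, 8.4)
blck <- gl(2, 12, labels = c("b1", "b2"))
nitr <- gl(4, 3, length = 24, labels = c("n1", "n2", "n3", "n4"))
date <- gl(3, 1, length = 24, labels = c("d1", "d2", "d3"))

# Correction factor: (grand total)^2 / N = 152.5^2 / 24
CF <- sum(chlo)^2 / length(chlo)

# Hypothetical helper: sum of (group total)^2 / (group size), minus CF
ss <- function(g) sum(tapply(chlo, g, sum)^2 / tapply(chlo, g, length)) - CF

SStotal  <- sum(chlo^2) - CF
SSblocks <- ss(blck)
SSnitr   <- ss(nitr)
SSmain   <- ss(interaction(blck, nitr)) - SSnitr - SSblocks  # main plot error
SSdate   <- ss(date)
SSAB     <- ss(interaction(nitr, date)) - SSnitr - SSdate    # A x B interaction
SSsub    <- SStotal - SSblocks - SSnitr - SSmain - SSdate - SSAB  # subplot error

round(c(total = SStotal, blocks = SSblocks, nitr = SSnitr, main.error = SSmain,
        date = SSdate, AB = SSAB, sub.error = SSsub), 4)
```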

    ANOVA table
    Source of variation   df   SS        MS        F-ratio   P
    Blocks (S)             1    0.5104    0.5104
    Nitrogen (A)           3   37.3246   12.4415   29.67     0.0099
    Main plot error        3    1.2579    0.4193
    Dates (B)              2    3.8158    1.9079    8.89     0.0093
    A × B                  6    4.1542    0.6924    3.23     0.0646
    Subplot error          8    1.7167    0.2146
    Total                 23   48.7796

    Using R
     Error: blck
              Df  Sum Sq Mean Sq F value Pr(>F)
    Residuals  1 0.51042 0.51042
    
    Error: blck:nitr
              Df Sum Sq Mean Sq F value   Pr(>F)
    nitr       3 37.325  12.442  29.672 0.009896 **
    Residuals  3  1.258   0.419
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    
    Error: blck:nitr:date
              Df Sum Sq Mean Sq F value  Pr(>F)
    date       2 3.8158  1.9079  8.8913 0.00927 **
    nitr:date  6 4.1542  0.6924  3.2265 0.06460 .
    Residuals  8 1.7167  0.2146
    ---
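
    The call that produced this output is not shown. A sketch that should give the same strata is below; the Error(blck/nitr/date) term is our inference from the stratum names "blck", "blck:nitr" and "blck:nitr:date" in the printed output, and the object name fit is our own.

```r
# Data entered block by block, as in the table above
chlo <- c(3.8, 5.3, 5.9,  5.2, 5.6, 5.4,  6.0, 5.6, 7.8,  6.8, 8.6, 8.5,
          3.9, 5.4, 4.3,  6.0, 6.1, 6.2,  7.0, 6.4, 7.8,  7.9, 8.6, 8.4)
blck <- gl(2, 12, labels = c("b1", "b2"))
nitr <- gl(4, 3, length = 24, labels = c("n1", "n2", "n3", "n4"))
date <- gl(3, 1, length = 24, labels = c("d1", "d2", "d3"))

# Split-plot ANOVA: nitrogen is tested against the main plot error
# (blck:nitr stratum); date and nitr:date against the subplot error.
# The nesting in Error() is assumed from the stratum names in the output.
fit <- aov(chlo ~ nitr * date + Error(blck/nitr/date))
summary(fit)
```

    The important design choice is the Error() term: it tells aov() that main plots are nested within blocks and subplots within main plots, so that each effect is tested against the error term of its own stratum.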

  5. Check diagnostics

    An immediate difficulty arises with checking diagnostics of this split-plot design - if you try plotting out the diagnostics using R, you will simply get the word NULL. This is not because you have done anything wrong! It is simply the result of having only one observation for each of the block × nitrogen × date combinations. The model you have fitted is known as the saturated model.

    You can get an overview of the situation by fitting the general linear model and assuming that interactions with blocks are non-existent. The sums of squares for all main effects and the A × B interaction will still be correct, and you can assess the various diagnostics. Note, however, that all the F-ratios (and associated P-values) are incorrect because they are no longer using the correct error term.

    Using R
    Response: chlo
              Df Sum Sq Mean Sq F value    Pr(>F)
    nitr       3 37.325  12.442 46.0087 1.627e-06 ***
    date       2  3.816   1.908  7.0555   0.01068 *
    blck       1  0.510   0.510  1.8875   0.19683
    nitr:date  6  4.154   0.692  2.5603   0.08396 .
    Residuals 11  2.975   0.270
    ----

    Figme3.gif
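
    A sketch of this overview fit is given below. The use of lm() and the object name fit.lm are our assumptions; the term order in the printed table (nitr, date, blck, nitr:date) is what R produces for the formula shown.

```r
# Data entered block by block, as in the table above
chlo <- c(3.8, 5.3, 5.9,  5.2, 5.6, 5.4,  6.0, 5.6, 7.8,  6.8, 8.6, 8.5,
          3.9, 5.4, 4.3,  6.0, 6.1, 6.2,  7.0, 6.4, 7.8,  7.9, 8.6, 8.4)
blck <- gl(2, 12, labels = c("b1", "b2"))
nitr <- gl(4, 3, length = 24, labels = c("n1", "n2", "n3", "n4"))
date <- gl(3, 1, length = 24, labels = c("d1", "d2", "d3"))

# General linear model with no block interactions: the two error terms of
# the split-plot analysis are pooled into a single residual (11 df)
fit.lm <- lm(chlo ~ nitr * date + blck)
anova(fit.lm)   # F-ratios here do NOT use the correct split-plot error terms

# Standard diagnostic plots (residuals vs fitted, normal Q-Q, etc.)
op <- par(mfrow = c(2, 2))
plot(fit.lm)
par(op)
```

    The pooled residual sum of squares (2.975) is the sum of the main plot error (1.2579) and the subplot error (1.7167) from the split-plot table, which is why the sums of squares for the effects are unchanged while the F-ratios differ.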

    A better approach to diagnostics with a split-plot design is to consider diagnostics separately for the assessment of factor A (between main plots) and for factor B and A × B (within main plots).

    Between main plots

    Using R
    Response: chlo
              Df  Sum Sq Mean Sq F value   Pr(>F)
    nitr       3 12.4539  4.1513 29.5239 0.009967 **
    blck       1  0.1691  0.1691  1.2024 0.352972
    Residuals  3  0.4218  0.1406
    ---

    Figme4.gif

Worked example 2

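No. urine markings pre and post treatment

No.  Sex  Agg  Pre   W1   W2   W3   W4   W5   W6   W7   W8
01   m    n      7    0    0    0    0    0    1    3    5
02   m    n     12   10    9    9    9    9    9    9    9
07   m    n      2    1    2    2    0    0    0    1    0
08   m    y     63    8    8    3    5    6    8   10    6
09   m    y     10    7    4    4    3    3    4    3    3
10   m    y     16   14    8    8    6    5    9    9   11
11   m    n      6    6    3    2    2    2    1    2    3
13   m    y     18    9    5    2    8    9    7    7   16
14   m    y     18    9    3    1    0    0    0    0    0
16   m    y      9    7    5    7    4    5    6    5    6
20   m    y     13   14   14   10    9    9   10   13   14
21   m    y      7    2    4    0    2    1    1    0    0
22   m    y     28    2    0    1    0   11    5    3    2
26   f    n      3    0    0    0    0    0    0    0    0
27   f    n      7    0    0    0    0    0    3    3    5
28   f    n      8    9    4    2    1    1    1    2    0
29   f    n      6    6    3    2    1    1    1    3    4
32   f    y     13   10    9    7   11    8    9    9   11
33   f    y      1    0    2    1    3    3    1    0    0
34   f    n      3    3    2    2    1    2    2    1    1
35   f    n     17   14   10    7    7    3    4    4    3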

We take our second example from Ogata & Takeuchi (2001), a trial of a feline pheromone analogue to reduce the frequency of urine marking by cats. We previously compared the number of markings pre-treatment and one week post-treatment using the non-parametric Wilcoxon's matched pairs signed ranks test. But the authors also used repeated measures analysis of variance to examine the different courses of urine marking over time relative to aggression status. The data for multicat households (where aggression can be assessed) are given in the table above (complete data sets only).

One's first reaction may be (or perhaps should be) that non-parametric analysis would be a much wiser approach given the patently non-normal distribution of the response variable. However, we will attempt an analysis after a transformation.

  1. Draw boxplots and assess normality

    Plot out the data to get a visual assessment of the aggression and week effects, and assess how appropriate (parametric) ANOVA is for the set of data.

    Figme6.gif

    As expected for count data, the distribution of the raw data within groups does not approximate to normal - instead the distributions are right skewed. We try a square root transformation (or to be more precise a √(Y + 0.5) transform given the large number of zeros) as a possible normalizing function for small whole numbered counts.

    Figme7.gif

    This looks more hopeful: most of the groups have more or less symmetrical distributions, albeit still with a few high outliers. A log transformation brought the high outliers down a little more, but at the cost of making several distributions left-skewed. Hence we proceed with the analysis on the square root transformed data, bearing in mind that we need to examine diagnostics carefully after model fitting. The interaction plot for aggression and week suggests similar trends over time for both aggressive and non-aggressive cats.
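
    The data entry and transformation might be sketched as follows. The variable names id, ag, wk, ur and squr and the level labels (s01..., n/y, w0...w8) follow the R output on this page; the matrix name counts, the data frame name cats and the plotting choices are our own assumptions. The sex column is omitted because it is not used in the analysis.

```r
# One row per cat: pre-treatment count (w0) followed by weeks 1 to 8
counts <- rbind(
  s01 = c( 7,  0,  0,  0,  0,  0,  1,  3,  5), s02 = c(12, 10,  9,  9,  9,  9,  9,  9,  9),
  s07 = c( 2,  1,  2,  2,  0,  0,  0,  1,  0), s08 = c(63,  8,  8,  3,  5,  6,  8, 10,  6),
  s09 = c(10,  7,  4,  4,  3,  3,  4,  3,  3), s10 = c(16, 14,  8,  8,  6,  5,  9,  9, 11),
  s11 = c( 6,  6,  3,  2,  2,  2,  1,  2,  3), s13 = c(18,  9,  5,  2,  8,  9,  7,  7, 16),
  s14 = c(18,  9,  3,  1,  0,  0,  0,  0,  0), s16 = c( 9,  7,  5,  7,  4,  5,  6,  5,  6),
  s20 = c(13, 14, 14, 10,  9,  9, 10, 13, 14), s21 = c( 7,  2,  4,  0,  2,  1,  1,  0,  0),
  s22 = c(28,  2,  0,  1,  0, 11,  5,  3,  2), s26 = c( 3,  0,  0,  0,  0,  0,  0,  0,  0),
  s27 = c( 7,  0,  0,  0,  0,  0,  3,  3,  5), s28 = c( 8,  9,  4,  2,  1,  1,  1,  2,  0),
  s29 = c( 6,  6,  3,  2,  1,  1,  1,  3,  4), s32 = c(13, 10,  9,  7, 11,  8,  9,  9, 11),
  s33 = c( 1,  0,  2,  1,  3,  3,  1,  0,  0), s34 = c( 3,  3,  2,  2,  1,  2,  2,  1,  1),
  s35 = c(17, 14, 10,  7,  7,  3,  4,  4,  3))
# Aggression status of each cat, in the same row order as 'counts'
agg <- c("n","n","n","y","y","y","n","y","y","y","y","y","y",
         "n","n","n","n","y","y","n","n")

# Long format: as.vector() reads the matrix column by column,
# i.e. all 21 cats for w0, then all 21 cats for w1, and so on
cats <- data.frame(
  id = factor(rep(rownames(counts), times = 9)),
  ag = factor(rep(agg, times = 9)),
  wk = factor(rep(paste0("w", 0:8), each = nrow(counts))),
  ur = as.vector(counts))
cats$squr <- sqrt(cats$ur + 0.5)    # square root (Y + 0.5) transformation

# Boxplots of raw and transformed counts for each aggression x week group
op <- par(mfrow = c(2, 1))
boxplot(ur   ~ ag + wk, data = cats, las = 2, ylab = "No. markings")
boxplot(squr ~ ag + wk, data = cats, las = 2, ylab = "sqrt(markings + 0.5)")
par(op)

# Interaction plot of the transformed counts: one line per aggression status
with(cats, interaction.plot(wk, ag, squr, ylab = "Mean sqrt(markings + 0.5)"))
```

    A quick check on the data entry is that the counts sum to 981 and the transformed values to 400.0776, the grand total used in the manual calculation of sums of squares.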

  2. Get table of means

    Using R
    > round(tapply(sqrt(ur+0.5),id,mean),3)
      s01   s02   s07   s08   s09   s10   s11   s13   s14   s16   s20   s21   s22
    1.302 3.150 1.113 3.252 2.203 3.128 1.821 2.983 1.557 2.535 3.491 1.399 2.083
      s26   s27   s28   s29   s32   s33   s34   s35
    0.836 1.373 1.740 1.802 3.178 1.235 1.527 2.744
    > round(tapply(sqrt(ur+0.5),ag,mean),3)
        n     y
    1.741 2.459
    > round(tapply(sqrt(ur+0.5),wk,mean),3)
       w0    w1    w2    w3    w4    w5    w6    w7    w8
    3.340 2.368 2.061 1.783 1.764 1.847 1.914 1.960 2.015

  3. Carry out analysis of variance

    Sums of squares can be calculated manually as follows, where 1075.5 is the sum of the squared transformed values, 400.0776 is their grand total and 189 is the number of observations (21 cats × 9 occasions):

    SStotal  =  1075.5 − 400.0776²/189  =  228.6107

    SSsubjects (S)  =  11.7149²/9 + ... + 24.6931²/9 − 400.0776²/189  =  124.7742

    SSaggression (A)  =  156.6798²/90 + 243.3978²/99 − 400.0776²/189  =  24.2815

    SS S(A)  =  124.7742 − 24.2815  =  100.493

    SSweek (B)  =  70.1393²/21 + ... + 42.3062²/21 − 400.0776²/189  =  40.897

    SS Subgrps (AB)  =  26.5334²/10 + ... + 25.3245²/11 − 400.0776²/189  =  68.5015

    SS A × B  =  68.5015 − 24.2815 − 40.897  =  3.323

    SSresidual  =  228.6107 − 24.2815 − 100.493 − 40.897 − 3.323  =  59.616

    ANOVA table
    Source of variation          df    SS         MS        F-ratio   P
    Aggression (A)                 1    24.2815   24.2815    4.5908   0.0453
    Subjects within aggression    19   100.493     5.2891
    Weeks (B)                      8    40.897     5.1121   13.0341   < 0.0001
    A × B                          8     3.323     0.4154    1.0591   0.3948
    Residuals                    152    59.616     0.3922
    Total                        188   228.6107

    Using R
    Error: id
              Df  Sum Sq Mean Sq F value  Pr(>F)
    ag         1  24.281 24.2814  4.5908 0.04531 *
    Residuals 19 100.493  5.2891
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    
    Error: id:wk
               Df Sum Sq Mean Sq F value    Pr(>F)
    wk          8 40.897  5.1121 13.0341 3.236e-14 ***
    ag:wk       8  3.323  0.4154  1.0591    0.3948
    Residuals 152 59.616  0.3922 
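
    The call behind this output is not shown. A sketch that should give the same two strata is below; the Error(id/wk) term is our inference from the stratum names "id" and "id:wk", and the object names counts, cats and fit2 are our own.

```r
# One row per cat: pre-treatment count (w0) followed by weeks 1 to 8
counts <- rbind(
  s01 = c( 7,  0,  0,  0,  0,  0,  1,  3,  5), s02 = c(12, 10,  9,  9,  9,  9,  9,  9,  9),
  s07 = c( 2,  1,  2,  2,  0,  0,  0,  1,  0), s08 = c(63,  8,  8,  3,  5,  6,  8, 10,  6),
  s09 = c(10,  7,  4,  4,  3,  3,  4,  3,  3), s10 = c(16, 14,  8,  8,  6,  5,  9,  9, 11),
  s11 = c( 6,  6,  3,  2,  2,  2,  1,  2,  3), s13 = c(18,  9,  5,  2,  8,  9,  7,  7, 16),
  s14 = c(18,  9,  3,  1,  0,  0,  0,  0,  0), s16 = c( 9,  7,  5,  7,  4,  5,  6,  5,  6),
  s20 = c(13, 14, 14, 10,  9,  9, 10, 13, 14), s21 = c( 7,  2,  4,  0,  2,  1,  1,  0,  0),
  s22 = c(28,  2,  0,  1,  0, 11,  5,  3,  2), s26 = c( 3,  0,  0,  0,  0,  0,  0,  0,  0),
  s27 = c( 7,  0,  0,  0,  0,  0,  3,  3,  5), s28 = c( 8,  9,  4,  2,  1,  1,  1,  2,  0),
  s29 = c( 6,  6,  3,  2,  1,  1,  1,  3,  4), s32 = c(13, 10,  9,  7, 11,  8,  9,  9, 11),
  s33 = c( 1,  0,  2,  1,  3,  3,  1,  0,  0), s34 = c( 3,  3,  2,  2,  1,  2,  2,  1,  1),
  s35 = c(17, 14, 10,  7,  7,  3,  4,  4,  3))
agg <- c("n","n","n","y","y","y","n","y","y","y","y","y","y",
         "n","n","n","n","y","y","n","n")

# Long format (column-major: all cats for w0, then w1, ...)
cats <- data.frame(
  id = factor(rep(rownames(counts), times = 9)),
  ag = factor(rep(agg, times = 9)),
  wk = factor(rep(paste0("w", 0:8), each = nrow(counts))),
  ur = as.vector(counts))
cats$squr <- sqrt(cats$ur + 0.5)

# Repeated measures ANOVA: aggression is tested between subjects (id stratum),
# week and aggression x week within subjects (id:wk stratum).
# The Error() structure is assumed from the stratum names in the output.
fit2 <- aov(squr ~ ag * wk + Error(id/wk), data = cats)
summary(fit2)
```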

  4. Check diagnostics

    As with the split-plot design we consider diagnostics separately for assessment of the treatment factor A (between subjects) and for time and treatment × time (within subjects).

    Between subjects

    Using R
    Analysis of Variance Table
    
    Response: squr
              Df  Sum Sq Mean Sq F value  Pr(>F)
    ag         1  2.6984 2.69845  4.5901 0.04532 *
    Residuals 19 11.1698 0.58788
    ---

    Figme8.gif
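
    One way to set up the between-subjects check is to average the transformed counts for each cat and fit a one-way model to those 21 means, as sketched below. The object names subj and fit.b are our own, and we are assuming (not certain) that the table above was obtained in this way. With exact means, the sums of squares are one ninth of those in the id stratum and the F-ratio for aggression equals the between-subjects F-ratio; small differences from the printed table in the later decimal places would be expected if rounded means were used.

```r
# One row per cat: pre-treatment count (w0) followed by weeks 1 to 8
counts <- rbind(
  s01 = c( 7,  0,  0,  0,  0,  0,  1,  3,  5), s02 = c(12, 10,  9,  9,  9,  9,  9,  9,  9),
  s07 = c( 2,  1,  2,  2,  0,  0,  0,  1,  0), s08 = c(63,  8,  8,  3,  5,  6,  8, 10,  6),
  s09 = c(10,  7,  4,  4,  3,  3,  4,  3,  3), s10 = c(16, 14,  8,  8,  6,  5,  9,  9, 11),
  s11 = c( 6,  6,  3,  2,  2,  2,  1,  2,  3), s13 = c(18,  9,  5,  2,  8,  9,  7,  7, 16),
  s14 = c(18,  9,  3,  1,  0,  0,  0,  0,  0), s16 = c( 9,  7,  5,  7,  4,  5,  6,  5,  6),
  s20 = c(13, 14, 14, 10,  9,  9, 10, 13, 14), s21 = c( 7,  2,  4,  0,  2,  1,  1,  0,  0),
  s22 = c(28,  2,  0,  1,  0, 11,  5,  3,  2), s26 = c( 3,  0,  0,  0,  0,  0,  0,  0,  0),
  s27 = c( 7,  0,  0,  0,  0,  0,  3,  3,  5), s28 = c( 8,  9,  4,  2,  1,  1,  1,  2,  0),
  s29 = c( 6,  6,  3,  2,  1,  1,  1,  3,  4), s32 = c(13, 10,  9,  7, 11,  8,  9,  9, 11),
  s33 = c( 1,  0,  2,  1,  3,  3,  1,  0,  0), s34 = c( 3,  3,  2,  2,  1,  2,  2,  1,  1),
  s35 = c(17, 14, 10,  7,  7,  3,  4,  4,  3))
agg <- c("n","n","n","y","y","y","n","y","y","y","y","y","y",
         "n","n","n","n","y","y","n","n")

# Long format (column-major: all cats for w0, then w1, ...)
cats <- data.frame(
  id = factor(rep(rownames(counts), times = 9)),
  ag = factor(rep(agg, times = 9)),
  wk = factor(rep(paste0("w", 0:8), each = nrow(counts))),
  ur = as.vector(counts))
cats$squr <- sqrt(cats$ur + 0.5)

# Between subjects: one mean per cat, then a one-way model on aggression
subj  <- aggregate(squr ~ id + ag, data = cats, FUN = mean)
fit.b <- lm(squr ~ ag, data = subj)
anova(fit.b)

# Diagnostic plots for the between-subjects residuals
op <- par(mfrow = c(2, 2))
plot(fit.b)
par(op)
```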

    We will leave it to you to extract the second set of residuals and assess them.