 InfluentialPoints.com
"It has long been an axiom of mine that the little things are infinitely the most important" (Sherlock Holmes)

# Split-plot and repeated measures ANOVA

#### Worked example 1

Our first worked example looks at a long-term experiment to assess the effects of nitrogen application and thatch accumulation on the chlorophyll content of grass (you can find it analyzed using Minitab by Stephen Arnold, Penn State University).

The experiment was laid out as a split-plot design with two blocks (replications). Each block contained four main plots, each of which contained three subplots. The four levels of nitrogen application were randomly allocated to the main plots within each block. The three levels of thatch accumulation (2, 5 and 8 years) were randomly allocated to the subplots within each plot. It is unclear why a split-plot design was used for this; Arnold suggests it may have been to avoid the fertilizer blowing over onto other plots.

There is a small problem with the design: because thatch accumulation depends on time, this aspect is essentially unreplicated, and the results depend on the particular conditions during those periods. Theoretically it would be better to use a staggered start, but then of course the experiment would take even longer to complete!

Effect of nitrogen application and thatch accumulation on chlorophyll content of grass

| Nitrogen | Date | Block B1 | Block B2 |
|----------|------|----------|----------|
| N1 | D1 | 3.8 | 3.9 |
| N1 | D2 | 5.3 | 5.4 |
| N1 | D3 | 5.9 | 4.3 |
| N2 | D1 | 5.2 | 6.0 |
| N2 | D2 | 5.6 | 6.1 |
| N2 | D3 | 5.4 | 6.2 |
| N3 | D1 | 6.0 | 7.0 |
| N3 | D2 | 5.6 | 6.4 |
| N3 | D3 | 7.8 | 7.8 |
| N4 | D1 | 6.8 | 7.9 |
| N4 | D2 | 8.6 | 8.6 |
| N4 | D3 | 8.5 | 8.4 |
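To follow the analysis in R, the table can be typed in as one response vector plus three factors. The sketch below is one possible way to do this. The variable names `chlo`, `nitr`, `date` and `blck` match those in the R output on this page; the data frame name `grass` is only an illustrative assumption.

```r
# Chlorophyll readings in table order: nitrogen level n1..n4,
# then date d1..d3 within nitrogen, then block b1, b2 within date.
chlo <- c(3.8, 3.9, 5.3, 5.4, 5.9, 4.3,   # n1
          5.2, 6.0, 5.6, 6.1, 5.4, 6.2,   # n2
          6.0, 7.0, 5.6, 6.4, 7.8, 7.8,   # n3
          6.8, 7.9, 8.6, 8.6, 8.5, 8.4)   # n4
nitr <- factor(rep(c("n1", "n2", "n3", "n4"), each = 6))
date <- factor(rep(rep(c("d1", "d2", "d3"), each = 2), times = 4))
blck <- factor(rep(c("b1", "b2"), times = 12))

# 'grass' is a hypothetical name for the assembled data frame
grass <- data.frame(chlo, nitr, date, blck)

# Rebuild the layout of the table as a check on the data entry
print(ftable(xtabs(chlo ~ nitr + date + blck, data = grass)))

# Each block x nitrogen x date combination should occur exactly once
print(all(table(grass$nitr, grass$date, grass$blck) == 1))
```

Keeping the factors in a data frame makes it easy to pass the same data to the plotting and model-fitting functions used in the later steps.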

1. #### Draw boxplots and assess normality

Plot out data to get a visual assessment of the treatment and block effects, and assess how appropriate (parametric) ANOVA is for the set of data.
2. #### Draw interaction plots

If there were no interaction between nitrogen treatment and date, the three lines for the different dates should follow the same trends and be roughly parallel.
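The two plotting steps could be sketched in base R as follows. This is a minimal example rather than the code used for the original figures; the axis labels and panel layout are assumptions, and the data are re-entered so that the block runs on its own.

```r
# Data entry (table order: nitrogen, then date, then block)
chlo <- c(3.8, 3.9, 5.3, 5.4, 5.9, 4.3,
          5.2, 6.0, 5.6, 6.1, 5.4, 6.2,
          6.0, 7.0, 5.6, 6.4, 7.8, 7.8,
          6.8, 7.9, 8.6, 8.6, 8.5, 8.4)
nitr <- factor(rep(c("n1", "n2", "n3", "n4"), each = 6))
date <- factor(rep(rep(c("d1", "d2", "d3"), each = 2), times = 4))
blck <- factor(rep(c("b1", "b2"), times = 12))

# Cell means (nitrogen x date) that the interaction plot displays
cell <- tapply(chlo, list(nitr, date), mean)
print(cell)

op <- par(mfrow = c(2, 2))
# Boxplots of the response for each factor in turn
boxplot(chlo ~ nitr, xlab = "Nitrogen", ylab = "Chlorophyll")
boxplot(chlo ~ date, xlab = "Date", ylab = "Chlorophyll")
boxplot(chlo ~ blck, xlab = "Block", ylab = "Chlorophyll")
# One line per date across the nitrogen levels; roughly parallel
# lines would suggest little nitrogen x date interaction
interaction.plot(nitr, date, chlo, xlab = "Nitrogen",
                 ylab = "Mean chlorophyll", trace.label = "Date")
par(op)
```

With only two observations per nitrogen × date cell, the interaction plot is a plot of cell means and says nothing about their variability, so it is best read alongside the boxplots.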
3. #### Get table of means

Using R

    > tapply(chlo,nitr,mean)
          n1       n2       n3       n4
    4.766667 5.750000 6.766667 8.133333
    > tapply(chlo,date,mean)
        d1     d2     d3
    5.8250 6.4500 6.7875
    > tapply(chlo,blck,mean)
          b1       b2
    6.208333 6.500000

4. #### Carry out analysis of variance

Sums of squares can be calculated manually as follows:

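$$\mathrm{SS}_{\text{total}} = 1017.79 - \frac{152.5^2}{24} = 48.7796$$

$$\mathrm{SS}_{\text{blocks (S)}} = \frac{74.5^2}{12} + \frac{78^2}{12} - \frac{152.5^2}{24} = 0.510416$$

$$\mathrm{SS}_{\text{nitr (A)}} = \frac{28.6^2}{6} + \frac{34.5^2}{6} + \frac{40.6^2}{6} + \frac{48.8^2}{6} - \frac{152.5^2}{24} = 37.3246$$

$$\mathrm{SS}_{\text{Subgrps (A} \times \text{S)}} = \frac{15.0^2}{3} + \frac{16.2^2}{3} + \dots + \frac{24.9^2}{3} - \frac{152.5^2}{24} = 39.09292$$

$$\mathrm{SS}_{\text{Main plot error}} = 39.0929 - 37.3246 - 0.5104 = 1.2579$$

$$\mathrm{SS}_{\text{date (B)}} = \frac{46.6^2}{8} + \frac{51.6^2}{8} + \frac{54.3^2}{8} - \frac{152.5^2}{24} = 3.8158$$

$$\mathrm{SS}_{\text{Subgrps (A} \times \text{B)}} = \frac{7.7^2}{2} + \frac{11.2^2}{2} + \dots + \frac{16.9^2}{2} - \frac{152.5^2}{24} = 45.2946$$

$$\mathrm{SS}_{\text{A} \times \text{B}} = 45.2946 - 37.3246 - 3.8158 = 4.1542$$

$$\mathrm{SS}_{\text{Subplot error}} = 48.7796 - 0.5104 - 37.3246 - 1.2579 - 3.8158 - 4.1542 = 1.7167$$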
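The hand calculation can be checked in R by forming the same group totals with `tapply`. The sketch below mirrors the formulas step by step; the helper `ss_from_totals` and the `ss_*` object names are hypothetical, introduced only for this illustration.

```r
# Data entry (table order: nitrogen, then date, then block)
chlo <- c(3.8, 3.9, 5.3, 5.4, 5.9, 4.3,
          5.2, 6.0, 5.6, 6.1, 5.4, 6.2,
          6.0, 7.0, 5.6, 6.4, 7.8, 7.8,
          6.8, 7.9, 8.6, 8.6, 8.5, 8.4)
nitr <- factor(rep(c("n1", "n2", "n3", "n4"), each = 6))
date <- factor(rep(rep(c("d1", "d2", "d3"), each = 2), times = 4))
blck <- factor(rep(c("b1", "b2"), times = 12))

# Correction factor: (grand total)^2 / N
cf <- sum(chlo)^2 / length(chlo)

# Hypothetical helper: sum of (group total)^2 / (group size) minus
# the correction factor, for equal-sized groups of n observations
ss_from_totals <- function(totals, n) sum(totals^2) / n - cf

ss_total  <- sum(chlo^2) - cf
ss_blocks <- ss_from_totals(tapply(chlo, blck, sum), 12)
ss_nitr   <- ss_from_totals(tapply(chlo, nitr, sum), 6)
# Main plots are the nitrogen x block subgroups (3 subplots each)
ss_main   <- ss_from_totals(tapply(chlo, list(nitr, blck), sum), 3) -
             ss_nitr - ss_blocks
ss_date   <- ss_from_totals(tapply(chlo, date, sum), 8)
# Nitrogen x date subgroups hold 2 observations each (one per block)
ss_nd     <- ss_from_totals(tapply(chlo, list(nitr, date), sum), 2) -
             ss_nitr - ss_date
# Subplot error is whatever remains of the total
ss_sub    <- ss_total - ss_blocks - ss_nitr - ss_main - ss_date - ss_nd

print(round(c(total = ss_total, blocks = ss_blocks, nitr = ss_nitr,
              main_error = ss_main, date = ss_date,
              nitr_date = ss_nd, sub_error = ss_sub), 4))
```

The two error terms are obtained by subtraction, exactly as in the manual working, which is why the subgroup sums of squares are computed first.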

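ANOVA table

| Source of variation | df | SS | MS | F-ratio | P |
|---------------------|----|---------|---------|---------|--------|
| Blocks (S) | 1 | 0.5104 | 0.5104 | | |
| Nitrogen (A) | 3 | 37.3246 | 12.4415 | 29.67 | 0.0099 |
| Main plot error | 3 | 1.2579 | 0.4193 | | |
| Dates (B) | 2 | 3.8158 | 1.9079 | 8.89 | 0.0093 |
| A × B | 6 | 4.1542 | 0.6924 | 3.23 | 0.0646 |
| Subplot error | 8 | 1.7167 | 0.2146 | | |
| Total | 23 | 48.7796 | | | |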

Using R

    Error: blck
              Df  Sum Sq Mean Sq F value Pr(>F)
    Residuals  1 0.51042 0.51042

    Error: blck:nitr
              Df Sum Sq Mean Sq F value   Pr(>F)
    nitr       3 37.325  12.442  29.672 0.009896 **
    Residuals  3  1.258   0.419
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Error: blck:nitr:date
              Df Sum Sq Mean Sq F value  Pr(>F)
    date       2 3.8158  1.9079  8.8913 0.00927 **
    nitr:date  6 4.1542  0.6924  3.2265 0.06460 .
    Residuals  8 1.7167  0.2146
    ---
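The strata labels in this output (`blck`, `blck:nitr`, `blck:nitr:date`) suggest a call to `aov` with a nested `Error()` term. The exact call is not shown on this page, so the formula below is an assumption that is consistent with those labels; `grass` and `fit` are hypothetical object names.

```r
# Data entry (table order: nitrogen, then date, then block)
chlo <- c(3.8, 3.9, 5.3, 5.4, 5.9, 4.3,
          5.2, 6.0, 5.6, 6.1, 5.4, 6.2,
          6.0, 7.0, 5.6, 6.4, 7.8, 7.8,
          6.8, 7.9, 8.6, 8.6, 8.5, 8.4)
nitr <- factor(rep(c("n1", "n2", "n3", "n4"), each = 6))
date <- factor(rep(rep(c("d1", "d2", "d3"), each = 2), times = 4))
blck <- factor(rep(c("b1", "b2"), times = 12))
grass <- data.frame(chlo, nitr, date, blck)

# Main plots are nested in blocks and subplots in main plots,
# so each treatment term is tested in its own error stratum
fit <- aov(chlo ~ nitr * date + Error(blck / nitr / date), data = grass)
tab <- summary(fit)
print(tab)

# F-ratios pulled out of the stratum tables
f_nitr <- tab[["Error: blck:nitr"]][[1]][["F value"]][1]       # nitrogen vs main plot error
f_date <- tab[["Error: blck:nitr:date"]][[1]][["F value"]][1]  # date vs subplot error
print(round(c(nitr = f_nitr, date = f_date), 3))
```

Writing the error term as `Error(blck / nitr)` should give the same sums of squares and F-ratios, with the lowest stratum labelled `Within` instead.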

5. #### Check diagnostics

An immediate difficulty arises with checking diagnostics of this split-plot design - if you try plotting out the diagnostics using R, you will simply get the word NULL. This is not because you have done anything wrong! It is simply the result of having only one observation for each of the block × nitrogen × date combinations. The model you have fitted is known as the saturated model.

You can get an overview of the situation by fitting the general linear model and assuming that interactions with blocks are non-existent. The sums of squares for all main effects and the A × B interaction will still be correct, and you can assess the various diagnostics. Note, however, that all the F-ratios (and associated P-values) are incorrect because they are no longer using the correct error term.

Using R

    Response: chlo
              Df Sum Sq Mean Sq F value    Pr(>F)
    nitr       3 37.325  12.442 46.0087 1.627e-06 ***
    date       2  3.816   1.908  7.0555   0.01068 *
    blck       1  0.510   0.510  1.8875   0.19683
    nitr:date  6  4.154   0.692  2.5603   0.08396 .
    Residuals 11  2.975   0.270
    ---
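A model of this kind can be fitted with `lm`, which does return residuals that can be plotted. The sketch below takes the term order from the output above; the object name `fit_lm` and the 2 × 2 panel layout are assumptions.

```r
# Data entry (table order: nitrogen, then date, then block)
chlo <- c(3.8, 3.9, 5.3, 5.4, 5.9, 4.3,
          5.2, 6.0, 5.6, 6.1, 5.4, 6.2,
          6.0, 7.0, 5.6, 6.4, 7.8, 7.8,
          6.8, 7.9, 8.6, 8.6, 8.5, 8.4)
nitr <- factor(rep(c("n1", "n2", "n3", "n4"), each = 6))
date <- factor(rep(rep(c("d1", "d2", "d3"), each = 2), times = 4))
blck <- factor(rep(c("b1", "b2"), times = 12))
grass <- data.frame(chlo, nitr, date, blck)

# No interactions with blocks: both error terms are pooled
# into a single residual with 3 + 8 = 11 degrees of freedom
fit_lm <- lm(chlo ~ nitr + date + blck + nitr:date, data = grass)
tab_lm <- anova(fit_lm)
print(tab_lm)

# Standard diagnostic plots for the pooled-error model
op <- par(mfrow = c(2, 2))
plot(fit_lm)
par(op)
```

The pooled residual sum of squares equals the main plot error plus the subplot error (1.2579 + 1.7167), which is why the F-ratios from this fit should not be used for inference.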

A better approach to diagnostics with a split-plot design is to consider diagnostics separately for assessment of factor A (between mainplots) and for factor B and A × B (within mainplots).

##### Between mainplots

Using R

    Response: chlo
              Df  Sum Sq Mean Sq F value   Pr(>F)
    nitr       3 12.4539  4.1513 29.5239 0.009967 **
    blck       1  0.1691  0.1691  1.2024 0.352972
    Residuals  3  0.4218  0.1406
    ---
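One way to obtain a between-mainplots analysis is to average the three subplots within each main plot and fit a model to the eight means. This is a sketch under that assumption, not necessarily how the output above was produced: it reproduces the main-plot F-ratio of 29.67 from the ANOVA table, but its sums of squares need not agree with the printed output to the last digit. The names `main` and `fit_main` are hypothetical.

```r
# Data entry (table order: nitrogen, then date, then block)
chlo <- c(3.8, 3.9, 5.3, 5.4, 5.9, 4.3,
          5.2, 6.0, 5.6, 6.1, 5.4, 6.2,
          6.0, 7.0, 5.6, 6.4, 7.8, 7.8,
          6.8, 7.9, 8.6, 8.6, 8.5, 8.4)
nitr <- factor(rep(c("n1", "n2", "n3", "n4"), each = 6))
date <- factor(rep(rep(c("d1", "d2", "d3"), each = 2), times = 4))
blck <- factor(rep(c("b1", "b2"), times = 12))
grass <- data.frame(chlo, nitr, date, blck)

# One mean per main plot (nitrogen x block): 8 values in all
main <- aggregate(chlo ~ nitr + blck, data = grass, FUN = mean)

# Nitrogen is tested against the main plot error (3 df)
fit_main <- lm(chlo ~ nitr + blck, data = main)
tab_main <- anova(fit_main)
print(tab_main)

# Residuals for the between-mainplots diagnostics
res_main <- resid(fit_main)
op <- par(mfrow = c(1, 2))
plot(fitted(fit_main), res_main, xlab = "Fitted", ylab = "Residual")
abline(h = 0, lty = 2)
qqnorm(res_main)
qqline(res_main)
par(op)
```

With only eight main plot means and three residual degrees of freedom, these plots can reveal gross problems at best.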

#### Worked example 2

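We take our second example from Ogata & Takeuchi et al (2001) on a trial of a feline pheromone analogue to reduce the frequency of urine marking by cats. We previously compared the number of markings pre-treatment and one week post-treatment using the non-parametric Wilcoxon's matched pairs signed ranks test. But the authors also used repeated measures analysis of variance to examine the different courses of urine marking over time relative to aggression status.

The data for multicat households (where aggression can be assessed) are given below (complete data sets only).

No. urine markings pre and post treatment

| No. | Sex | Agg | Pre | W1 | W2 | W3 | W4 | W5 | W6 | W7 | W8 |
|-----|-----|-----|-----|----|----|----|----|----|----|----|----|
| 01 | m | n | 7 | 0 | 0 | 0 | 0 | 0 | 1 | 3 | 5 |
| 02 | m | n | 12 | 10 | 9 | 9 | 9 | 9 | 9 | 9 | 9 |
| 07 | m | n | 2 | 1 | 2 | 2 | 0 | 0 | 0 | 1 | 0 |
| 08 | m | y | 63 | 8 | 8 | 3 | 5 | 6 | 8 | 10 | 6 |
| 09 | m | y | 10 | 7 | 4 | 4 | 3 | 3 | 4 | 3 | 3 |
| 10 | m | y | 16 | 14 | 8 | 8 | 6 | 5 | 9 | 9 | 11 |
| 11 | m | n | 6 | 6 | 3 | 2 | 2 | 2 | 1 | 2 | 3 |
| 13 | m | y | 18 | 9 | 5 | 2 | 8 | 9 | 7 | 7 | 16 |
| 14 | m | y | 18 | 9 | 3 | 1 | 0 | 0 | 0 | 0 | 0 |
| 16 | m | y | 9 | 7 | 5 | 7 | 4 | 5 | 6 | 5 | 6 |
| 20 | m | y | 13 | 14 | 14 | 10 | 9 | 9 | 10 | 13 | 14 |
| 21 | m | y | 7 | 2 | 4 | 0 | 2 | 1 | 1 | 0 | 0 |
| 22 | m | y | 28 | 2 | 0 | 1 | 0 | 11 | 5 | 3 | 2 |
| 26 | f | n | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 27 | f | n | 7 | 0 | 0 | 0 | 0 | 0 | 3 | 3 | 5 |
| 28 | f | n | 8 | 9 | 4 | 2 | 1 | 1 | 1 | 2 | 0 |
| 29 | f | n | 6 | 6 | 3 | 2 | 1 | 1 | 1 | 3 | 4 |
| 32 | f | y | 13 | 10 | 9 | 7 | 11 | 8 | 9 | 9 | 11 |
| 33 | f | y | 1 | 0 | 2 | 1 | 3 | 3 | 1 | 0 | 0 |
| 34 | f | n | 3 | 3 | 2 | 2 | 1 | 2 | 2 | 1 | 1 |
| 35 | f | n | 17 | 14 | 10 | 7 | 7 | 3 | 4 | 4 | 3 |

Sex: m = male, f = female. Agg: y = aggressive, n = not aggressive. Pre = pre-treatment count; W1 to W8 = weeks 1 to 8 post-treatment.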
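For analysis in R the counts need to be in long format, with one row per cat per week. The sketch below shows one possible way to enter and reshape the data. The variable names `id`, `ag`, `wk`, `ur` and `squr`, and the labels `s01` and `w0` to `w8`, follow the R output on this page; the data frame names `cats` and `long` are assumptions.

```r
# Wide table as given: one row per cat, w0 = pre-treatment count
cats <- read.table(header = TRUE,
  colClasses = c(rep("character", 3), rep("numeric", 9)),
  text = "
id  sex ag w0 w1 w2 w3 w4 w5 w6 w7 w8
s01 m   n   7  0  0  0  0  0  1  3  5
s02 m   n  12 10  9  9  9  9  9  9  9
s07 m   n   2  1  2  2  0  0  0  1  0
s08 m   y  63  8  8  3  5  6  8 10  6
s09 m   y  10  7  4  4  3  3  4  3  3
s10 m   y  16 14  8  8  6  5  9  9 11
s11 m   n   6  6  3  2  2  2  1  2  3
s13 m   y  18  9  5  2  8  9  7  7 16
s14 m   y  18  9  3  1  0  0  0  0  0
s16 m   y   9  7  5  7  4  5  6  5  6
s20 m   y  13 14 14 10  9  9 10 13 14
s21 m   y   7  2  4  0  2  1  1  0  0
s22 m   y  28  2  0  1  0 11  5  3  2
s26 f   n   3  0  0  0  0  0  0  0  0
s27 f   n   7  0  0  0  0  0  3  3  5
s28 f   n   8  9  4  2  1  1  1  2  0
s29 f   n   6  6  3  2  1  1  1  3  4
s32 f   y  13 10  9  7 11  8  9  9 11
s33 f   y   1  0  2  1  3  3  1  0  0
s34 f   n   3  3  2  2  1  2  2  1  1
s35 f   n  17 14 10  7  7  3  4  4  3
")

# Stack the nine week columns into one response column
weeks <- paste0("w", 0:8)
long <- data.frame(
  id = factor(rep(cats$id, times = length(weeks))),
  ag = factor(rep(cats$ag, times = length(weeks))),
  wk = factor(rep(weeks, each = nrow(cats))),
  ur = unlist(cats[weeks], use.names = FALSE)
)

# Square root transform with 0.5 added because of the many zeros
long$squr <- sqrt(long$ur + 0.5)

print(round(tapply(long$squr, long$ag, mean), 3))
print(round(tapply(long$squr, long$wk, mean), 3))
```

Because `unlist` stacks the columns week by week, the `id` and `ag` labels are repeated once per week and the week label is repeated once per cat, which keeps the three columns aligned.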

One's first reaction may be (or perhaps should be) that non-parametric analysis would be a much wiser approach given the patently non-normal distribution of the response variable. However, we will attempt an analysis after a transformation.

1. #### Draw boxplots and assess normality

Plot out data to get a visual assessment of the treatment and block effects, and assess how appropriate (parametric) ANOVA is for the set of data.

As expected for count data, the distribution of the raw data within groups does not approximate to normal - instead the distributions are right skewed. We try a square root transformation (or to be more precise a √(Y + 0.5) transform given the large number of zeros) as a possible normalizing function for small whole numbered counts.

This looks more hopeful - most of the groups have more or less symmetrical distributions, albeit still with a few high outliers. A log transformation brought the high outliers down a little more, but at the cost of making several distributions left-skewed. Hence we proceed with the analysis on the square root transformed data, bearing in mind that we need to examine diagnostics carefully after model fitting. The interaction plot for aggression and week suggests similar trends over time for both aggressive and non-aggressive cats.

2. #### Get table of means

Using R

    > round(tapply(sqrt(ur+0.5),id,mean),3)
      s01   s02   s07   s08   s09   s10   s11   s13   s14   s16   s20   s21   s22
    1.302 3.150 1.113 3.252 2.203 3.128 1.821 2.983 1.557 2.535 3.491 1.399 2.083
      s26   s27   s28   s29   s32   s33   s34   s35
    0.836 1.373 1.740 1.802 3.178 1.235 1.527 2.744
    > round(tapply(sqrt(ur+0.5),ag,mean),3)
        n     y
    1.741 2.459
    > round(tapply(sqrt(ur+0.5),wk,mean),3)
       w0    w1    w2    w3    w4    w5    w6    w7    w8
    3.340 2.368 2.061 1.783 1.764 1.847 1.914 1.960 2.015

3. #### Carry out analysis of variance

Sums of squares can be calculated manually as follows:

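$$\mathrm{SS}_{\text{total}} = 1075.5 - \frac{400.0776^2}{189} = 228.6107$$

$$\mathrm{SS}_{\text{subjects (S)}} = \frac{11.7149^2}{9} + \dots + \frac{24.6931^2}{9} - \frac{400.0776^2}{189} = 124.7742$$

$$\mathrm{SS}_{\text{aggression (A)}} = \frac{156.6798^2}{90} + \frac{243.3978^2}{99} - \frac{400.0776^2}{189} = 24.2815$$

$$\mathrm{SS}_{\text{S(A)}} = 124.7742 - 24.2815 = 100.493$$

$$\mathrm{SS}_{\text{week (B)}} = \frac{70.1393^2}{21} + \dots + \frac{42.3062^2}{21} - \frac{400.0776^2}{189} = 40.897$$

$$\mathrm{SS}_{\text{Subgrps (A} \times \text{B)}} = \frac{26.5334^2}{10} + \dots + \frac{25.3245^2}{11} - \frac{400.0776^2}{189} = 68.5015$$

$$\mathrm{SS}_{\text{A} \times \text{B}} = 68.5015 - 24.2815 - 40.897 = 3.323$$

$$\mathrm{SS}_{\text{residual}} = 228.6107 - 24.2815 - 100.493 - 40.897 - 3.323 = 59.616$$

Each subject total is divided by 9 (the number of weeks per cat), each week total by 21 (the number of cats), and each aggression × week subgroup total by 10 or 11 (the number of non-aggressive and aggressive cats respectively).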

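ANOVA table

| Source of variation | df | SS | MS | F-ratio | P |
|---------------------|-----|----------|---------|---------|----------|
| Aggression (A) | 1 | 24.2815 | 24.2815 | 4.5908 | 0.0453 |
| Subjects within aggression | 19 | 100.493 | 5.2891 | | |
| Weeks (B) | 8 | 40.897 | 5.1121 | 13.0341 | < 0.0001 |
| A × B | 8 | 3.323 | 0.4154 | 1.0591 | 0.3948 |
| Residuals | 152 | 59.616 | 0.3922 | | |
| Total | 188 | 228.6107 | | | |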

Using R

    Error: id
              Df  Sum Sq Mean Sq F value  Pr(>F)
    ag         1  24.281 24.2814  4.5908 0.04531 *
    Residuals 19 100.493  5.2891
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Error: id:wk
               Df Sum Sq Mean Sq F value    Pr(>F)
    wk          8 40.897  5.1121 13.0341 3.236e-14 ***
    ag:wk       8  3.323  0.4154  1.0591    0.3948
    Residuals 152 59.616  0.3922
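The strata labels `id` and `id:wk` in this output suggest an `aov` call with weeks nested within cats in the `Error()` term. The call itself is not shown on this page, so the formula below is an assumption consistent with those labels; `cats`, `long` and `fit_rm` are hypothetical object names, and the data are re-entered so that the block runs on its own.

```r
cats <- read.table(header = TRUE,
  colClasses = c(rep("character", 3), rep("numeric", 9)),
  text = "
id  sex ag w0 w1 w2 w3 w4 w5 w6 w7 w8
s01 m   n   7  0  0  0  0  0  1  3  5
s02 m   n  12 10  9  9  9  9  9  9  9
s07 m   n   2  1  2  2  0  0  0  1  0
s08 m   y  63  8  8  3  5  6  8 10  6
s09 m   y  10  7  4  4  3  3  4  3  3
s10 m   y  16 14  8  8  6  5  9  9 11
s11 m   n   6  6  3  2  2  2  1  2  3
s13 m   y  18  9  5  2  8  9  7  7 16
s14 m   y  18  9  3  1  0  0  0  0  0
s16 m   y   9  7  5  7  4  5  6  5  6
s20 m   y  13 14 14 10  9  9 10 13 14
s21 m   y   7  2  4  0  2  1  1  0  0
s22 m   y  28  2  0  1  0 11  5  3  2
s26 f   n   3  0  0  0  0  0  0  0  0
s27 f   n   7  0  0  0  0  0  3  3  5
s28 f   n   8  9  4  2  1  1  1  2  0
s29 f   n   6  6  3  2  1  1  1  3  4
s32 f   y  13 10  9  7 11  8  9  9 11
s33 f   y   1  0  2  1  3  3  1  0  0
s34 f   n   3  3  2  2  1  2  2  1  1
s35 f   n  17 14 10  7  7  3  4  4  3
")
weeks <- paste0("w", 0:8)
long <- data.frame(
  id = factor(rep(cats$id, times = length(weeks))),
  ag = factor(rep(cats$ag, times = length(weeks))),
  wk = factor(rep(weeks, each = nrow(cats))),
  ur = unlist(cats[weeks], use.names = FALSE)
)
long$squr <- sqrt(long$ur + 0.5)

# Aggression is tested between cats; week and the aggression x week
# interaction are tested within cats
fit_rm <- aov(squr ~ ag * wk + Error(id / wk), data = long)
tab_rm <- summary(fit_rm)
print(tab_rm)

f_ag <- tab_rm[["Error: id"]][[1]][["F value"]][1]     # between subjects
f_wk <- tab_rm[["Error: id:wk"]][[1]][["F value"]][1]  # within subjects
print(round(c(ag = f_ag, wk = f_wk), 4))
```

Note that this analysis treats the nine weekly measurements on a cat as exchangeable; it does not by itself check that assumption.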

4. #### Check diagnostics

As with the split-plot design we consider diagnostics separately for assessment of the treatment factor A (between subjects) and for time and treatment × time (within subjects).

##### Between subjects

Using R

    Analysis of Variance Table

    Response: squr
              Df  Sum Sq Mean Sq F value  Pr(>F)
    ag         1  2.6984 2.69845  4.5901 0.04532 *
    Residuals 19 11.1698 0.58788
    ---
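A between-subjects table of this form can be obtained by averaging the transformed counts over the nine weeks for each cat and fitting a one-way model to the 21 means. The sketch below assumes that approach; `cats`, `long`, `subj` and `fit_between` are hypothetical names, and small differences from the printed figures may arise from rounding of the means.

```r
cats <- read.table(header = TRUE,
  colClasses = c(rep("character", 3), rep("numeric", 9)),
  text = "
id  sex ag w0 w1 w2 w3 w4 w5 w6 w7 w8
s01 m   n   7  0  0  0  0  0  1  3  5
s02 m   n  12 10  9  9  9  9  9  9  9
s07 m   n   2  1  2  2  0  0  0  1  0
s08 m   y  63  8  8  3  5  6  8 10  6
s09 m   y  10  7  4  4  3  3  4  3  3
s10 m   y  16 14  8  8  6  5  9  9 11
s11 m   n   6  6  3  2  2  2  1  2  3
s13 m   y  18  9  5  2  8  9  7  7 16
s14 m   y  18  9  3  1  0  0  0  0  0
s16 m   y   9  7  5  7  4  5  6  5  6
s20 m   y  13 14 14 10  9  9 10 13 14
s21 m   y   7  2  4  0  2  1  1  0  0
s22 m   y  28  2  0  1  0 11  5  3  2
s26 f   n   3  0  0  0  0  0  0  0  0
s27 f   n   7  0  0  0  0  0  3  3  5
s28 f   n   8  9  4  2  1  1  1  2  0
s29 f   n   6  6  3  2  1  1  1  3  4
s32 f   y  13 10  9  7 11  8  9  9 11
s33 f   y   1  0  2  1  3  3  1  0  0
s34 f   n   3  3  2  2  1  2  2  1  1
s35 f   n  17 14 10  7  7  3  4  4  3
")
weeks <- paste0("w", 0:8)
long <- data.frame(
  id = factor(rep(cats$id, times = length(weeks))),
  ag = factor(rep(cats$ag, times = length(weeks))),
  wk = factor(rep(weeks, each = nrow(cats))),
  ur = unlist(cats[weeks], use.names = FALSE)
)
long$squr <- sqrt(long$ur + 0.5)

# One mean per cat, keeping its aggression status
subj <- aggregate(squr ~ id + ag, data = long, FUN = mean)

# One-way model on the subject means
fit_between <- lm(squr ~ ag, data = subj)
tab_between <- anova(fit_between)
print(tab_between)

# Between-subjects residuals for the usual diagnostic plots
res_between <- resid(fit_between)
op <- par(mfrow = c(1, 2))
plot(fitted(fit_between), res_between, xlab = "Fitted", ylab = "Residual")
abline(h = 0, lty = 2)
qqnorm(res_between)
qqline(res_between)
par(op)
```

The F-ratio for aggression from this fit should agree with the between-subjects stratum of the repeated measures table, since every cat contributes the same number of weeks.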

We will leave it to you to extract the second set of residuals and assess them.

 Except where otherwise specified, all text and images on this page are copyright InfluentialPoints, all rights reserved. Images not copyright InfluentialPoints credit their source on web-pages attached via hypertext links from those images.