"It has long been an axiom of mine that the little things are infinitely the most important"
Split plot & repeated measures ANOVA: Use & misuse
(partially nested designs, analysis of variance, interactions confounded, subjects × trials, subjects × treatments, sphericity, linear mixed effects model)
Statistics courses, especially for biologists, assume formulae = understanding and teach how to do statistics, but largely ignore what those procedures assume, and how their results mislead when those assumptions are unreasonable. The resulting misuse is, shall we say, predictable...
Use and Misuse
We deal with split-plot and repeated measures designs together because both can be described as partially nested designs, and both are commonly analyzed with the same family of linear models. Hence you may find data from a repeated measures design being analyzed with a 'split-plot' analysis of variance. Despite sharing a family of models, the two types of design differ in important ways, especially in relation to randomization and assumptions.
The principle of a split-plot design is that different treatments are assigned randomly to sampling units at different scales. Levels of factor A are assigned to main plots (usually termed blocks), whilst levels of factor B are assigned to plots within each block. Levels of factor C may be assigned to subplots within each plot - and so on. There is commonly only one observation of each treatment combination within a particular sampling unit. This means that interactions between the treatment factors and sampling units are confounded (cannot be tested), which has led some statisticians to criticize this design.
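The two-stage randomization described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function name `split_plot_layout`, the level labels and the balanced allocation of factor A across main plots are all assumptions made for the example.

```python
import random


def split_plot_layout(n_mainplots, levels_a, levels_b, seed=None):
    """Return (mainplot, plot, level_a, level_b) tuples for a split-plot layout.

    Hypothetical helper: factor A is randomized across main plots (blocks),
    then factor B is randomized separately among the plots of each main plot.
    """
    if n_mainplots % len(levels_a) != 0:
        raise ValueError("n_mainplots must be a multiple of the number of A levels")
    rng = random.Random(seed)

    # Stage 1: each main plot receives exactly one level of factor A.
    a_by_mainplot = list(levels_a) * (n_mainplots // len(levels_a))
    rng.shuffle(a_by_mainplot)

    layout = []
    for mainplot, level_a in enumerate(a_by_mainplot, start=1):
        # Stage 2: a fresh randomization of factor B within this main plot.
        b_order = list(levels_b)
        rng.shuffle(b_order)
        for plot, level_b in enumerate(b_order, start=1):
            layout.append((mainplot, plot, level_a, level_b))
    return layout


if __name__ == "__main__":
    for row in split_plot_layout(4, ["A1", "A2"], ["B1", "B2", "B3"], seed=1):
        print(row)
```

Each treatment combination appears only once within a main plot, which is why the interaction between treatments and sampling units cannot be separated from residual variation.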
In repeated measures designs, repeated measurements are made on the same experimental unit at successive points in time. Two types can be distinguished. The first type is the subjects × trials design, where repeated measurements are made at the same times on groups of individuals receiving different treatments. Here time is a factor in the experiment and the order of that factor cannot be randomized. Analysis of this design is identical to that of the split-plot design with subjects equal to blocks - but there is no randomization of factor B (time period). The second type is the subjects × treatments design, which includes the two-period crossover design and the Latin square repeated measures design. In both types, observations on the same individuals in a time series are often correlated. In this case a further assumption must be met for ANOVA, namely that of compound symmetry or sphericity. Sphericity holds when the variances of the differences between all pairs of treatment levels are equal; compound symmetry (equal variances and equal covariances) is a stricter condition that implies sphericity.
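The definition of sphericity given above can be inspected directly by computing the variance of the within-subject differences for every pair of levels. The sketch below is descriptive only and is not a formal test; the function name `difference_variances` and the data layout (one list per level, with subjects in the same order) are assumptions made for illustration.

```python
from itertools import combinations
from statistics import variance


def difference_variances(data):
    """Sample variance of within-subject differences for each pair of levels.

    `data` maps a level name to a list of measurements, with subjects listed
    in the same order under every level (hypothetical layout).
    """
    lengths = {len(values) for values in data.values()}
    if len(lengths) != 1:
        raise ValueError("every level needs one measurement per subject")

    result = {}
    for a, b in combinations(data, 2):
        diffs = [x - y for x, y in zip(data[a], data[b])]
        result[(a, b)] = variance(diffs)  # n - 1 denominator
    return result
```

If the variances returned for the different pairs are very unequal, the sphericity assumption is doubtful and an unadjusted repeated measures ANOVA should not be taken at face value.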
Repeated measures ANOVA is still widely used in many disciplines, including the medical sciences, although in recent years the linear mixed effects model has displaced ANOVA as the predominant method for repeated measures analysis.
The assumption of sphericity (or compound symmetry) was generally correctly checked in the studies we looked at, both in the medical (effectiveness of repeated four-monthly albendazole treatments in young children), and ecological (substratum preference of a burrowing isopod) papers. This was less true for veterinary studies (for example on the effect of growth promoters on hormone levels in cattle), where the matter was sometimes ignored despite treatment order not being randomized. Where there is no randomization of treatment order or where time itself is the repeated measures factor, then sphericity really must be assessed or the Type I error rate will be inflated. Even where sphericity was correctly checked, the other ANOVA assumptions may not have been checked. This is especially a problem when ANOVA is used for ordinal data (as is often the case in medical research).
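One widely used summary of how far data depart from sphericity is the Greenhouse-Geisser epsilon. It is not named in the text above and is offered here only as an illustration of what "assessing sphericity" can involve. Epsilon equals 1 when sphericity holds exactly and falls towards 1/(k - 1) as the departure grows, where k is the number of repeated levels. The sketch assumes a subjects-by-levels table of complete data; both function names are hypothetical.

```python
def covariance_matrix(rows):
    """Sample covariance matrix of the k repeated measurements.

    `rows` holds one list of k measurements per subject (assumed complete).
    """
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    return [
        [
            sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
            for j in range(k)
        ]
        for i in range(k)
    ]


def greenhouse_geisser_epsilon(cov):
    """Greenhouse-Geisser epsilon from a symmetric k x k covariance matrix."""
    k = len(cov)
    row_means = [sum(row) / k for row in cov]
    grand_mean = sum(row_means) / k

    # Double-centre the matrix: subtract row and column means, add grand mean.
    centred = [
        [cov[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(k)]
        for i in range(k)
    ]
    trace = sum(centred[i][i] for i in range(k))
    sum_squares = sum(value * value for row in centred for value in row)
    if sum_squares == 0:
        raise ValueError("no within-subject variation to assess")
    return trace ** 2 / ((k - 1) * sum_squares)
```

In the adjusted univariate approach, both degrees of freedom of the F-test for the repeated factor are multiplied by epsilon, which offsets the inflation of the Type I error rate.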
Interpretation of interactions with time seems to pose bigger challenges to the struggling research worker. If there is a significant interaction between time and treatment, then one should only compare treatments within time. Yet in one experiment on the efficacy of yoga for reducing anxiety it was concluded that there was an overall decrease over time, even when this was only the case for the yoga group. A similar problem arose in a study of the effects of repeated albendazole treatments in young children where overall means were compared despite treated and untreated groups starting from the same baseline. When interpreting ANOVA results, one should always start with the interactions - not the main effects. But as we noted in our comments on factorial ANOVA, there is little awareness that the test for interaction has much lower power than the tests for main effects. For example in a veterinary trial on the effects of a probiotic supplement on weight gain by goats, an interaction was ignored despite the very small sample sizes.
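The point about comparing within time, rather than reporting an overall change, can be made concrete by tabulating the treatment × time cell means. The sketch below uses invented scores for two hypothetical groups; the function names and the data are assumptions for illustration and are not taken from any of the studies mentioned.

```python
from collections import defaultdict
from statistics import mean


def cell_means(records):
    """Mean response for each (group, time) cell.

    `records` is a list of (group, time, value) tuples.
    """
    cells = defaultdict(list)
    for group, time, value in records:
        cells[(group, time)].append(value)
    return {key: mean(values) for key, values in cells.items()}


def change_by_group(records, first, last):
    """Change in the cell mean from time `first` to time `last`, per group."""
    means = cell_means(records)
    groups = sorted({group for group, _, _ in records})
    return {g: means[(g, last)] - means[(g, first)] for g in groups}


if __name__ == "__main__":
    # Invented anxiety scores: only the "yoga" group changes over time.
    records = [
        ("control", "pre", 30), ("control", "pre", 32),
        ("control", "post", 31), ("control", "post", 31),
        ("yoga", "pre", 30), ("yoga", "pre", 32),
        ("yoga", "post", 24), ("yoga", "post", 26),
    ]
    print(change_by_group(records, "pre", "post"))
```

With these invented data the pooled mean falls between the two time points, yet the whole of that fall comes from one group. An 'overall decrease over time' would therefore misdescribe the result.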
The linear mixed effects model is being increasingly used - although how well it is being used is another matter. One should specify which covariance structure is being used in the mixed model, yet in neither of the two examples we found was this done. In one study on the effects of sweeteners on feed intake of pigs, the authors failed to demonstrate a significant interaction between treatment and time (P = 0.717), yet did identify significant differences between treatments on some days. Their conclusion was that 'sweeteners only affect feed intake characteristics to a limited extent' when in fact they had failed to show any effect.
What the statisticians say
Davis (2002) provides a comprehensive account of methods for analysis of repeated measures data. Older texts specifically on analysis of repeated measures include Crowder & Hand (1990) and Hand & Taylor (1987). Analysis of repeated measures using ANOVA, MANOVA and the linear mixed effects model in R is covered by Logan (2010) and Crawley (2005, 2007). Doncaster & Davey (2007) consider split-plot and repeated measures designs in Chapters 5 & 6. Further texts for ecological researchers include Quinn & Keough (2002) in Chapter 11, and Underwood (1997) in Chapter 12.
Sullivan (2008) and Kusuoka & Hoffman (2002) give useful advice for medical researchers on the use of ANOVA in circulation research. Fitzmaurice & Ravichandran (2008) provide an excellent primer on the use of ANOVA and the linear mixed effects model for repeated measures data. Frison & Pocock (2007) recommend analysis of repeated measures in clinical trials using mean summary statistics and analysis of covariance. Vickers (2005) criticizes the use of analysis of variance (especially repeated measures) in the analysis of randomized trials. Gueorguieva & Krystal (2004) and Quené & van den Bergh (2004) advocate the use of linear mixed effects models rather than ANOVA for analyzing repeated measures data.
Keselman et al. (2001) and Keselman (1998) review the analysis of repeated measures designs in the behavioural sciences. Looney & Stanley (1989) and Vasey & Thayer (1987) focus on the use of adjusted univariate ANOVA and MANOVA for repeated measures. Early work on analysis of repeated measures designs in the behavioural sciences includes McCall & Appelbaum (1973) and Rouanet & Lépine (1970). Smart et al. (2008) give advice on the statistical analysis of crossover experiments in aquaculture research. Loughin (2007) considers improved experimental design and analysis for long-term experiments. St-Pierre (2006) explains why pen studies have an implicit split-plot design in which the main plots (pens) receive the treatment of interest, whereas the subplots (cows) all receive the same subplot treatment. Pilla (2005) proposes use of a split-plot design for laboratory experiments run using restricted randomization of replicates within blocks. Paterson & Lello (2003) review the use of mixed models for analyzing repeated measures parasitological data.
Wilcox et al. (2000) and Keselman et al. (2000) examine robust approaches to repeated measures ANOVA using trimmed means and bootstrapping. Littell et al. (1998) look at the statistical analysis of repeated measures data using SAS procedures. Littell (2002) compares the use of ANOVA versus likelihood-based methods for unbalanced mixed model data. Field (1998) provides a bluffer's guide to sphericity. Scariano & Davenport (1987) describe the effects that departures from independence have on hypothesis testing in analysis of variance. Patterson (1951) reviews change-over trials.
Wikipedia provides sections on restricted randomization and Mauchly's test for sphericity. Tom Baguley provides an excellent introduction to compound symmetry and sphericity. Andy Field looks at the joys of sphericity and analysis of repeated measures in SAS. Crossover designs are covered by Gerard Dallal and R.I. Cue.