 InfluentialPoints.com

# Fully replicated factorial ANOVA

#### Worked example 1

Our first worked example looks at an experiment to assess whether feral pigs had become bait-shy to the poison sodium monofluoroacetate (1080). The data are from Hone & Kleba (1984); you can also find them analyzed using SAS at University of Canberra (Biometrics).

A number of feral pigs were captured in an area where the poison 1080 had previously been used to control feral pigs and rabbits. 10 male and 10 female pigs were selected to be of approximately the same age and weight. Two male pigs were randomly allocated to each of five pens and two female pigs were randomly allocated to each of the remaining five pens.

On Day 1, all pigs were offered wheat only and their intake was recorded. On Day 2, they were offered one of the following: (a) Wheat only (b) Wheat and water (c) Wheat, water and dye (d) Wheat, water and 1080 (e) Wheat, water, dye and 1080, and intake was again recorded. The response variable was the change in intake in kg. If the pigs in the chosen area had become bait-shy, it was hypothesized that intake of the mixtures containing 1080 would be reduced relative to the controls.

Change in bait intake (kg) by feral pigs; each cell holds the two replicate observations

| Gender | Wheat | Wheat & water | Wheat, water & dye | Wheat, water & 1080 | Wheat, water, dye & 1080 |
|---|---|---|---|---|---|
| Male | 0.188, -0.058 | 0.050, -0.138 | 0.058, -0.082 | -0.712, -1.280 | -0.610, -0.830 |
| Female | -0.280, -0.062 | -0.540, -0.336 | -0.260, -0.123 | -0.894, -0.672 | -0.837, -1.202 |
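To follow the analysis in R, the data first have to be entered. The sketch below is one way to do it, not necessarily how the original analysis was set up. The names resp, bait and gend and the level labels A1 to A5, F and M are taken from the R output shown further down this page; the row order (bait by bait, two males then two females) and the mapping of A1 to A5 onto treatments (a) to (e) are our assumptions.

```r
# Change in intake (kg), entered bait by bait: two males, then two females.
# ASSUMPTIONS: the row order, and that A1..A5 correspond to treatments (a)..(e).
resp <- c( 0.188, -0.058, -0.280, -0.062,   # A1 wheat
           0.050, -0.138, -0.540, -0.336,   # A2 wheat & water
           0.058, -0.082, -0.260, -0.123,   # A3 wheat, water & dye
          -0.712, -1.280, -0.894, -0.672,   # A4 wheat, water & 1080
          -0.610, -0.830, -0.837, -1.202)   # A5 wheat, water, dye & 1080
bait <- factor(rep(paste0("A", 1:5), each = 4))
gend <- factor(rep(rep(c("M", "F"), each = 2), times = 5))

pigs <- data.frame(resp, bait, gend)
str(pigs)
table(bait, gend)   # a fully replicated design: every cell should hold 2 pigs
sum(resp)           # grand total, used later in the manual sums of squares
```

Checking the cell counts before going any further is a cheap way to catch data-entry slips in a factorial layout.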

1. #### Draw boxplots and assess normality

Plot out the data to get a visual assessment of the bait treatment and gender effects, and to assess how appropriate (parametric) ANOVA is for this set of data.

There is no evidence of any gross violations of assumptions re homogeneity of variances and normality from these plots although we will reconsider this when we look at model diagnostics. We also note that both factors may be affecting the response variable.
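Boxplots of this kind can be drawn with something like the following. This is a minimal sketch under our own assumptions about data entry (row order and the A1 to A5 labelling); the axis labels are ours.

```r
# Data as listed in the table above (row order is an assumption).
resp <- c( 0.188, -0.058, -0.280, -0.062,   # A1: M, M, F, F
           0.050, -0.138, -0.540, -0.336,   # A2
           0.058, -0.082, -0.260, -0.123,   # A3
          -0.712, -1.280, -0.894, -0.672,   # A4
          -0.610, -0.830, -0.837, -1.202)   # A5
bait <- factor(rep(paste0("A", 1:5), each = 4))
gend <- factor(rep(rep(c("M", "F"), each = 2), times = 5))

# One panel per factor, side by side.
op <- par(mfrow = c(1, 2))
boxplot(resp ~ bait, xlab = "Bait treatment", ylab = "Change in intake (kg)")
boxplot(resp ~ gend, xlab = "Gender", ylab = "Change in intake (kg)")
par(op)   # restore the previous graphics settings
```

With only four observations per bait level, the boxes are a rough guide to spread and symmetry rather than a formal check.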

2. #### Draw interaction plot

If there were no interaction between bait treatment and gender, the two lines for the different genders should follow the same trends and be roughly parallel.

There is no clear evidence of any interaction between bait treatment and gender, but the lines do cross so we should not be too surprised if the analysis reveals some indication of interaction between the factors. Since we have independent replicates of each combination, we can test for this - although with only two replicates for each combination, we have very little power for this test!
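One way to draw such a plot is with interaction.plot(), which plots the mean response for each combination of the two factors. The sketch below assumes the data entry used elsewhere in our examples; the axis and legend labels are ours.

```r
# Data as listed in the table above (row order is an assumption).
resp <- c( 0.188, -0.058, -0.280, -0.062,   # A1: M, M, F, F
           0.050, -0.138, -0.540, -0.336,   # A2
           0.058, -0.082, -0.260, -0.123,   # A3
          -0.712, -1.280, -0.894, -0.672,   # A4
          -0.610, -0.830, -0.837, -1.202)   # A5
bait <- factor(rep(paste0("A", 1:5), each = 4))
gend <- factor(rep(rep(c("M", "F"), each = 2), times = 5))

# The cell means are the points that the plot joins up.
cell_means <- tapply(resp, list(gend, bait), mean)
round(cell_means, 3)

# One line per gender across the five bait treatments.
interaction.plot(x.factor = bait, trace.factor = gend, response = resp,
                 xlab = "Bait treatment", ylab = "Mean change in intake (kg)",
                 trace.label = "Gender")
```

Each plotted point is the mean of just two pigs, so small departures from parallel lines are to be expected even without any real interaction.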

3. #### Calculate factor means

Using R:

`> tapply(resp,bait,mean)`

| A1 | A2 | A3 | A4 | A5 |
|---|---|---|---|---|
| -0.05300 | -0.24100 | -0.10175 | -0.88950 | -0.86975 |

`> tapply(resp,gend,mean)`

| F | M |
|---|---|
| -0.5206 | -0.3414 |

4. #### Carry out analysis of variance

Sums of squares can be calculated manually as follows:

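$$SS_{\text{total}} = 7.24235 - \frac{(-8.62)^2}{20} = 3.52713$$

$$SS_{\text{bait}} = \frac{(-0.212)^2}{4} + \frac{(-0.964)^2}{4} + \frac{(-0.407)^2}{4} + \frac{(-3.558)^2}{4} + \frac{(-3.479)^2}{4} - \frac{(-8.62)^2}{20} = 6.47567 - 3.71522 = 2.76045$$

$$SS_{\text{gender}} = \frac{(-5.206)^2}{10} + \frac{(-3.414)^2}{10} - \frac{(-8.62)^2}{20} = 3.87578 - 3.71522 = 0.16056$$

$$SS_{\text{subgrps}} = \frac{(-0.342)^2}{2} + \frac{(-0.876)^2}{2} + \dots + \frac{(-1.992)^2}{2} + \frac{(-1.440)^2}{2} - \frac{(-8.62)^2}{20} = 6.853895 - 3.71522 = 3.13867$$

$$SS_{A \times B} = 3.13867 - 2.76045 - 0.16056 = 0.21766$$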
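The same arithmetic can be checked in R. The sketch below follows the manual formulas step by step (totals squared, divided by the number of observations in each total, minus the correction factor); the object names are ours, and the error sum of squares is obtained by subtraction.

```r
# Data as listed in the table above (row order is an assumption).
resp <- c( 0.188, -0.058, -0.280, -0.062,   # A1: M, M, F, F
           0.050, -0.138, -0.540, -0.336,   # A2
           0.058, -0.082, -0.260, -0.123,   # A3
          -0.712, -1.280, -0.894, -0.672,   # A4
          -0.610, -0.830, -0.837, -1.202)   # A5
bait <- factor(rep(paste0("A", 1:5), each = 4))
gend <- factor(rep(rep(c("M", "F"), each = 2), times = 5))

cf <- sum(resp)^2 / length(resp)          # correction factor, (-8.62)^2 / 20

ss_total <- sum(resp^2) - cf
ss_bait  <- sum(tapply(resp, bait, sum)^2 / 4)  - cf   # 4 pigs per bait level
ss_gend  <- sum(tapply(resp, gend, sum)^2 / 10) - cf   # 10 pigs per gender
ss_sub   <- sum(tapply(resp, list(bait, gend), sum)^2 / 2) - cf  # 2 per cell
ss_int   <- ss_sub - ss_bait - ss_gend    # interaction, by subtraction
ss_error <- ss_total - ss_sub             # within-cell (residual) variation

round(c(total = ss_total, bait = ss_bait, gender = ss_gend,
        subgroups = ss_sub, interaction = ss_int, error = ss_error), 5)
```

The divisors 4, 10 and 2 are specific to this balanced design; with unequal cell sizes the shortcut formulas no longer partition the total in this simple way.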

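ANOVA table

| Source of variation | df | SS | MS | F-ratio | P |
|---|---|---|---|---|---|
| Bait (A) | 4 | 2.7605 | 0.6901 | 17.77 | 0.0001 |
| Gender (B) | 1 | 0.1606 | 0.1606 | 4.13 | 0.0694 |
| A × B | 4 | 0.2177 | 0.0544 | 1.401 | 0.3023 |
| Error | 10 | 0.3885 | 0.0388 | | |
| Total | 19 | 3.5271 | | | |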

Considering the interaction first, the interaction between the factors bait and gender is not significant (P = 0.30), even if we adopt a liberal critical value of 0.25. We note that we have very little power to detect this interaction - but we also have little or no evidence that it exists. As for the main factors, bait is significant whilst gender is borderline significant (P = 0.07).
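The mean squares, F-ratios and P-values follow directly from the sums of squares and degrees of freedom. As a rough check, the sketch below recomputes them from the rounded sums of squares given in the manual calculations, using pf() for the upper tail of the F distribution; the object names are ours.

```r
# Sums of squares and degrees of freedom from the manual calculations.
ss <- c(bait = 2.76045, gend = 0.16056, "bait:gend" = 0.21766)
df <- c(bait = 4, gend = 1, "bait:gend" = 4)
ss_error <- 0.38845
df_error <- 10

ms       <- ss / df                 # mean square for each term
ms_error <- ss_error / df_error     # error mean square
f_ratio  <- ms / ms_error           # each term is tested against the error MS
p_value  <- pf(f_ratio, df, df_error, lower.tail = FALSE)

round(cbind(df, ss, ms, f_ratio, p_value), 4)
```

Testing every term against the error mean square assumes both factors are fixed, which is how the analysis on this page treats them.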

In R there are two different ways to do this analysis of variance:

• Use lm() with the model defined as model=lm(resp~bait*gend); then use anova(model).
• Use aov() with the model defined as model=aov(resp~bait*gend); then use summary(model).

Below we have used the first of these methods:

Using R:

Analysis of Variance Table

Response: resp

| | Df | Sum Sq | Mean Sq | F value | Pr(>F) | |
|---|---|---|---|---|---|---|
| bait | 4 | 2.76045 | 0.69011 | 17.7658 | 0.0001538 | *** |
| gend | 1 | 0.16056 | 0.16056 | 4.1334 | 0.0694460 | . |
| bait:gend | 4 | 0.21766 | 0.05441 | 1.4008 | 0.3022677 | |
| Residuals | 10 | 0.38845 | 0.03885 | | | |

The result is of course identical to the manual calculations.
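For completeness, here are both routes side by side. This is a sketch under our data-entry assumptions (row order and level labels); because the design is balanced, the order of the terms in the formula does not change the sums of squares.

```r
# Data as listed in the table above (row order is an assumption).
resp <- c( 0.188, -0.058, -0.280, -0.062,   # A1: M, M, F, F
           0.050, -0.138, -0.540, -0.336,   # A2
           0.058, -0.082, -0.260, -0.123,   # A3
          -0.712, -1.280, -0.894, -0.672,   # A4
          -0.610, -0.830, -0.837, -1.202)   # A5
bait <- factor(rep(paste0("A", 1:5), each = 4))
gend <- factor(rep(rep(c("M", "F"), each = 2), times = 5))

# Route 1: fit with lm(), then ask for the ANOVA table.
model <- lm(resp ~ bait * gend)   # bait * gend = bait + gend + bait:gend
tab <- anova(model)
tab

# Route 2: fit with aov(), then summarise.
model_aov <- aov(resp ~ bait * gend)
summary(model_aov)
```

Both routes fit the same linear model; they differ only in how the result is stored and printed.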

5. #### Check diagnostics

We have chosen the easy option in R which is just to do plot(model) - this provides a range of diagnostic plots.

The residuals versus fitted plot (top left) shows residuals reasonably evenly spread along the line aside from three points of concern identified by numbers (14,18,19). The normal quantile plot (top right) shows errors are approximately normal, albeit not quite. The scale-location plot shows some evidence of a mean error-variance relationship, in other words heteroskedasticity. Note, Logan advises us, for ANOVA, to ignore Cook's D values.
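The four default diagnostic plots can be put on one page as sketched below. The observation numbers that R prints on the plots depend on the order in which the data were entered, which is an assumption in our version, so they need not match the numbers quoted above.

```r
# Data as listed in the table above (row order is an assumption).
resp <- c( 0.188, -0.058, -0.280, -0.062,   # A1: M, M, F, F
           0.050, -0.138, -0.540, -0.336,   # A2
           0.058, -0.082, -0.260, -0.123,   # A3
          -0.712, -1.280, -0.894, -0.672,   # A4
          -0.610, -0.830, -0.837, -1.202)   # A5
bait <- factor(rep(paste0("A", 1:5), each = 4))
gend <- factor(rep(rep(c("M", "F"), each = 2), times = 5))

model <- lm(resp ~ bait * gend)

# Residuals vs fitted, normal Q-Q, scale-location and a leverage plot
# arranged in a 2 x 2 grid.
op <- par(mfrow = c(2, 2))
plot(model)
par(op)   # restore the previous graphics settings

# With two replicates per cell, the two residuals in a cell are equal
# in size and opposite in sign.
round(residuals(model), 3)
```

Because each fitted value is the mean of only two pigs, the residuals come in mirror-image pairs, which limits how much these plots can reveal for the full model.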

6. #### Simplify model

Using R:

`> anova(model,model2)`

Analysis of Variance Table

Model 1: resp ~ bait * gend
Model 2: resp ~ bait + gend

| | Res.Df | RSS | Df | Sum of Sq | F | Pr(>F) |
|---|---|---|---|---|---|---|
| 1 | 10 | 0.38845 | | | | |
| 2 | 14 | 0.60611 | -4 | -0.21766 | 1.4008 | 0.3023 |

`> anova(model2)`

Analysis of Variance Table

Response: resp

| | Df | Sum Sq | Mean Sq | F value | Pr(>F) | |
|---|---|---|---|---|---|---|
| bait | 4 | 2.76045 | 0.69011 | 15.9403 | 4.132e-05 | *** |
| gend | 1 | 0.16056 | 0.16056 | 3.7087 | 0.07468 | . |
| Residuals | 14 | 0.60611 | 0.04329 | | | |

This analysis indicates that we can safely drop the interaction term to provide a more parsimonious model. We then recheck the reduced model's diagnostics.
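A sketch of this simplification step is given below, using update() to remove the interaction and anova() to compare the two nested models. The names model and model2 follow the output on this page; the data entry is our assumption.

```r
# Data as listed in the table above (row order is an assumption).
resp <- c( 0.188, -0.058, -0.280, -0.062,   # A1: M, M, F, F
           0.050, -0.138, -0.540, -0.336,   # A2
           0.058, -0.082, -0.260, -0.123,   # A3
          -0.712, -1.280, -0.894, -0.672,   # A4
          -0.610, -0.830, -0.837, -1.202)   # A5
bait <- factor(rep(paste0("A", 1:5), each = 4))
gend <- factor(rep(rep(c("M", "F"), each = 2), times = 5))

model  <- lm(resp ~ bait * gend)              # full model
model2 <- update(model, . ~ . - bait:gend)    # same model minus the interaction

# F test of the term that was dropped (full model listed first).
cmp <- anova(model, model2)
cmp

# ANOVA table for the reduced, additive model.
anova(model2)
```

Dropping the interaction pools its sum of squares and 4 degrees of freedom into the residual, which is why the F-ratios for bait and gender change slightly.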

7. #### Recheck diagnostics

These four plots indicate observation 14, and possibly 16 & 20, merit further attention. This does not imply they ought to be eliminated, but one should at least check the raw data - and think about why they may be telling you something different from their fellows.
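To see which observations stand out, the residuals of the reduced model can be listed alongside the raw data. This is a sketch; the row order used for data entry is our assumption, and it determines the observation numbers.

```r
# Data as listed in the table above (row order is an assumption).
resp <- c( 0.188, -0.058, -0.280, -0.062,   # A1: M, M, F, F
           0.050, -0.138, -0.540, -0.336,   # A2
           0.058, -0.082, -0.260, -0.123,   # A3
          -0.712, -1.280, -0.894, -0.672,   # A4
          -0.610, -0.830, -0.837, -1.202)   # A5
bait <- factor(rep(paste0("A", 1:5), each = 4))
gend <- factor(rep(rep(c("M", "F"), each = 2), times = 5))

model2 <- lm(resp ~ bait + gend)

# The same four diagnostic plots, now for the additive model.
op <- par(mfrow = c(2, 2))
plot(model2)
par(op)

# The three observations with the largest absolute residuals.
top3 <- order(abs(residuals(model2)), decreasing = TRUE)[1:3]
sort(top3)   # 14 16 20 under the row order assumed here
data.frame(obs = top3, bait = bait[top3], gend = gend[top3],
           resp = resp[top3], resid = round(residuals(model2)[top3], 3))
```

Listing the flagged rows next to their factor levels makes it easier to go back to the raw records and check them.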

One could then take the process further by first eliminating the gender effect (which is not significant) and then (possibly) combining the first 3 treatment levels (without poison) and the remaining 2 levels (with poison) to obtain the minimal adequate model.

But since the gender effect is so close to significance, we decide against this.
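For readers who want to explore the further simplification anyway, the sketch below shows how the bait levels could be collapsed into a two-level factor and the resulting model compared with the additive one. The factor name poison and the model name model3 are ours, and we assume that levels A4 and A5 are the two treatments containing 1080.

```r
# Data as listed in the table above (row order is an assumption).
resp <- c( 0.188, -0.058, -0.280, -0.062,   # A1: M, M, F, F
           0.050, -0.138, -0.540, -0.336,   # A2
           0.058, -0.082, -0.260, -0.123,   # A3
          -0.712, -1.280, -0.894, -0.672,   # A4
          -0.610, -0.830, -0.837, -1.202)   # A5
bait <- factor(rep(paste0("A", 1:5), each = 4))
gend <- factor(rep(rep(c("M", "F"), each = 2), times = 5))

# Hypothetical recoding: A4 and A5 contain 1080, A1 to A3 do not.
poison <- factor(ifelse(bait %in% c("A4", "A5"), "1080", "none"))
table(bait, poison)

model2 <- lm(resp ~ bait + gend)     # five bait levels
model3 <- lm(resp ~ poison + gend)   # bait collapsed to poison / no poison

# Does collapsing five levels to two significantly worsen the fit?
cmp <- anova(model3, model2)
cmp
```

A non-significant F in this comparison would suggest that the water and dye additions add little beyond the presence or absence of poison.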

8. #### Simplify model using AIC

An alternative approach to model simplification is to use the 'step' function in R. This uses the Akaike Information Criterion as the criterion for model selection.

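Using R:

Start: AIC=-58.83

resp ~ bait * gend

| | Df | Sum of Sq | RSS | AIC |
|---|---|---|---|---|
| `<none>` | | | 0.388 | -58.826 |
| - bait:gend | 4 | 0.218 | 0.606 | -57.929 |

`> anova(model,model2)`

Analysis of Variance Table

Model 1: resp ~ bait * gend
Model 2: resp ~ bait * gend

| | Res.Df | RSS | Df | Sum of Sq | F | Pr(>F) |
|---|---|---|---|---|---|---|
| 1 | 10 | 0.38845 | | | | |
| 2 | 10 | 0.38845 | 0 | 0.00000 | | |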

Using this approach, all terms remain in the model including the interaction term: dropping the interaction would raise the AIC from -58.83 to -57.93, so step() keeps it. This is because the information criterion is much more liberal about retaining explanatory variables.
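The step() run can be sketched as follows, together with a by-hand version of the AIC that step() reports for a linear model, n log(RSS/n) + 2p, where p is the number of estimated coefficients. The data entry and the helper name aic_by_hand are our assumptions.

```r
# Data as listed in the table above (row order is an assumption).
resp <- c( 0.188, -0.058, -0.280, -0.062,   # A1: M, M, F, F
           0.050, -0.138, -0.540, -0.336,   # A2
           0.058, -0.082, -0.260, -0.123,   # A3
          -0.712, -1.280, -0.894, -0.672,   # A4
          -0.610, -0.830, -0.837, -1.202)   # A5
bait <- factor(rep(paste0("A", 1:5), each = 4))
gend <- factor(rep(rep(c("M", "F"), each = 2), times = 5))

model <- lm(resp ~ bait * gend)

# Backward selection by AIC; the selected model is returned.
selected <- step(model)
formula(selected)

# Hypothetical helper: the AIC that step() uses, n*log(RSS/n) + 2*p.
aic_by_hand <- function(fit) {
  n   <- length(residuals(fit))
  rss <- sum(residuals(fit)^2)
  p   <- length(coef(fit))
  n * log(rss / n) + 2 * p
}
aic_by_hand(model)                      # full model
aic_by_hand(lm(resp ~ bait + gend))     # without the interaction
```

The penalty of 2 per extra coefficient is mild, so a term can be retained by AIC even though its F test is far from significant at the 5% level.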

 Except where otherwise specified, all text and images on this page are copyright InfluentialPoints, all rights reserved. Images not copyright InfluentialPoints credit their source on web-pages attached via hypertext links from those images.