Carry out analysis of covariance
Calculating sums of squares manually for an analysis of covariance can only be described as a pain in the backside, and one needs to be very organized! Values for the various sums and sums of squares can be obtained with R. We first calculate the sums of squares assuming a pooled regression of Y on X:
SS_{Total} = 582800 − 3200^{2}/18 = 13911.11

SS_{Treatment} = [990^{2} + 995^{2} + 1215^{2}]/6 − 3200^{2}/18 = 5502.8

SS_{PooledReg} = [632800 − (3530 × 3200)/18]^{2} / [701900 − 3530^{2}/18] = 2856.8

SS_{Error} = 13911.11 − 2856.8 − 5502.8 = 5551.5
  
Sums of squares table

Treatment    Σx     Σy     Σx^{2}   Σy^{2}   Σxy      ΣxΣy
1            1190   990    238750   167600   199150   1178100
2            1140   995    219900   166275   188900   1134300
3            1200   1215   243250   248925   244750   1458000
Overall      3530   3200   701900   582800   632800   11296000
  
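The pooled sums of squares above can be reproduced in R from the overall totals in the table. This is only a sketch working from the summary sums, not from the raw data:

```r
# Overall totals from the sums of squares table (n = 18 animals, 6 per diet)
n      <- 18
sum.x  <- 3530;   sum.y  <- 3200
sum.x2 <- 701900; sum.y2 <- 582800; sum.xy <- 632800
diet.y <- c(990, 995, 1215)             # treatment totals of the response

ss.total     <- sum.y2 - sum.y^2 / n                    # 13911.11
ss.treatment <- sum(diet.y^2) / 6 - sum.y^2 / n         # 5502.78
ss.pooledreg <- (sum.xy - sum.x * sum.y / n)^2 /
                (sum.x2 - sum.x^2 / n)                  # 2856.75
ss.error     <- ss.total - ss.pooledreg - ss.treatment  # 5551.58
```

Working at full precision gives 5551.58 for the error term; subtracting the rounded values, as in the manual calculation, gives 5551.5.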
Next calculate the regression statistics for each treatment separately:
SS_{Tot1} = 167600 − 990^{2}/6 = 4250

SS_{XY1} = 199150 − (1190 × 990)/6 = 2800

SS_{X1} = 238750 − 1190^{2}/6 = 2733.33

SS_{Reg1} = 2800^{2}/2733.33 = 2868.296

SS_{Error1} = 4250 − 2868.296 = 1381.704

b_{1} = 2800/2733.33 = 1.0244
  
SS_{Tot2} = 166275 − 995^{2}/6 = 1270.833

SS_{XY2} = 188900 − (1140 × 995)/6 = −150

SS_{X2} = 219900 − 1140^{2}/6 = 3300

SS_{Reg2} = (−150)^{2}/3300 = 6.818

SS_{Error2} = 1270.833 − 6.818 = 1264.015

b_{2} = −150/3300 = −0.04545
  
SS_{Tot3} = 248925 − 1215^{2}/6 = 2887.5

SS_{XY3} = 244750 − (1200 × 1215)/6 = 1750

SS_{X3} = 243250 − 1200^{2}/6 = 3250

SS_{Reg3} = 1750^{2}/3250 = 942.308

SS_{Error3} = 2887.5 − 942.308 = 1945.192

b_{3} = 1750/3250 = 0.5385
  
Summed regression statistics:

SS_{Tot} = 4250 + 1270.833 + 2887.5 = 8408.333

SS_{XY} = 2800 − 150 + 1750 = 4400

SS_{X} = 2733.33 + 3300 + 3250 = 9283.33

SS_{Reg} = 2868.296 + 6.818 + 942.308 = 3817.422

SS_{Error} = 1381.704 + 1264.015 + 1945.192 = 4590.911

b_{common} = 4400/9283.33 = 0.4740
  
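The treatment-by-treatment arithmetic above vectorizes neatly in R; again this is a sketch working from the table's summary values rather than the raw data:

```r
# Per-treatment sums from the table: Σx, Σy, Σx², Σy², Σxy
sx  <- c(1190, 1140, 1200);  sy <- c(990, 995, 1215)
sx2 <- c(238750, 219900, 243250)
sy2 <- c(167600, 166275, 248925)
sxy <- c(199150, 188900, 244750)
n   <- 6

ss.tot <- sy2 - sy^2 / n        # 4250.000 1270.833 2887.500
ss.xy  <- sxy - sx * sy / n     #     2800     -150     1750
ss.x   <- sx2 - sx^2 / n        # 2733.333 3300.000 3250.000
b      <- ss.xy / ss.x          # slopes: 1.0244 -0.0455 0.5385
ss.reg <- ss.xy^2 / ss.x        # 2868.296    6.818  942.308
ss.err <- ss.tot - ss.reg       # 1381.704 1264.015 1945.192

b.common      <- sum(ss.xy) / sum(ss.x)    # 0.474
ss.reg.common <- sum(ss.xy)^2 / sum(ss.x)  # 2085.5
ss.het        <- sum(ss.reg) - ss.reg.common  # 1732.0
```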
SS_{Reg (common slope)} = 4400^{2}/9283.33 = 2085.465

SS_{Heterogeneity of slopes} = 3817.422 − 2085.465 = 1731.957
Model with interaction (maximal model, different slopes)

ANOVA table

Source of variation        df   SS         MS      F ratio   P
Diet                        2   5502.8
Pre                         1   2085.5
Heterogeneity of slopes     2   1732.0     866.0   2.2635    0.1465
Residual                   12   4590.9     382.6
Total                      17   13911.11
  
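The F ratio and P value for the heterogeneity-of-slopes line can be checked in R with the F distribution function pf():

```r
ms.het <- 1732.0 / 2    # MS = SS/df = 866.0
ms.res <- 4590.9 / 12   # 382.6
f <- ms.het / ms.res    # 2.2635
p <- pf(f, df1 = 2, df2 = 12, lower.tail = FALSE)  # 0.1465
```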
Interpretation here is somewhat problematical. At the P = 0.05 level the interaction is not significant; in other words, we can accept the lines as parallel. However, this is in direct contradiction to our visual assessment of the relationship between the response variable and the covariate, where we lack evidence of any regression relationship at all between 'post' and 'pre' for treatment B, let alone a similar slope to that found for treatments A and C. The apparent contradiction may result from the lack of power of tests for interaction, or it may result from random variation in treatment B. Which is the case is a matter of judgement (or guesswork!), but for the sake of the example we will assume it results from random variation.
When we come to doing it in R, we run into the same 'challenge' that we did when analyzing unbalanced factorial designs. Because the levels of the covariate are different for each of the three treatments, we no longer have a balanced or orthogonal design. For a non-orthogonal design, it matters which variable is entered first. To be more precise, the order affects the sums of squares and significance levels of the main effects and of all but the highest-level interaction effect. It does not, however, affect the residual SS nor the parameter estimates and standard errors. If you want results identical to the manual version, then you have to enter the (nominal) treatment factor first, followed by the (continuous) covariate.
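The order effect is easy to demonstrate with base R's anova(), which reports sequential (Type I) sums of squares. The data frame below is simulated purely to illustrate the point; it is not the original diet data:

```r
set.seed(42)
# Simulated, non-orthogonal illustration only -- NOT the diet data
d <- data.frame(diet = gl(3, 6, labels = c("A", "B", "C")),
                pre  = round(rnorm(18, mean = 200, sd = 15)))
d$post <- 150 + 0.5 * d$pre + c(0, 0, 35)[d$diet] + rnorm(18, sd = 10)

a1 <- anova(lm(post ~ diet + pre, data = d))  # factor entered first
a2 <- anova(lm(post ~ pre + diet, data = d))  # covariate entered first

# The sequential SS for each term differ between the two orders,
# but the residual SS (and the parameter estimates) are identical
a1["diet", "Sum Sq"]; a2["diet", "Sum Sq"]
a1["Residuals", "Sum Sq"]; a2["Residuals", "Sum Sq"]
```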
The alternative is to use Type II sums of squares, which are calculated according to the principle of marginality. Adjustments to the sums of squares are made for the other main effects in the statistical model, but not for higher-level interaction effects.
Irrespective of these considerations, we still have to decide what to do about the interaction. Given
that it is not significant at P=0.05, the conventional approach would be to drop the interaction and
assume parallel lines. We do this below and estimate adjusted means.
Model without interaction (parallel lines)
The diet factor remains significant (P = 0.0125), but 'pre' is becoming very borderline with a P value of 0.0496. If we do a Type II ANOVA using the R car package (so that order has no effect), we get exactly the same P value for 'pre' but a slightly higher value for the diet factor (P = 0.0200); we leave you to check this.
Adjusted means
The adjusted means (Ȳ′) are given by Ȳ′_{i} = Ȳ_{i} − b_{common}(X̄_{i} − X̄):

Ȳ′_{1} = 165 − 0.474 (198.333 − 196.111) = 163.947
Ȳ′_{2} = 165.833 − 0.474 (190 − 196.111) = 168.730
Ȳ′_{3} = 202.5 − 0.474 (200 − 196.111) = 200.657
These, then, are the adjusted means post-ANCOVA. Post hoc comparison of means reveals that the level with diet a3 is significantly higher than with either of the other two diets.
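The adjusted means can be computed in one line in R from the treatment means and the common slope:

```r
b    <- 4400 / 9283.33            # common slope, 0.474
ybar <- c(990, 995, 1215) / 6     # treatment means of 'post'
xbar <- c(1190, 1140, 1200) / 6   # treatment means of 'pre'
adj  <- ybar - b * (xbar - 3530/18)  # adjust to the overall mean of 'pre'
round(adj, 3)                     # 163.947 168.730 200.657
```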
So, in conclusion, was it worthwhile (or even valid) to do a covariance analysis? In this case we would say no, simply because the authors failed to demonstrate a clear relationship between pre and post values, which is a precondition for using covariance analysis. In the event it had little effect on the outcome, other than giving oneself a great deal more work!