Principles
Analysis of covariance (ANCOVA) combines the techniques of analysis of variance and regression by incorporating both nominal variables (factors) and continuous measurement variables (covariates) into a single model. We will only look in detail at its use in the completely randomized design, but it can be used with all the designs that we have covered including the randomized block, Latin square, factorial and repeated measures designs.
The primary use of ANCOVA is to increase precision in randomized experiments. A covariate X is measured on each experimental unit before treatment is applied. That covariate may be the baseline level of the response variable, or it may be some other characteristic of the experimental unit that is expected to affect the outcome. The treatment means are then adjusted to remove the initial differences, which reduces the experimental error and permits a more precise comparison among treatments. Such adjustment depends on the parallel slopes assumption, namely that the slope of the relationship between the covariate and the response variable is the same for each treatment level.
The other main use of ANCOVA is to model relationships, especially where one wants to compare regression relationships at different levels of some treatment variable, for example growth rates (size against age). In this use the aim is to assess which model is most appropriate to describe the data: separate slopes for each treatment level, a common slope but different intercepts, or a common intercept but different slopes. This use clearly does not depend on the parallel slopes assumption (that is simply one of the model simplifications that can be made if justified), and because of this some authorities do not include such an analysis under the name 'analysis of covariance'.
A third, somewhat controversial, use of ANCOVA is to adjust for sources of bias in observational studies.
Models
One way ANCOVA
Maximal model (separate regression lines)
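The equation for this model is not reproduced here. As a sketch in generic notation (the symbols are assumptions, not necessarily those of the original), a separate-lines model gives each treatment level i its own intercept \alpha_i and its own slope \beta_i:

```latex
Y_{ij} = \alpha_i + \beta_i X_{ij} + \varepsilon_{ij},
\qquad i = 1, \dots, a, \quad j = 1, \dots, n
```

Here Y_{ij} and X_{ij} are the response and the covariate for replicate j of treatment level i, and \varepsilon_{ij} is the random error.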
Parallel lines model
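The equation for this model is not reproduced here. As a sketch in generic notation (the symbols are assumptions, not necessarily those of the original), a parallel-lines model keeps a separate intercept \alpha_i for each treatment level i but a single slope \beta shared by all levels:

```latex
Y_{ij} = \alpha_i + \beta X_{ij} + \varepsilon_{ij},
\qquad i = 1, \dots, a, \quad j = 1, \dots, n
```

Differences between the intercepts \alpha_i then represent treatment differences at any fixed value of the covariate.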
Computational formulae
We will take a completely randomized experimental design with 'a' group (= treatment) levels, each
replicated n times. A response variable Y and a covariate X are measured on each experimental unit.
Group (treatment) totals are denoted as T_{1} to T_{a}, and the grand total as G.
Step 1. Overall pooled regression
The total, treatment, regression and residual sums of squares using a pooled overall regression are calculated as follows:
SS_{Total} = ΣY_{ij}^{2} − G^{2}/N
where SS_{Total} is the total sums of squares, Y_{ij} are the individual observations, G is the grand total (ΣY), and N is the total number of observations.
SS_{Treatment} = Σ(T_{i}^{2})/n − G^{2}/N
where SS_{Treatment} is the treatment sums of squares, T_{i} are the treatment totals, n is the number of replicates per treatment, G is the grand total (ΣY), and N is the total number of observations.
SS_{Regression} = [ΣXY − (ΣX)(ΣY)/N]^{2} / [ΣX_{ij}^{2} − (ΣX)^{2}/N]
where SS_{Regression} is the sums of squares explained by the regression, and N is the total number of observations.
SS_{Error} = SS_{Total} − SS_{Regression}
where SS_{Error} is the error or residual sums of squares.
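The Step 1 formulae can be sketched in plain Python as follows. This is only an illustration: the function name pooled_regression_ss, the input layout and the small data set are assumptions made for the example, not part of the original text.

```python
def pooled_regression_ss(groups):
    """Sums of squares for one overall regression pooled across treatments.

    `groups` is a list of (x_values, y_values) pairs, one pair per
    treatment level, each holding the same number of replicates n.
    """
    n = len(groups[0][1])                      # replicates per treatment
    xs = [x for gx, _ in groups for x in gx]   # all covariate values
    ys = [y for _, gy in groups for y in gy]   # all response values
    N = len(ys)                                # total number of observations
    G = sum(ys)                                # grand total of Y
    correction = G ** 2 / N                    # G^2 / N

    ss_total = sum(y * y for y in ys) - correction
    ss_treatment = sum(sum(gy) ** 2 for _, gy in groups) / n - correction
    # Numerator and denominator of SS_Regression
    sp_xy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * G / N
    ss_x = sum(x * x for x in xs) - sum(xs) ** 2 / N
    ss_regression = sp_xy ** 2 / ss_x
    ss_error = ss_total - ss_regression
    return {
        "ss_total": ss_total,
        "ss_treatment": ss_treatment,
        "ss_regression": ss_regression,
        "ss_error": ss_error,
    }


# Made-up illustrative data: two treatments, four replicates each.
example_groups = [
    ([1, 2, 3, 4], [2, 4, 5, 7]),
    ([2, 3, 4, 5], [5, 6, 8, 9]),
]
print(pooled_regression_ss(example_groups))
```

The function assumes equal replication, as in the design described above; unequal group sizes would need each T_{i}^{2} divided by its own group size.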

Step 2. Individual regressions
Regression statistics for each treatment level are calculated as follows (shown here for the first treatment level).
SS_{Total (1)} = ΣY_{1}^{2} − (G_{1})^{2}/n
where SS_{Total (1)} is the total sums of squares, Y_{1} are the individual observations, G_{1} is the total (ΣY_{1}) and n is the number of observations for the first treatment level.
SS_{XY (1)} = ΣX_{1}Y_{1} − (ΣX_{1})(ΣY_{1})/n
where SS_{XY (1)} is the covariance sums of squares (the sum of cross products), and n is the number of observations for the first treatment level.
SS_{X (1)} = ΣX_{1}^{2} − (ΣX_{1})^{2}/n
where SS_{X (1)} is the sums of squares for X_{1}, and n is the number of observations for the first treatment level.
SS_{Regression (1)} = (SS_{XY (1)})^{2} / SS_{X (1)}
where SS_{Regression (1)} is the regression sums of squares for the first treatment level.
SS_{Error (1)} = SS_{Total (1)} − SS_{Regression (1)}
where SS_{Error (1)} is the error or residual sums of squares for the first treatment level.
b_{1} = SS_{XY (1)} / SS_{X (1)}
where b_{1} is the slope of the regression line for the first treatment level.
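The per-treatment calculations can be sketched in Python as below. The function name group_regression and the example numbers are illustrative assumptions, not part of the original text.

```python
def group_regression(x, y):
    """Regression statistics for a single treatment level.

    `x` and `y` are equal-length lists of covariate and response values.
    """
    n = len(y)
    ss_total = sum(v * v for v in y) - sum(y) ** 2 / n
    ss_xy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    ss_x = sum(v * v for v in x) - sum(x) ** 2 / n
    ss_regression = ss_xy ** 2 / ss_x
    ss_error = ss_total - ss_regression
    slope = ss_xy / ss_x
    return {
        "ss_total": ss_total,
        "ss_xy": ss_xy,
        "ss_x": ss_x,
        "ss_regression": ss_regression,
        "ss_error": ss_error,
        "slope": slope,
    }


# Made-up data for one treatment level with four replicates.
print(group_regression([1, 2, 3, 4], [2, 4, 5, 7]))
```

Calling the function once per treatment level gives the quantities that are summed in the next step.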

Step 3. Summed regression statistics
The regression statistics for each treatment level are summed as follows.
SS_{Total} = SS_{Total (1)} + SS_{Total (2)} + ... + SS_{Total (a)}
SS_{XY} = SS_{XY (1)} + SS_{XY (2)} + ... + SS_{XY (a)}
SS_{X} = SS_{X (1)} + SS_{X (2)} + ... + SS_{X (a)}
SS_{Reg (summed)} = SS_{Reg (1)} + SS_{Reg (2)} + ... + SS_{Reg (a)}
SS_{Error} = SS_{Error (1)} + SS_{Error (2)} + ... + SS_{Error (a)}
b_{common} = SS_{XY} / SS_{X}
SS_{Reg (common slope)} = (SS_{XY})^{2} / SS_{X}

Step 4. Assess heterogeneity of slopes
If the slope of the relationship between X and Y were the same at each treatment level, SS_{Reg (common slope)} would be the same as SS_{Reg (summed)}. The difference between the two is a measure of the heterogeneity of slopes.
Hence SS_{Heterogeneity of slopes} = SS_{Reg (summed)} − SS_{Reg (common slope)}
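The summed statistics and the heterogeneity sums of squares can be sketched together in Python. The function names and the example data are illustrative assumptions, not part of the original text.

```python
def _within_group_sums(x, y):
    """Return (SS_XY, SS_X) for one treatment level."""
    n = len(y)
    ss_xy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    ss_x = sum(v * v for v in x) - sum(x) ** 2 / n
    return ss_xy, ss_x


def heterogeneity_of_slopes(groups):
    """Common slope and heterogeneity-of-slopes sums of squares.

    `groups` is a list of (x_values, y_values) pairs, one per treatment.
    """
    per_group = [_within_group_sums(x, y) for x, y in groups]
    ss_xy = sum(sxy for sxy, _ in per_group)        # summed SS_XY
    ss_x = sum(sx for _, sx in per_group)           # summed SS_X
    # Sum of the separate regression SS, one per treatment level
    ss_reg_summed = sum(sxy ** 2 / sx for sxy, sx in per_group)
    b_common = ss_xy / ss_x
    ss_reg_common = ss_xy ** 2 / ss_x
    return {
        "b_common": b_common,
        "ss_reg_summed": ss_reg_summed,
        "ss_reg_common": ss_reg_common,
        "ss_heterogeneity": ss_reg_summed - ss_reg_common,
    }


# Made-up illustrative data: two treatments, four replicates each.
example_groups = [
    ([1, 2, 3, 4], [2, 4, 5, 7]),
    ([2, 3, 4, 5], [5, 6, 8, 9]),
]
print(heterogeneity_of_slopes(example_groups))
```

With these made-up numbers the separate slopes are 1.6 and 1.4, so the heterogeneity sums of squares is small relative to the regression sums of squares.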

Mean squares are obtained by dividing sums of squares by their respective degrees of freedom. The significance test is carried out by dividing the mean square regression by the mean square error. Under a null hypothesis of a zero slope this F-ratio will be distributed as F with 1 and n − 2 degrees of freedom.
Source of variation    Regression                Error                       Total
df                     1                         n − 2                       n − 1
SS                     SS_{Reg}                  SS_{Error}                  SS_{Total}
MS                     MS_{Reg} (s^{2}_{Ŷ})      MS_{Error} (s^{2}_{Y.X})
F-ratio                MS_{Reg} / MS_{Error}
P value
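The mean squares and F-ratio in the table can be sketched in Python for a regression fitted to n observations. The function name regression_f_ratio and the input values are illustrative assumptions.

```python
def regression_f_ratio(ss_regression, ss_error, n):
    """F-ratio for a regression on n observations, with its degrees of freedom."""
    df_regression = 1
    df_error = n - 2
    ms_regression = ss_regression / df_regression   # MS_Reg
    ms_error = ss_error / df_error                  # MS_Error
    return ms_regression / ms_error, df_regression, df_error


# Made-up sums of squares for a regression on four observations.
f_ratio, df1, df2 = regression_f_ratio(12.8, 0.2, 4)
print(f_ratio, df1, df2)
```

The P value could then be taken from the upper tail of the F distribution, for example with scipy.stats.f.sf(f_ratio, df1, df2) if SciPy is available.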

Adjusted means
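Ȳ'_{1} = Ȳ_{1} − b (X̄_{1} − X̄)
where Ȳ'_{1} is the adjusted mean for the first treatment level, Ȳ_{1} and X̄_{1} are the unadjusted means of the response and of the covariate for that level, X̄ is the overall mean of the covariate, and b is the common slope (b_{common}).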
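A minimal Python sketch of the usual adjusted-mean calculation follows. It assumes that b is the common within-treatment slope (SS_{XY} / SS_{X} summed over treatment levels) and that each treatment mean is adjusted to the overall mean of the covariate. The function name and data are illustrative.

```python
def adjusted_means(groups):
    """Treatment means of Y adjusted to the overall covariate mean.

    `groups` is a list of (x_values, y_values) pairs, one per treatment.
    """
    ss_xy = 0.0
    ss_x = 0.0
    for x, y in groups:
        n = len(y)
        ss_xy += sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
        ss_x += sum(v * v for v in x) - sum(x) ** 2 / n
    b_common = ss_xy / ss_x                      # common slope

    all_x = [v for x, _ in groups for v in x]
    grand_mean_x = sum(all_x) / len(all_x)       # overall covariate mean

    return [
        sum(y) / len(y) - b_common * (sum(x) / len(x) - grand_mean_x)
        for x, y in groups
    ]


# Made-up illustrative data: two treatments, four replicates each.
example_groups = [
    ([1, 2, 3, 4], [2, 4, 5, 7]),
    ([2, 3, 4, 5], [5, 6, 8, 9]),
]
print(adjusted_means(example_groups))
```

In this made-up example the unadjusted means are 4.5 and 7.0; because the second treatment has the higher covariate mean, adjustment narrows the difference between them.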
Dealing with nonparallel lines
Pick-a-point approach
For example, if we have a treatment (A) with two levels and a continuous covariate (B), the effect of the treatment at a chosen value θ of the covariate is estimated as:
Effect size (b_{1 | B=θ}) = b_{1} + b_{3}θ
where b_{1} is the coefficient for (A), b_{3} is the coefficient for the interaction (A × B), and θ is the chosen value of the covariate (B). Then estimate the standard error of the effect size:
Standard error of effect size (s_{b1 | B=θ}) = √(s^{2}_{b1} + 2θs^{2}_{b1b3} + θ^{2}s^{2}_{b3})
where s^{2}_{b1} and s^{2}_{b3} are the variances of the two coefficients and s^{2}_{b1b3} is their covariance.
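The pick-a-point calculation can be sketched in Python as below. It assumes the coefficient estimates, their variances and their covariance have already been taken from a fitted model; the function name, argument names and numbers are illustrative.

```python
import math


def pick_a_point(b1, b3, var_b1, var_b3, cov_b1_b3, theta):
    """Treatment effect and its standard error at covariate value theta.

    b1: coefficient for the treatment (A)
    b3: coefficient for the interaction (A x B)
    var_b1, var_b3: variances of b1 and b3
    cov_b1_b3: covariance between b1 and b3
    """
    effect = b1 + b3 * theta
    variance = var_b1 + 2 * theta * cov_b1_b3 + theta ** 2 * var_b3
    return effect, math.sqrt(variance)


# Made-up coefficient estimates and (co)variances, evaluated at theta = 4.
effect, se = pick_a_point(2.0, 0.5, 0.25, 0.01, -0.05, 4.0)
print(effect, se)
```

Repeating the call over several values of theta shows how the estimated treatment effect and its precision change along the covariate.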
Assumptions
 ANOVA assumptions
Observations are independent of one another. Residuals are randomly and normally distributed. Variances are homogeneous between groups.
 Regression assumptions
The relationship between Y and X must be linear for each treatment group (although some forms of nonlinearity can be dealt with by including a polynomial term as an extra covariate). In addition, errors (deviations from the fitted lines) must be independent of the values of X and normally distributed.
 Specific ANCOVA assumptions
The model assumes that the covariate is independent of the treatment effect. In other words, the distribution of values of the covariate should be the same at each treatment level or, more importantly, the (parametric) mean value of the covariate should be the same for each group. A further specific (but optional) assumption is homogeneity of slopes. It is optional because it is only required to simplify the model for estimation of adjusted means.