"It has long been an axiom of mine that the little things are infinitely the most important"
Coefficient of variation: Use and misuse
(intra-assay and inter-assay, within subject, temporal variability)
Statistics courses, especially for biologists, assume formulae = understanding and teach how to do statistics, but largely ignore what those procedures assume, and how their results mislead when those assumptions are unreasonable. The resulting misuse is, shall we say, predictable...
Use and Misuse
The coefficient of variation of the observations is used to describe the level of variability within a population independently of the absolute values of the observations. If absolute values are similar, populations can be compared using their standard deviations. But if they differ markedly (for example, the weights of mice and elephants), or are of different variables (for example, weight and height), then you need to use a standardized measure - such as the coefficient of variation. The coefficient of variation (CV) for a sample is the standard deviation of the observations divided by the mean. The most common use of the coefficient of variation is to assess the precision of a technique. It is also used as a measure of variability when the standard deviation is proportional to the mean, and as a means to compare variability of measurements made in different units.
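As a minimal sketch of that definition (sample standard deviation divided by the mean), the following Python computes the CV for two sets of weights. The weights and the function name `coefficient_of_variation` are invented for illustration and do not come from any study.

```python
import statistics


def coefficient_of_variation(values):
    """Sample CV: sample standard deviation (n - 1 denominator) over the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean


# Hypothetical weights, invented for illustration only.
mouse_weights_g = [18.0, 20.0, 22.0, 19.0, 21.0]
# Same relative spread as the mice, on a much larger scale and in other units.
elephant_weights_kg = [w * 250.0 for w in mouse_weights_g]

sd_mice = statistics.stdev(mouse_weights_g)
sd_elephants = statistics.stdev(elephant_weights_kg)
cv_mice = coefficient_of_variation(mouse_weights_g)
cv_elephants = coefficient_of_variation(elephant_weights_kg)

print(f"SD: mice {sd_mice:.2f} g, elephants {sd_elephants:.2f} kg")
print(f"CV: mice {cv_mice:.1%}, elephants {cv_elephants:.1%}")
```

The two standard deviations differ by a factor of 250 and cannot be compared directly, whereas the two CVs are the same because the relative spread is the same.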
Veterinary microbiologists seem to be especially keen on using the coefficient of variation of the observations as a measure of repeatability. A common misuse is that only repeatability is assessed, when in fact an assessment of validity is also required. There is no point being able to reliably get the same incorrect answer over and over again. It is true that validity is usually much harder to assess than repeatability, but that does not mean that only the latter should be considered. Another misuse is to quote CV values - and then ignore them. This reflects the predilection for only assessing outcome in terms of the mean (or median), rather than also considering effects on levels of variability.
Even where they are commented on, some workers do not follow accepted conventions on what a 'good' level of repeatability is. Inappropriate or unspecified methods are often used to estimate the within-subject coefficient of variation. Another problem is that often very little information is given on how the coefficient of variation is estimated, so that its reliability cannot be assessed. Lastly, we found that some veterinary researchers only estimated intra-assay and inter-assay coefficients of variation after excluding 'outliers', apparently just to bring the coefficient of variation down to acceptable levels. This seems to rather defeat the whole point of assessing variability!
Other uses (and misuses) of the coefficient of variation are many and varied, and we meet some of these in the ecological and wildlife examples. The coefficient of variation is underused (rather than overused) as a measure of temporal or spatial variability. Some researchers still use standard deviations for variables where the standard deviation is directly proportional to the mean - instead such variables should be log transformed, or alternatively the coefficient of variation used to describe variability. We have included a few examples of its correct use for these purposes. We have also included a couple of examples of the coefficient of variation of the mean (standard error/mean) in the wildlife section.
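The two points in this paragraph can be sketched in Python: when the standard deviation is proportional to the mean, the raw standard deviations are not comparable but the CV and the standard deviation of the log-transformed values are, and the coefficient of variation of the mean is simply standard error/mean. The counts and function names below are hypothetical, chosen only to illustrate the arithmetic.

```python
import math
import statistics


def cv_of_observations(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


def sd_of_logs(values):
    """Standard deviation of the natural-log-transformed values."""
    return statistics.stdev([math.log(v) for v in values])


def cv_of_mean(values):
    """Standard error of the mean divided by the mean."""
    standard_error = statistics.stdev(values) / math.sqrt(len(values))
    return standard_error / statistics.mean(values)


# Hypothetical counts at a sparse site, and a dense site with ten times the
# abundance but the same relative variability (SD proportional to mean).
sparse_site = [12.0, 15.0, 9.0, 14.0, 10.0, 12.0, 11.0, 13.0, 12.0]
dense_site = [v * 10.0 for v in sparse_site]

for name, counts in (("sparse", sparse_site), ("dense", dense_site)):
    print(
        f"{name}: SD={statistics.stdev(counts):.2f} "
        f"CV={cv_of_observations(counts):.3f} "
        f"SD(log)={sd_of_logs(counts):.3f} "
        f"CV of mean={cv_of_mean(counts):.3f}"
    )
```

Only the raw standard deviation changes between the two sites. Note also that the CV of the mean shrinks with sample size (here it is the CV of the observations divided by the square root of 9), so the two statistics should not be confused.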
What the statisticians say
Sokal & Rohlf (1995) and Zar (1999) provide basic accounts of the coefficient of variation. Diamandis & Christopoulos (1996) detail how the coefficient of variation is used for assessing precision in immunoassays, Snedecor & Cochran (1989) look at its uses for assessing variability in agricultural experiments, whilst Simpson et al. (1960) examine its use for morphological measurements. Krebs (1999) discusses the use of the coefficient of variation for measuring temporal variability. He emphasizes that it is only appropriate when the slope of Taylor's Power Law is equal to 2 (i.e. the standard deviation is proportional to the mean).
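Taylor's Power Law relates variance to mean as variance = a × mean^b, so the slope b can be estimated by regressing log(variance) on log(mean) across several series. The sketch below does this with an ordinary least-squares fit. The function name `taylor_slope` and the count series are hypothetical, and the data are constructed so that the standard deviation is exactly proportional to the mean.

```python
import math
import statistics


def taylor_slope(series_list):
    """Least-squares slope of log(variance) on log(mean) across the series."""
    xs = [math.log(statistics.mean(s)) for s in series_list]
    ys = [math.log(statistics.variance(s)) for s in series_list]
    x_bar = statistics.mean(xs)
    y_bar = statistics.mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical counts over five occasions, rescaled to mimic four populations
# of different abundance with the same relative variability.
base_series = [4.0, 6.0, 5.0, 7.0, 3.0]
populations = [[v * scale for v in base_series] for scale in (1.0, 3.0, 10.0, 30.0)]

slope = taylor_slope(populations)
print(f"Estimated Taylor's Power Law slope: {slope:.3f}")
```

With a slope of 2 the variance scales with the square of the mean, which is the same as saying the standard deviation is proportional to the mean and the CV is constant. Real data will rarely give exactly 2, so how close is close enough remains a judgment the analyst has to make.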
Bland & Altman (1996) explain the log method for calculating the within-subject coefficient of variation. Shoukri et al. (2006) investigate the validity of its normal approximation confidence interval, and Liu et al. (2006) provide exact confidence bounds for the statistic. McLaughlin et al. (1998) assess the value of the coefficient of variation in assessing reproducibility of ECG measurements. McArdle et al. (1990) and Gaston & McArdle (1994) explain why the coefficient of variation is the best measure of the variability of population size over time if there are zeros in the data. Eberhardt (1978) discusses use of the coefficient of variation in appraising variability in population studies, whilst Patel et al. (2001) provide a more recent assessment of its use for assessing variability in agricultural experiments. Bedeian (2000) and Sørensen (2002) review the use and misuse of the coefficient of variation to compare diversity in the social sciences.
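For readers unfamiliar with the log method, here is a rough sketch of one commonly described version of it: log-transform the replicate measurements, estimate the within-subject standard deviation on the log scale, then take the antilog and subtract one. The exact steps are an assumption here rather than something taken from the text, so check Bland & Altman (1996) before relying on it. The function name and the data are hypothetical.

```python
import math
import statistics


def within_subject_cv_log(measurements_by_subject):
    """Within-subject CV from log-transformed replicates (assumed formulation).

    Each subject must have at least two replicates and all values must be
    positive. Averaging the per-subject variances assumes every subject has
    the same number of replicates.
    """
    counts = {len(replicates) for replicates in measurements_by_subject}
    if len(counts) != 1:
        raise ValueError("this sketch assumes equal replicates per subject")
    log_variances = []
    for replicates in measurements_by_subject:
        logs = [math.log(v) for v in replicates]
        log_variances.append(statistics.variance(logs))
    # Within-subject standard deviation on the log scale.
    s_w = math.sqrt(statistics.mean(log_variances))
    # Back-transform to a proportion of the mean.
    return math.exp(s_w) - 1.0


# Hypothetical duplicate measurements on four subjects; the second reading is
# always 10% above the first, whatever the subject's level.
duplicates = [[5.0, 5.5], [20.0, 22.0], [80.0, 88.0], [300.0, 330.0]]

cv_within = within_subject_cv_log(duplicates)
print(f"Within-subject CV (log method sketch): {cv_within:.1%}")
```

Because the calculation is done on the log scale, subjects with very different levels contribute equally, which is the appeal of the approach when measurement error is proportional to the mean.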
Wikipedia provides a section on the coefficient of variation. Martin Bland provides an excellent discussion on different ways to calculate the within-subject coefficient of variation. The National Center for Health Statistics gives a short account of the relative standard error (= coefficient of variation of the mean). Poultry Health Services covers the intra- and interclass CVs for assay quality control, whilst Will Hopkins covers the intra-class coefficient of variation in relation to 'sports science'.