"It has long been an axiom of mine that the little things are infinitely the most important" (Sherlock Holmes)



Z scores: Use & misuse

(standard deviations, reference population, shape of distribution, weight-for-age, validity)

Statistics courses, especially for biologists, assume formulae = understanding and teach how to do statistics, but largely ignore what those procedures assume, and how their results mislead when those assumptions are unreasonable. The resulting misuse is, shall we say, predictable...

Use and Misuse

A Z-score (or standard score) represents how many standard deviations a given measurement deviates from the mean. In other words it merely re-scales, or standardizes, your data. A Z-score serves to specify the precise location of each observation within a distribution. The sign of the Z-score (+ or -) indicates whether the score is above (+) or below (-) the mean. A Z-score is calculated by subtracting the mean value from the value of the observation, and dividing by the standard deviation. Commonly a known reference population mean and standard deviation are used.
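The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function name `z_scores`, its parameters `ref_mean` and `ref_sd`, and the data values are all made up for the example.

```python
from statistics import mean, stdev

def z_scores(values, ref_mean=None, ref_sd=None):
    """Return (x - mean) / sd for each x.

    Uses the sample's own mean and (n - 1) standard deviation unless
    reference values are supplied. Names are illustrative assumptions.
    """
    m = mean(values) if ref_mean is None else ref_mean
    s = stdev(values) if ref_sd is None else ref_sd
    return [(x - m) / s for x in values]

# Made-up measurements, purely for illustration
weights = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Standardized against the sample's own mean and standard deviation
sample_z = z_scores(weights)

# Standardized against a hypothetical reference population (mean 6, SD 2)
reference_z = z_scores(weights, ref_mean=6.0, ref_sd=2.0)
```

With the sample's own statistics the Z-scores have a mean of zero and a standard deviation of one. With the reference values they need not, which is the distinction the next paragraph draws.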

If your Z-score distribution is based on the sample mean and sample standard deviation, then the mean and standard deviation of the Z-score distribution will equal zero and one respectively. If your Z-score distribution is based on the population mean and population standard deviation, then the mean and the standard deviation of the Z-score distribution will only approximate zero and one, and only if the sample is random. The shape of a Z-score distribution will be identical to the original distribution of the raw measurements. If the original distribution is normal, then the Z-score distribution will be normal, and you will be dealing with a standard normal distribution. You can then make inferences about the proportion of observations below or above specific Z-values. If, however, the original distribution is skewed, then the Z-score distribution will also be skewed. In other words converting data to Z-scores does not normalize the distribution of that data!
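A quick numerical check makes the last point concrete. The sketch below uses made-up, right-skewed data and a hypothetical helper `skewness` (the third standardized moment). It shows that the skewness of the Z-scores matches that of the raw values, because standardizing only shifts and re-scales the data.

```python
from statistics import mean, pstdev, stdev

def skewness(values):
    """Moment coefficient of skewness: the mean of cubed standardized deviations."""
    m = mean(values)
    s = pstdev(values)
    return mean([((x - m) / s) ** 3 for x in values])

# Made-up, strongly right-skewed measurements
raw = [1, 1, 2, 2, 3, 4, 6, 10, 25]

# Standardize using the sample mean and sample standard deviation
m, s = mean(raw), stdev(raw)
z = [(x - m) / s for x in raw]

raw_skew = skewness(raw)
z_skew = skewness(z)

print(f"skewness of raw data: {raw_skew:.3f}")
print(f"skewness of Z-scores: {z_skew:.3f}")
```

The two printed values agree: the Z-scores are centred on zero with unit standard deviation, but they are exactly as skewed as the original measurements.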

In some applications (such as weight-for-age in nutritional studies), the Z-scores are not based upon the mean and standard deviation of the population being studied, but on those of an external reference population. In this situation the Z-scores are used to identify those individuals in the sample falling below a specified Z-score. Sometimes the distribution of the whole sample is examined, in which case the Z-scores will not have a mean of zero and a standard deviation of one - what is of interest is the extent to which their distribution differs from the reference population.

In biostatistics probably the commonest use of Z-scores is in the analysis of human nutritional data, especially for children. Weight for age, height for age, and weight for height Z-scores are computed using international reference data intended to reflect human growth patterns under optimal conditions. Cut-off scores of -2 and -3 are used to identify children suffering from malnutrition. Mean Z-scores are used to evaluate the nutritional state of populations relative to the reference population.
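The use of cut-offs and mean Z-scores can be sketched as follows. This is a simplified illustration only: the reference mean and standard deviation, the children's weights and the function name `classify` are hypothetical, and it assumes a plain (x - mean) / SD calculation for a single age and sex group. Treating a score exactly equal to a cut-off as not flagged is also an assumption. Real analyses should take reference values from the published tables or software.

```python
from statistics import mean

# Hypothetical reference values for one age/sex group (NOT real reference data)
REF_MEAN_KG = 12.0
REF_SD_KG = 1.5

def classify(z, cutoff_2=-2.0, cutoff_3=-3.0):
    """Label a Z-score by the cut-offs of -2 and -3."""
    if z < cutoff_3:
        return "below -3"
    if z < cutoff_2:
        return "below -2"
    return "not flagged"

# Made-up weights (kg) for four children of the same age and sex
children_kg = {"A": 12.6, "B": 8.7, "C": 7.2, "D": 10.5}

z_by_child = {name: (kg - REF_MEAN_KG) / REF_SD_KG
              for name, kg in children_kg.items()}
labels = {name: classify(z) for name, z in z_by_child.items()}

# Mean Z-score of the sample relative to the reference population
sample_mean_z = mean(z_by_child.values())

for name in children_kg:
    print(name, round(z_by_child[name], 2), labels[name])
print("mean Z-score:", round(sample_mean_z, 2))
```

In this made-up sample the mean Z-score is well below zero, which is the kind of summary used to compare a population with the reference.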

The validity of nutritional Z-scores depends on the current reference values being valid for the population under study, irrespective of ethnic group. Whilst that may usually be the case, it should not be automatically assumed. One misuse of Z-scores is to use the cut-offs of +2 and +3 to assess obesity - the body mass index is more appropriate for this. Another misuse may be to use the relatively sophisticated Z-score in a famine emergency, when mid upper arm circumference may be the more appropriate diagnostic tool.

Z-scores are used much less in other disciplines. We give examples of their use for standardizing variables prior to analysis, and for removing the effect of an explanatory variable by standardizing the data to the mean value of each level of that variable. The latter practice can be advantageous, although other methods of analysis (such as covariance analysis) are often preferable. One clear misuse, which is fortunately rare, is the belief that standardizing measurements will also normalize them. This is certainly not the case.
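Standardizing within each level of an explanatory variable can be sketched like this. The function name `standardize_within_groups`, the grouping variable ("site") and the data are hypothetical. Dividing by the within-group standard deviation, as well as subtracting the group mean, is an assumption about what is intended.

```python
from collections import defaultdict
from statistics import mean, stdev

def standardize_within_groups(records):
    """Convert (group, value) pairs to (group, z) pairs.

    Each value is standardized against the mean and (n - 1) standard
    deviation of its own group, and the input order is preserved.
    """
    by_group = defaultdict(list)
    for group, value in records:
        by_group[group].append(value)
    stats = {g: (mean(v), stdev(v)) for g, v in by_group.items()}
    return [(g, (x - stats[g][0]) / stats[g][1]) for g, x in records]

# Made-up measurements from two sites that differ in both level and spread
records = [("site A", 10.0), ("site A", 12.0), ("site A", 14.0),
           ("site B", 20.0), ("site B", 24.0), ("site B", 28.0)]

standardized = standardize_within_groups(records)
for group, z in standardized:
    print(group, round(z, 2))
```

After standardizing, both sites have Z-scores of -1, 0 and 1, so the difference between sites has been removed. Note that this also discards any real difference in spread between the groups, which is one reason methods such as covariance analysis are often preferable.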

What the statisticians say

Norman & Streiner (2008), Gravetter & Wallnau (2006) and Heiman (2001) all provide reasonable coverage of Z-scores. Sokal & Rohlf (1995) and Griffiths et al. (1998) both mention Z-scores in relation to the normal distribution. Coverage of the use of Z-scores in human nutritional studies is given in Gibson (2005) and World Health Organization (2000).

Broeck et al. (2009), Seal & Kerac (2007) and De Onis et al. (2006) explore the implications of the latest WHO Child Growth standards for operational work and for future anthropometric research. De Onis et al. (1999) and Mei et al. (1997) look at the use of mid upper-arm circumference for age and height as nutritional status screening indicators.

Gorstein et al. (1994) look at issues in the assessment of nutritional state using anthropometry. Quinn (1992) provides a user's manual for conducting child nutrition surveys in developing countries, together with the International NCHS Child Reference Tables. Dibley et al. (1987a, 1987b) describe the development of "normalized" curves for the international growth reference, whilst WHO Working Group (1986) looks at the various indices of weight and height and at the biological significance of wasting and stunting.

Wikipedia provides sections on Z-scores (termed standard scores) and body mass index. The World Health Organization provides software for nutritional analysis using the new WHO reference. The London School of Hygiene & Tropical Medicine covers the calculation of human nutritional Z-scores. Various universities provide tutorials on Z-scores, including the University of Arkansas.