InfluentialPoints.com Biology, images, analysis, design... 

"It has long been an axiom of mine that the little things are infinitely the most important" 

Z-scores: Use & misuse
(standard deviations, reference population, shape of distribution, weight-for-age, validity)

Statistics courses, especially for biologists, assume formulae = understanding and teach how to do statistics, but largely ignore what those procedures assume, and how their results mislead when those assumptions are unreasonable. The resulting misuse is, shall we say, predictable...

Use and Misuse

A Z-score (or standard score) represents how many standard deviations a given measurement lies from the mean. In other words it merely rescales, or standardizes, your data. A Z-score specifies the precise location of each observation within a distribution. The sign of the Z-score (+ or -) indicates whether the observation is above (+) or below (-) the mean.

A Z-score is calculated by subtracting the mean from the value of the observation, and dividing the result by the standard deviation. Commonly a known reference population mean and standard deviation are used. If your Z-scores are calculated using the sample mean and sample standard deviation, then their mean and standard deviation will equal zero and one respectively. If they are calculated using the population mean and population standard deviation, then their mean and standard deviation will only approximate zero and one, and only if the sample is random.

The shape of a Z-score distribution is identical to that of the original distribution of the raw measurements. If the original distribution is normal, then the Z-score distribution will be normal, and you will be dealing with a standard normal distribution. You can then make statements about the proportion of observations below or above specific Z-values. If, however, the original distribution is skewed, then the Z-score distribution will also be skewed. In other words, converting data to Z-scores does not normalize the distribution of that data!
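The calculation and the "standardizing does not normalize" point can be checked with a minimal sketch. It uses only Python's standard library; the function names `z_scores` and `skewness` and the data values are assumptions made up for illustration, not taken from any particular study.

```python
import statistics


def z_scores(values, mean=None, sd=None):
    """Return (x - mean) / sd for each value.

    If mean or sd is omitted, the sample mean and the sample standard
    deviation (n - 1 denominator) of `values` are used instead.
    """
    if mean is None:
        mean = statistics.mean(values)
    if sd is None:
        sd = statistics.stdev(values)
    return [(x - mean) / sd for x in values]


def skewness(values):
    """Moment-based skewness: mean cubed deviation over the cubed population SD."""
    m = statistics.mean(values)
    s = statistics.pstdev(values)
    return sum((x - m) ** 3 for x in values) / (len(values) * s ** 3)


# Hypothetical right-skewed measurements (made up for illustration).
raw = [1.0, 1.2, 1.5, 1.9, 2.4, 3.1, 4.5, 7.0, 12.0]
z = z_scores(raw)  # standardized with the sample's own mean and SD

print(statistics.mean(z))          # mean of the Z-scores
print(statistics.stdev(z))         # sample SD of the Z-scores
print(skewness(raw), skewness(z))  # skewness before and after standardizing
```

Apart from floating-point rounding, the Z-scores have mean zero and standard deviation one, while the skewness of the Z-scores matches that of the raw data: the data have been rescaled, not reshaped.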
In some applications (such as weight-for-age in nutritional studies), the Z-scores are based not on the mean and standard deviation of the population being studied, but on those of an external reference population. In this situation the Z-scores are used to identify those individuals in the sample falling below a specified Z-score. Sometimes the distribution of the whole sample is examined, in which case the Z-scores will not have a mean of zero and a standard deviation of one; what is of interest is the extent to which their distribution differs from that of the reference population.

In biostatistics probably the commonest use of Z-scores is in the analysis of human nutritional data, especially for children. Weight-for-age, height-for-age, and weight-for-height Z-scores are computed using international reference data intended to reflect human growth patterns under optimal conditions. Cutoff scores of -2 and -3 are used to identify children suffering from malnutrition. Mean Z-scores are used to evaluate the nutritional state of populations relative to the reference population.

The validity of nutritional Z-scores depends on the current reference values being valid irrespective of ethnic group. Whilst that may usually be the case, it should not be automatically assumed. One misuse of Z-scores is to use cutoffs of +2 and +3 to assess obesity; the body mass index is more appropriate for this. Another misuse may be to use the relatively sophisticated Z-score in a famine emergency, when mid-upper-arm circumference may be the more appropriate diagnostic tool.

Z-scores are used much less in other disciplines. We give examples of their use for standardizing variables prior to analysis, and for removing the effect of an explanatory variable by standardizing the data to the mean value of each level of that variable. The latter practice can be advantageous, although other methods of analysis (such as covariance analysis) are often preferable.
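The nutritional use of Z-scores against an external reference can be sketched as follows. This is only an illustration: the reference mean and standard deviation, the weights, and the names `reference_z` and `classify` are all hypothetical, and real reference tables give different values for each age and sex. Treating the cutoffs as strict "less than" comparisons is also an assumption.

```python
import statistics

# Hypothetical reference mean and SD for one age/sex group.
# These are NOT real reference values; they are made up for illustration.
REF_MEAN_KG = 10.0
REF_SD_KG = 1.0


def reference_z(weight_kg, ref_mean=REF_MEAN_KG, ref_sd=REF_SD_KG):
    """Z-score of one child's weight relative to the reference population."""
    return (weight_kg - ref_mean) / ref_sd


def classify(z):
    """Label a Z-score against the -2 and -3 cutoffs (strict '<' is assumed)."""
    if z < -3:
        return "below -3"
    if z < -2:
        return "below -2"
    return "at or above -2"


# Hypothetical sample of children's weights in kg.
weights = [6.5, 7.8, 8.4, 9.1, 9.6, 10.2, 11.0]

zs = [reference_z(w) for w in weights]
labels = [classify(z) for z in zs]
mean_z = statistics.mean(zs)  # population-level summary relative to the reference

for w, z, label in zip(weights, zs, labels):
    print(f"{w:5.1f} kg  z = {z:+.2f}  {label}")
print(f"mean Z-score = {mean_z:+.2f}")
```

Because the Z-scores are computed from the reference values rather than from the sample itself, their mean is not forced to zero; a clearly negative mean indicates a sample that is lighter, on average, than the reference population.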
One clear misuse, which is fortunately rare, is the belief that standardizing measurements will also normalize them. This is certainly not the case.

What the statisticians say

Norman & Streiner (2008), Gravetter & Wallnau (2006) and Heiman (2001) all provide reasonable coverage of Z-scores. Sokal & Rohlf (1995) and Griffiths et al. (1998) both mention Z-scores in relation to the normal distribution.

Coverage of the use of Z-scores in human nutritional studies is given in Gibson (2005) and World Health Organization (2000). Broeck et al. (2009), Seal & Kerac (2007) and De Onis et al. (2006) explore the implications of the latest WHO Child Growth Standards for operational work and for future anthropometric research. De Onis et al. (1999) and Mei et al. (1997) look at the use of mid-upper-arm circumference for age and height as nutritional status screening indicators. Gorstein et al. (1994) look at issues in the assessment of nutritional state using anthropometry. Quinn (1992) provides a user's manual for conducting child nutrition surveys in developing countries, together with the International NCHS Child Reference Tables. Dibley et al. (1987a, 1987b) describe the development of "normalized" curves for the international growth reference, whilst WHO Working Group (1986) looks at the various indices of weight and height and at the biological significance of wasting and stunting.

Wikipedia provides sections on Z-scores (termed standard scores) and body mass index. The World Health Organization provides software for nutritional analysis using the new WHO reference. The London School of Hygiene & Tropical Medicine covers the calculation of human nutritional Z-scores. Various universities provide tutorials on Z-scores, including the University of Arkansas.
