Standard error of the mean: Use and misuse
(standard error versus standard deviation, inferential statistic)
Statistics courses, especially for biologists, assume formulae = understanding and teach how to do statistics, but largely ignore what those procedures assume, and how their results mislead when those assumptions are unreasonable. The resulting misuse is, shall we say, predictable...
Use and Misuse
The standard error of the mean is the standard deviation of the sampling distribution of the mean. In other words it is the standard deviation of a large number of sample means of the same sample size drawn from the same population. The term standard error of the mean is commonly (though imprecisely) shortened to just standard error. Thus the terms 'standard error of the mean', 'standard deviation of the mean' and 'standard error' may all mean exactly the same thing! Usually of course we only calculate one mean for a set of data, not multiple means. Hence, unlike the standard deviation of the observations, the standard error of the mean is estimated rather than measured. As such it is an inferential rather than a descriptive statistic.
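The definition above can be illustrated by simulation. The following is a minimal sketch, assuming a normally distributed population with an arbitrary mean of 50 and standard deviation of 10 (both values, and the function name sd_of_sample_means, are made up for illustration). It draws many samples of the same size, takes the mean of each, and reports the standard deviation of those means.

```python
import random
import statistics

def sd_of_sample_means(n, n_samples=5000, mean=50.0, sd=10.0, seed=1):
    """Draw n_samples samples of size n from a normal population
    (hypothetical mean and SD) and return the SD of their means."""
    rng = random.Random(seed)
    sample_means = []
    for _ in range(n_samples):
        sample = [rng.gauss(mean, sd) for _ in range(n)]
        sample_means.append(statistics.fmean(sample))
    return statistics.stdev(sample_means)

# The SD of the sample means shrinks as the sample size grows, and is
# always smaller than the SD of the observations (10 in this sketch).
results = {n: sd_of_sample_means(n) for n in (4, 25, 100)}
for n, value in results.items():
    print(f"n = {n:3d}: SD of sample means = {value:.2f}")
```

In real work only one sample is usually available, which is why the standard error has to be estimated rather than measured in this way.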
The standard error of the mean (SE, or SEM) is somewhat unusual in that there is a simple algebraic formula for it. It is equal to the population standard deviation (σ) divided by the square root of the number of observations in the sample (n), that is σ/√n. In practice σ is unknown, so we generally estimate the standard error of a mean by dividing the sample standard deviation (s) by the square root of the number of observations in that sample, giving s/√n.
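As a minimal sketch of that calculation, the function below estimates the standard error from a single sample as the sample standard deviation divided by the square root of the sample size. The function name and the data are hypothetical, chosen only for illustration.

```python
import math
import statistics

def standard_error_of_mean(observations):
    """Estimate the SEM as s / sqrt(n), where s is the sample
    standard deviation (n - 1 denominator)."""
    n = len(observations)
    if n < 2:
        raise ValueError("need at least two observations to estimate s")
    s = statistics.stdev(observations)
    return s / math.sqrt(n)

# Hypothetical measurements, for illustration only.
weights = [4.1, 5.0, 4.6, 5.3, 4.8, 4.4, 5.1, 4.7, 4.9]
sem = standard_error_of_mean(weights)
print(f"mean = {statistics.fmean(weights):.2f}, "
      f"SD = {statistics.stdev(weights):.2f}, "
      f"SE = {sem:.2f}, n = {len(weights)}")
```

Printing the mean, standard deviation, standard error and sample size together leaves the reader in no doubt about which measure of spread is being quoted.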
Because the terminology is inconsistent, it is a serious misuse of the standard error to call it the 'standard deviation' without any qualifier. This arises when a researcher says they are quoting 'the mean and standard deviation'. In this situation we do not know whether they mean the standard deviation of the observations or the standard deviation of the mean, namely its standard error. Confusing the two is obviously misleading, because the standard deviation of the mean is always smaller than the standard deviation of the observations whenever the sample contains more than one observation. If one sees very small 'standard deviations' in a paper, one should try to check them, for example by estimating the standard deviation with the range estimator if the maximum and minimum are given. Best practice is to state clearly what you are giving, or better still, give both the standard error and the standard deviation - and the sample size.
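One way such a check might be done is sketched below. It assumes roughly normal observations and uses the tabulated constants for the expected range of n normal observations in units of the standard deviation (the d2 constants from quality-control tables). This is one common form of range estimator and is not necessarily the one the text has in mind. The reported figures are invented for illustration.

```python
import math

# Expected range of n normal observations in units of sigma (d2 constants),
# tabulated here for n = 2..10 only. Assumes approximate normality.
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326, 6: 2.534,
      7: 2.704, 8: 2.847, 9: 2.970, 10: 3.078}

def sd_from_range(minimum, maximum, n):
    """Rough estimate of the SD of the observations from their range."""
    return (maximum - minimum) / D2[n]

def sd_from_se(se, n):
    """SD of the observations implied if the quoted value is really an SE."""
    return se * math.sqrt(n)

# Hypothetical paper: a 'standard deviation' of 0.2 is quoted for n = 10,
# with a minimum of 11.2 and a maximum of 13.1.
quoted, n = 0.2, 10
range_estimate = sd_from_range(11.2, 13.1, n)
implied_sd = sd_from_se(quoted, n)
print(f"range estimate of SD: {range_estimate:.2f}")
print(f"SD implied if the quoted value is an SE: {implied_sd:.2f}")
```

Here the range estimate is about three times the quoted value but close to the standard deviation implied by treating that value as a standard error, which suggests the quoted 'standard deviation' is in fact a standard error.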
A related misuse of the standard error is to use it as a descriptive statistic when it is in fact an inferential statistic. Provided distributions are not skewed, the standard deviation is the correct descriptive statistic to use as an indicator of variability between observations. The standard error only reflects this variability for a particular sample size. If distributions are skewed, or if the variable is only measured on an ordinal scale, then both the standard deviation and the standard error are misleading (albeit not actually incorrect). In these situations the best descriptive statistics are given by the five quantile summary. Alternatively, if a transformation is used to normalize distributions, then either transformed means and standard errors or detransformed means and confidence intervals (not standard errors) should be presented.
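Both alternatives can be sketched briefly. The example below assumes a log transformation and hypothetical right-skewed counts. The function names are invented, and the critical value must be supplied by the caller (here the two-sided 95% t value for 9 degrees of freedom, 2.262). The quartiles use the 'inclusive' interpolation method, and other conventions give slightly different values.

```python
import math
import statistics

def five_quantile_summary(observations):
    """Minimum, lower quartile, median, upper quartile, maximum."""
    q1, median, q3 = statistics.quantiles(observations, n=4, method="inclusive")
    return min(observations), q1, median, q3, max(observations)

def detransformed_mean_and_ci(observations, critical_value):
    """Mean and confidence limits computed on the log scale, then
    back-transformed. The detransformed mean is the geometric mean."""
    logs = [math.log(x) for x in observations]
    mean_log = statistics.fmean(logs)
    se_log = statistics.stdev(logs) / math.sqrt(len(logs))
    lower = math.exp(mean_log - critical_value * se_log)
    upper = math.exp(mean_log + critical_value * se_log)
    return math.exp(mean_log), lower, upper

# Hypothetical right-skewed counts, for illustration only.
counts = [1, 2, 2, 3, 4, 5, 8, 13, 21, 55]
summary = five_quantile_summary(counts)
gm, lower, upper = detransformed_mean_and_ci(counts, critical_value=2.262)
print("five quantile summary:", summary)
print(f"detransformed mean = {gm:.1f}, 95% CI = {lower:.1f} to {upper:.1f}")
```

The detransformed interval is asymmetric about the mean, which is why a single standard error cannot be quoted on the original scale.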
Standard errors can be estimated for any statistic, although they often necessitate more assumptions than are required for the standard error of the mean. For example, the standard errors of the median and the coefficient of variation are often invalid because assumptions of normality are not met.
What the statisticians say
Woodward (1999) gives only brief consideration to the standard error in Chapter 2, grouping it with descriptive statistics like the coefficient of variation rather than dealing with it (more correctly) as an inferential statistic. Bart et al. (1998) introduce the standard deviation and standard error in Chapter 2 - together with their classic account of the biologist and the moose, and why one needs to understand the difference between the standard deviation and the standard error. Krebs (1999) covers the finite population correction in Chapter 8 on sampling designs. Sokal & Rohlf (1995) provide a fairly detailed account of the standard error and variance of means in Chapter 7, together with a useful section on the standard errors (and requirements for their applicability) of several other statistics.
Curran-Everett (2009) and Altman & Bland (2005) both give useful reviews of the difference between the standard deviation (of the observations) and the standard error (of the mean). Further comments on use of the standard deviation versus the standard error are given by Webster & Merry (1997), Streiner (1996), and Brown (1982), whilst Nagele (2001) surveys their use in some medical journals. Cumming et al. (2007) suggest some simple rules to assist experimental biologists in the interpretation of error bars. Anderson et al. (2001) discuss the presentation of the results of data analyses, with special reference to wildlife biologists. Magnusson (2000) argues that standard error bars on graphs conceal more than they reveal and should be replaced by dot histograms. Abdi & William (2010) provide a useful review of the jackknife technique, following on from the much earlier review of Miller (1974).