The standard error of the mean is the standard deviation of the sampling distribution of the mean. In other words, it is the standard deviation of the means of a large number of samples, all of the same size and all drawn from the same population. The term standard error of the mean is commonly (though imprecisely) shortened to just standard error. Thus the terms 'standard error of the mean', 'standard deviation of the mean' and 'standard error' may all mean exactly the same thing!
Usually of course we only calculate one mean for a set of data, not multiple means. Hence, unlike the standard deviation of the observations, the standard error of the mean is estimated rather than measured. As such it is an inferential statistic rather than a descriptive statistic.
The standard error of the mean (SE) is somewhat unusual in that there is a simple algebraic formula for it - and the formula is valid irrespective of the distribution of the data. It is equal to the population standard deviation (σ) divided by the square root of the number of observations in the sample (n), that is SE = σ / √n. In practice we estimate the standard error of a mean by dividing the sample standard deviation (s) by the square root of the number of observations in that sample, that is SE ≈ s / √n.
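The definition (the standard deviation of many sample means) and the formula (σ divided by the square root of n) can be checked against each other by simulation. The following is a minimal sketch, assuming a normally distributed population with mean 0 and σ = 10; the function name, the sample size of 25 and the number of repeated samples are illustrative choices, not taken from the text.

```python
import random
import statistics

def sd_of_sample_means(population_sd, n, n_samples=20000, seed=1):
    """Draw many samples of size n and return the SD of their means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = [
        statistics.fmean(rng.gauss(0.0, population_sd) for _ in range(n))
        for _ in range(n_samples)
    ]
    # The SD of these means approximates the standard error of the mean.
    return statistics.pstdev(means)

simulated = sd_of_sample_means(population_sd=10.0, n=25)
theoretical = 10.0 / 25 ** 0.5  # sigma / sqrt(n) = 2.0
print(simulated, theoretical)
```

The simulated value should lie close to the theoretical one, and the agreement improves as the number of repeated samples grows.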
Hence the magnitude of the standard error of the mean depends on both the variability of the observations (s) and the number of observations (n). The larger the variability, the greater will be the standard error. The larger the number of observations, the smaller will be the standard error.
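As a small illustration of how the estimate responds to s and n, here is a sketch that computes s / √n from raw observations and from summary values. The data and the function names are made up for the example.

```python
import math
import statistics

def se_from_summary(s, n):
    """Estimated standard error of the mean from s and n."""
    return s / math.sqrt(n)

def standard_error(observations):
    """Estimated standard error of the mean from raw observations."""
    s = statistics.stdev(observations)  # sample SD, n - 1 divisor
    return se_from_summary(s, len(observations))

data = [4.1, 5.3, 4.8, 5.9, 5.0, 4.4, 5.6, 4.9]  # made-up observations
print(standard_error(data))

print(se_from_summary(2.0, 16))  # 0.5
print(se_from_summary(4.0, 16))  # 1.0  (double the variability, double the SE)
print(se_from_summary(2.0, 64))  # 0.25 (four times the observations, half the SE)
```

Note that the standard error falls with the square root of n, so halving it requires four times as many observations.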
The standard deviation of the observations and the standard error of the mean are frequently confused.
We noted previously that the sample standard deviation (s) is a biased estimator of the population standard deviation (σ). So, if you intend to quote standard errors, and your sample sizes are small, you should use the corrected standard deviation in the formula.
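The correction referred to here was defined in an earlier section that is not reproduced in this text, so the following is only a sketch of one widely used correction, and it may not be the one intended. It assumes the observations are normally distributed and divides s by the factor usually written c4(n). The function names and data are hypothetical.

```python
import math
import statistics

def c4(n):
    """Bias factor for the sample SD, assuming normally distributed data.

    E[s] = c4(n) * sigma, so s / c4(n) is unbiased for sigma under normality.
    """
    if n < 2:
        raise ValueError("need at least two observations")
    # lgamma avoids overflow in the gamma functions for large n.
    log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)

def corrected_standard_error(observations):
    """SE of the mean using the bias-corrected SD (normality assumed)."""
    n = len(observations)
    s = statistics.stdev(observations)
    return (s / c4(n)) / math.sqrt(n)

small_sample = [12.1, 9.8, 11.4, 10.3]  # made-up observations
print(c4(len(small_sample)), corrected_standard_error(small_sample))
```

The factor approaches 1 as n increases, which is why the correction only matters for small samples.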
The finite population correction
The formulae given above for estimating the standard error assume you are taking a sample from an infinite population. But if your sample comprises a large part of the population, the usual equation for the standard error will over-estimate the standard error. Imagine you take several samples each comprising nearly all members of a population. Clearly there will not be much variability between sample means because the samples will mostly contain the same individuals. In the most extreme case, if your sample contained the entire population you would get the same value for the mean each time - in which case the standard error of the mean should be zero. Hence the need for a finite population correction.
The finite population correction reduces the standard error more and more as the sample size approaches the population size. If the sample size equals the population size, the standard error will be zero.
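The formula itself has not survived in this text. One commonly used form, shown here as an assumption rather than as the author's own equation, multiplies the usual standard error by √((N − n) / (N − 1)), where N is the population size and n the sample size. Some texts use √(1 − n/N) instead; both give zero when n = N. The function name below is hypothetical.

```python
import math

def se_with_fpc(s, n, population_size):
    """Standard error of the mean with a finite population correction.

    Uses the assumed form (s / sqrt(n)) * sqrt((N - n) / (N - 1)).
    """
    if not 1 <= n <= population_size:
        raise ValueError("sample size must lie between 1 and the population size")
    if population_size == 1:
        return 0.0  # the sample is the whole population
    fpc = math.sqrt((population_size - n) / (population_size - 1))
    return (s / math.sqrt(n)) * fpc

print(se_with_fpc(10.0, 25, 10**9))  # close to the uncorrected 10 / sqrt(25) = 2.0
print(se_with_fpc(10.0, 50, 100))    # sample is half the population
print(se_with_fpc(10.0, 100, 100))   # sample is the whole population: 0.0
```

When the population is very large relative to the sample, the correction factor is close to 1 and can be ignored.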
In practice the finite population correction is usually only used if a sample comprises more than about 5-10% of the population. Even then it may not be applied if researchers wish to invoke the 'superpopulation concept', and apply their results to a larger, ill-defined, population. This concept, whilst convenient for some, is highly controversial - partly because the problems of extending results to a superpopulation are exactly the same as when you are dealing with an ordinary population. In particular, you need to allow for variation and bias - which can be very difficult when a superpopulation is ill-defined and the selection is not random! For further discussion on this see pp. 97-99 in Bart et al.
Any statistic that can be computed - such as the variance, the coefficient of variation, or the median - also has a sampling distribution. Hence, any statistic has a standard error that can be used to describe its sampling variation. Even the standard deviation itself must exhibit variation in repeated samples, so it also has a standard error. However, many commonly-used statistics either do not have a simple formula to estimate their standard error, or (more commonly) the formula assumes your sample is very large, or your sample represents a particular type of population. The standard errors of some commonly used statistics are given above in related topics.
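Where no simple formula is available, one general-purpose option is to approximate the sampling distribution by resampling. The text does not mention this technique; the sketch below uses the nonparametric bootstrap as an illustration, with made-up data and hypothetical function names. It assumes the observations are independent and that the sample is a reasonable stand-in for the population.

```python
import random
import statistics

def bootstrap_se(observations, statistic, n_resamples=2000, seed=1):
    """Approximate the standard error of any statistic by bootstrapping."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    n = len(observations)
    estimates = [
        statistic(rng.choices(observations, k=n))  # resample with replacement
        for _ in range(n_resamples)
    ]
    # The SD of the resampled statistics estimates the standard error.
    return statistics.stdev(estimates)

data = [3.2, 4.7, 5.1, 2.9, 6.3, 4.4, 5.8, 3.9, 4.1, 5.5,
        4.9, 3.6, 6.0, 4.2, 5.3, 3.8, 4.6, 5.0, 4.3, 5.7]  # made-up observations

print(bootstrap_se(data, statistics.median))  # standard error of the median
print(bootstrap_se(data, statistics.fmean))   # comparable to s / sqrt(n)
```

Applying the same function to the mean gives a value that can be compared with s / √n as a sanity check.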
Assumptions and Requirements
The most important assumption in estimating the standard error of a mean is that the observations are equally likely to be obtained, and are independent. In other words the sample should have been obtained by random sampling or random allocation.
The standard error can be calculated for any type of frequency distribution - but, like the standard deviation, for most of the things it is used for, the statistic is assumed to be distributed symmetrically.
The standard error of the mean is quoted very widely in the reporting of scientific data. It is a valid estimate of the variability of our estimate of the mean - but not of the variability of the observations.
Providing the sample size is reasonably large, and is a random sample from the population, it may also be used as an indirect measure of the reliability of our estimate of the mean. This is because the 95% confidence interval extends roughly two standard errors either side of the mean - providing the statistic is distributed normally. However (as we see in