InfluentialPoints.com Biology, images, analysis, design... 

"It has long been an axiom of mine that the little things are infinitely the most important" 

Variance and standard deviation

Variance

Definition

The variance provides a measure of the spread or dispersion of a population. It is computed as the average of the squared deviations of the observations from their mean, hence its alternative name, mean square error:

$$\sigma^{2} = \frac{\sum_{i=1}^{N} (x_{i} - \mu)^{2}}{N}$$

where μ is the population mean and N is the number of individuals in the population.

If you were able to measure every member of the population, this is the equation you would use. More usually you take a sample, and use the variance of your sample to estimate the population variance. If you work out the variance of your sample using the equation above (with the sample mean in place of μ), it will, on average, underestimate the true value of the population variance. In other words, it is a biased estimator of the population variance.

Correction for bias

You correct for this bias by dividing by n − 1 (where n is the number of observations), rather than by n. Hence the sample variance is given by the sum of the squared deviations of the observations from their mean divided by n − 1:

$$s^{2} = \frac{\sum_{i=1}^{n} (x_{i} - \bar{x})^{2}}{n - 1}$$

where x̄ is the sample mean. Because you have squared the deviations from the mean, the variance is expressed in squared units.

Alternative formulae for the variance
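As a minimal sketch of the definitions above, the following Python computes the variance with both divisors for a small, hypothetical data set. The function names are illustrative only. The 'shortcut' function uses one commonly quoted computational form, s² = (Σx² − (Σx)²/n)/(n − 1), which is assumed here as a typical alternative formula.

```python
import statistics

def population_variance(xs):
    """Average of the squared deviations from the mean (divisor N)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n

def sample_variance(xs):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def sample_variance_shortcut(xs):
    """Computational form: (sum of x^2 - (sum of x)^2 / n) / (n - 1)."""
    n = len(xs)
    return (sum(x * x for x in xs) - sum(xs) ** 2 / n) / (n - 1)

# Hypothetical observations: mean = 5, sum of squared deviations = 32
data = [2, 4, 4, 4, 5, 5, 7, 9]

print("divisor n:      ", population_variance(data))
print("divisor n - 1:  ", sample_variance(data))
print("shortcut form:  ", sample_variance_shortcut(data))
print("library check:  ", statistics.variance(data))
```

The shortcut form is algebraically identical to the definition, but it subtracts two large numbers, so it can lose precision when the mean is large relative to the spread. The definitional form is the safer choice in software.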
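Standard deviation

Definition

The standard deviation of a population is simply the square root of the population variance. It can also be described as the root mean squared deviation from the mean:

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_{i} - \mu)^{2}}{N}}$$

where μ is the population mean and N is the number of individuals in the population.

Similarly, the standard deviation of a sample is the square root of the sample variance:

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_{i} - \bar{x})^{2}}{n - 1}}$$

where x̄ is the sample mean and n is the number of observations.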
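Correction for bias

We noted above that the sample variance (s^{2}) is corrected for bias by dividing by n − 1 rather than by n. Even so, its square root s still tends to underestimate the population standard deviation, so a further correction can be applied to s. This correction, however, makes little difference to the estimate of the standard deviation unless the sample is small. For small sample sizes the correction is more important.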
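To give a sense of the size of this correction, here is a hedged sketch. It assumes the observations are independent and normally distributed, in which case the expected value of s is c4(n) × σ, where c4 is the factor of that name used in quality control. Dividing s by c4(n) then removes the bias. The multiplier 1 + 1/(4(n − 1)) is a widely used approximation to 1/c4(n). Both function names are illustrative.

```python
import math

def c4(n):
    """Expected value of s / sigma for n independent normal observations.

    c4(n) = sqrt(2 / (n - 1)) * Gamma(n / 2) / Gamma((n - 1) / 2)
    Log-gamma is used so that large n does not overflow.
    """
    log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)

def approx_multiplier(n):
    """Approximate multiplier for s: 1 + 1 / (4 (n - 1))."""
    return 1 + 1 / (4 * (n - 1))

# Compare the exact and approximate multipliers for a range of sample sizes
for n in (2, 5, 10, 30, 100):
    print(n, 1 / c4(n), approx_multiplier(n))
```

Both multipliers approach 1 as n grows, which is why the correction matters mainly for small samples.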
Alternatively, tables are available that give the correction factor directly for small sample sizes. It is only really necessary to make this correction if you have small sample sizes and you are quoting standard deviations. If you are estimating the standard deviation in order to then estimate the coefficient of variation or the confidence limits of the mean, you should not correct the standard deviation for bias. This is because the equations for these statistics include the necessary corrections.

With the availability of personal computers, few people still use a calculator for doing statistics. However, statistical packages often have some 'bugs', so it is wise to give these packages some small 'test' data sets whose results you can easily check 'by hand'. In addition, some packages do not actually tell you whether or not certain corrections have been applied. You can only find out by running a test data set.
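A sketch of such a check, using Python's standard `statistics` module as the 'package' under test. The data set and the helper `divisor_used` are hypothetical. The idea is simply to choose numbers whose variance is easy to work out by hand with either divisor, then see which answer the software returns.

```python
import math
import statistics

# Hypothetical test data, small enough to check by hand:
# mean = 3, sum of squared deviations = 4 + 1 + 0 + 1 + 4 = 10
test_data = [1.0, 2.0, 3.0, 4.0, 5.0]
BY_N = 10 / 5          # 2.0, the result if the divisor is n
BY_N_MINUS_1 = 10 / 4  # 2.5, the result if the divisor is n - 1

def divisor_used(variance_func):
    """Report which divisor a variance function appears to use."""
    result = variance_func(test_data)
    if math.isclose(result, BY_N):
        return "n"
    if math.isclose(result, BY_N_MINUS_1):
        return "n - 1"
    return "unknown"

print("statistics.pvariance divides by", divisor_used(statistics.pvariance))
print("statistics.variance divides by", divisor_used(statistics.variance))
```

The same test data can be fed to any other package to find out which divisor its default variance or standard deviation function applies.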
Within-subject standard deviation

This statistic provides a useful measure of both reproducibility (same test material sent to different laboratories) and repeatability (same test material analyzed by the same person in the same laboratory). In other words, it describes the random component of measurement error. It also goes under the rather misleading name standard error of measurement. (This may be abbreviated to SEM, but it has nothing to do with the standard error of the mean, which is also abbreviated to SEM.)

For instance, you might subdivide a single blood sample and send each subsample to a different laboratory for haemoglobin assay. The standard deviation of their various results could be used to describe the measurement error.

In practice this sort of assessment uses a number of samples (say, each from a different patient), and each original sample is subsampled and independently assayed. Because the original samples were not identical, the results you obtain will include the variation between patients in addition to the measurement error. Therefore, simply pooling the results and calculating the standard deviation of the results about their single common mean will overestimate the variation arising from measurement error. There are two obvious ways of avoiding this problem:
Although this method works perfectly well if you have the same number of measurements for each individual, a more flexible approach is to carry out what is called a one-way analysis of variance. Taking the square root of the 'residual mean square' will give you the within-subject standard deviation. We cover the analysis of variance approach elsewhere.

If there are only two measurements per original subject, there is a simpler formula, because the variance of two observations is equal to half the square of their difference. So the within-subject standard deviation can be obtained as follows:

$$s_{w} = \sqrt{\frac{\sum_{i=1}^{n} d_{i}^{2}}{2n}}$$

where d_i is the difference between the two measurements on subject i, and n is the number of subjects.
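The two routes just described can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a full analysis of variance: the first function computes only the residual (within-subject) mean square of a one-way layout with subjects as groups, and the duplicate haemoglobin values are invented for the example.

```python
import math
import statistics

def within_subject_sd(groups):
    """Square root of the residual mean square, with subjects as groups.

    Works with unequal numbers of measurements per subject.
    """
    ss_within = 0.0  # pooled sum of squares about each subject's own mean
    df_within = 0    # pooled degrees of freedom: sum of (k_i - 1)
    for obs in groups:
        subject_mean = sum(obs) / len(obs)
        ss_within += sum((x - subject_mean) ** 2 for x in obs)
        df_within += len(obs) - 1
    return math.sqrt(ss_within / df_within)

def within_subject_sd_pairs(pairs):
    """Shortcut for exactly two measurements per subject: sqrt(sum d^2 / 2n)."""
    n = len(pairs)
    return math.sqrt(sum((a - b) ** 2 for a, b in pairs) / (2 * n))

# Hypothetical duplicate assays (g/dL) on four patients
pairs = [(13.1, 13.5), (11.8, 11.6), (14.9, 14.4), (12.2, 12.2)]

print("residual mean square route:", within_subject_sd(pairs))
print("paired-difference shortcut:", within_subject_sd_pairs(pairs))

# Pooling all results about one common mean mixes in between-patient variation
all_values = [x for pair in pairs for x in pair]
print("naive pooled SD:           ", statistics.stdev(all_values))
```

With two measurements per subject the two functions give the same answer, and both are much smaller than the naive pooled standard deviation, which is dominated by the differences between patients.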
This approach to getting a within-subject standard deviation is only valid if the standard deviation is independent of the mean. This can be checked by plotting the standard deviation for each individual against the individual's mean. If there is a relationship, the data should first be transformed (a log transformation is often effective) and then replotted to check that the assumption is now met.

The within-subject standard deviation can also be used to quantify measurement error in repeated measurements over time. However, it will only reflect measurement error alone if there is no trend over time. If there is a trend, the within-subject standard deviation will overestimate the amount of measurement variation.

A common error is to use Pearson's correlation coefficient to assess measurement error. Unfortunately, this correlation coefficient suffers from a problem: the more your original subjects (or samples) differ, the more this correlation coefficient will underestimate the measurement variation. Worse still, if there was any consistent trend in your results over time, that source of variation will bias the correlation coefficient. Fortunately there is a correlation coefficient which can be used, namely the 'intraclass correlation coefficient'. However, this can only be obtained after carrying out an analysis of variance, which is described elsewhere.
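As a rough sketch of both checks, the Python below tabulates each subject's mean and standard deviation (the pairs of values you would plot against each other) and computes an intraclass correlation coefficient from the between-subject and within-subject mean squares. It assumes the one-way random-effects form, ICC = (MSB − MSW) / (MSB + (k − 1) MSW), with the same number k of measurements on every subject. Other forms of the coefficient exist, and the data are hypothetical.

```python
import statistics

def subject_means_and_sds(groups):
    """(mean, standard deviation) for each subject, ready for plotting."""
    return [(statistics.mean(g), statistics.stdev(g)) for g in groups]

def icc_oneway(groups):
    """One-way random-effects ICC; every subject must have k measurements."""
    n = len(groups)
    k = len(groups[0])
    means = [statistics.mean(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / (n * k)
    # Between-subject mean square, n - 1 degrees of freedom
    ms_between = k * sum((m - grand_mean) ** 2 for m in means) / (n - 1)
    # Within-subject (residual) mean square, n (k - 1) degrees of freedom
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_within = ss_within / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical duplicate assays on four subjects
groups = [[13.1, 13.5], [11.8, 11.6], [14.9, 14.4], [12.2, 12.2]]

for mean, sd in subject_means_and_sds(groups):
    print(f"mean {mean:.2f}  sd {sd:.3f}")
print("ICC:", icc_oneway(groups))
```

If the plotted standard deviations rise with the means, the same two functions can be rerun on log-transformed values.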
Assumptions and Requirements

The variance and standard deviation can be calculated for any variable, providing it can be ordered. But the standard deviation is only an appropriate measure of dispersion for a measurement variable, and only then if the data have a symmetrical distribution and, in many cases, a normal one. Use of the standard deviation to display the variability of observations in range plots and box-and-whisker plots is misleading if these assumptions are not met. Assumptions about what proportion of observations are included within limits of agreement also depend upon these assumptions. For non-symmetrical (skewed) distributions there are two options:
For the within-subject standard deviation, it is assumed that the size of the deviation is not related to the magnitude of the measurement. This can be assessed graphically, by plotting the individual subjects' standard deviations against their means.

We stress here that the standard deviation is only a measure of the variability of your observations. It is not a measure of the variability or the reliability of your estimated mean. We will come to measures of the variability and reliability of means later in this unit.
