Biology, images, analysis, design... |
|
"It has long been an axiom of mine that the little things are infinitely the most important" |
|
Binomial and Poisson distributions: Use & misuse

(standard error of proportion and rate, assumptions, cluster sampling, pattern of dispersion, test of randomness)

Statistics courses, especially for biologists, assume formulae = understanding and teach how to do ...

Use and Misuse

The mathematics of the binomial distribution provides a short-cut method to estimate the variance of a proportion derived from a simple random sample, given the values of p, q and n. This approach is heavily used in medical statistics to estimate the standard error (and from this the confidence interval) of disease prevalence and proportion cured. However, as with estimating any standard error, the sample must be random to ensure independence of observations. If one is dealing with a convenience sample, ...

The Poisson distribution is also used to estimate standard errors, in this case of frequencies and (especially) rates. Again it is essential that events are independent. A common misuse is to use the Poisson to attach a standard error to a rate derived from pooling events from different clusters, whether villages or herds.

The Poisson is also much used to test randomness over space or time by comparing observed frequencies with those expected under the Poisson distribution. We give several examples of this, most of which demonstrate some of the pitfalls in this approach. One such pitfall is that statistical assessment of goodness of fit is very dependent on sample size. Failing to demonstrate a significant difference between observed and expected frequencies is not the same as demonstrating a 'good fit'. Another pitfall is that the result depends critically on scale - for example one will get quite different distributions if one looks at numbers per leaf, per branch or per tree.
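To make the short-cut calculations described above concrete, here is a minimal Python sketch. It computes the binomial standard error of a proportion, the Poisson standard error of a rate, and the observed and expected frequencies for a Poisson comparison. The function names and all the numbers are hypothetical, invented for illustration and not taken from any of the examples on this page. The formulae only hold under the assumptions already stressed: a simple random sample and independent events.

```python
import math
from collections import Counter


def proportion_se(p, n):
    """Binomial standard error of a proportion: sqrt(p * q / n).

    Assumes a simple random sample of n independent observations.
    """
    q = 1.0 - p
    return math.sqrt(p * q / n)


def rate_se(events, exposure):
    """Poisson standard error of a rate (events / exposure).

    Assumes events occur independently; not valid for pooled clusters.
    """
    return math.sqrt(events) / exposure


def observed_frequencies(counts, max_k):
    """Number of units with 0, 1, ..., max_k-1 and '>= max_k' events."""
    tally = Counter(min(c, max_k) for c in counts)
    return [tally.get(k, 0) for k in range(max_k + 1)]


def poisson_expected(counts, max_k):
    """Expected frequencies for the same classes under a Poisson
    distribution whose mean is the sample mean of `counts`."""
    n = len(counts)
    m = sum(counts) / n
    probs = [math.exp(-m) * m ** k / math.factorial(k) for k in range(max_k)]
    probs.append(1.0 - sum(probs))  # tail class: max_k or more events
    return [n * pr for pr in probs]


# Hypothetical data: 30 cases in a random sample of 200 individuals.
p_hat = 30 / 200
se_p = proportion_se(p_hat, 200)
ci_95 = (p_hat - 1.96 * se_p, p_hat + 1.96 * se_p)  # normal approximation

# Hypothetical data: 25 events over 1000 units of exposure.
se_rate = rate_se(25, 1000)

# Hypothetical counts per sampling unit (e.g. per leaf).
counts = [0, 1, 0, 3, 0, 0, 5, 0, 1, 0]
observed = observed_frequencies(counts, 3)
expected = poisson_expected(counts, 3)

print("proportion:", p_hat, "SE:", se_p, "95% CI:", ci_95)
print("rate SE:", se_rate)
print("observed:", observed)
print("expected:", expected)
```

With only ten sampling units, as here, the expected frequencies are far too small for a formal goodness-of-fit test, which illustrates the sample-size pitfall noted above. Re-running the comparison with counts per branch or per tree instead of per leaf would give different frequencies, in line with the point about scale.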
Both the binomial and Poisson distributions are also very important in modelling relationships between response and explanatory variables - where in certain situations they describe the error structure much better than the normal distribution. We would be getting ahead of ourselves to give this too much attention here, although in one veterinary example (cases of mastitis) the interest in the distribution resulted from just such a desire.

What the statisticians say

Armitage & Berry (2002)

Griffiths (2006)

Wikipedia provides sections on discrete probability distributions, ...
|