InfluentialPoints.com Biology, images, analysis, design... 

"It has long been an axiom of mine that the little things are infinitely the most important" 

Confidence intervals of proportions and rates
On this page: Definition & properties; Simple normal approximation to binomial (Wald) interval; Continuity-corrected Wald interval; Adjusted (modified) Wald interval; Finite population correction; Score method binomial interval; Fleiss score interval; Exact binomial (Clopper-Pearson); Exact mid-P interval; Poisson methods for counts & rates; Exact Poisson interval for counts; Poisson interval for incidence rates; Assumptions
Definition and properties
Estimating the confidence interval of a proportion (or count) is a much more controversial operation than doing the same for a mean. This controversy stems from the fact that for many years textbooks have promoted the simple normal approximation binomial interval for all situations other than small samples and very small proportions. Such intervals are easy to understand and to calculate, but make unrealistic assumptions, in particular that the variance is independent of the mean. This interval is now known to be deeply flawed, even when calculated from moderate proportions or surprisingly large samples, and many statisticians say it should not be used under any circumstances.
An alternative to the simple normal approximation interval is an 'exact' interval. An exact interval is best described as one that is derived directly from an appropriate model describing how the statistic in question might vary, in this case the binomial distribution. Such intervals are calculated in a different way from normal approximation intervals, leading to an alternative definition of a confidence interval. In principle, the confidence interval of a proportion or count may be defined in 2 ways:
Unfortunately, in textbooks the term exact confidence interval has been interpreted in several mutually contradictory ways:
Whilst the first viewpoint is correct but by no means sufficient, the second is simplistic and incorrect. The third is almost impossible for a discrete statistic such as a proportion, unless P approaches 0.5 and the sample size (n) approaches infinity. The fourth definition, using conventional P-values, is what many statisticians would propose, although such a definition tends to yield unduly conservative intervals and correspondingly biased inference. The last definition, which covers mid-P intervals, is much less biased, but only when considered over a range of parameters. It also leaves open what range of parameters, or what range of coverages, are reasonable. None of these definitions explicitly allows for how the interval ought to be located about the proportion. In fact the correct definition of exact is simply that exact intervals for a proportion use the binomial distribution to estimate probabilities, with the implicit assumption that the binomial model is the correct model to describe how the sample was gathered.
We also need to clarify the meaning of mid-P intervals. There are both approximate mid-P intervals and exact mid-P intervals. Approximate mid-P intervals are those, like the normal approximation and score intervals, that do not have a continuity correction. They tend not to be too conservative, but can be much too liberal. Exact mid-P intervals come much closer to having, on average, a coverage which approaches 1−α, and have been proposed as the new 'gold standard'.
We start below with the (now discredited) simple normal approximation, not because we advocate its use, but because it provides the basis for a newer approximate method, the adjusted Wald interval. This is increasingly being recommended as a replacement for the simple normal interval. We then give the formulae for two score intervals. These take into account the dependence of the variance on the mean, but still assume a normal distribution.
Lastly we consider the two main exact intervals: the conventional Clopper-Pearson interval (currently still regarded as the gold standard despite its conservatism), which can be obtained formulaically, and a mid-P exact interval that is readily obtained using R. We provide additional details on properties with each interval.
Normal approximation binomial intervals
Simple normal approximation (Wald interval)
For a large sample (n > 100) and a moderate proportion (0.3 < p < 0.7), the traditional approach was to use p and q (in place of P and Q) to estimate the standard error. The mean-variance relationship was then ignored, enabling the following simple formulation to be used:
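As a minimal sketch, the simple formulation can be written in Python, assuming the usual form p ± z·sqrt(pq/n), where z is the standard normal quantile for the chosen confidence level. The function name and the example counts (81 successes out of 263) are our own illustrative choices.

```python
from math import sqrt
from statistics import NormalDist

def wald_interval(successes, n, conf=0.95):
    """Simple normal approximation (Wald) interval for a proportion.

    Assumes the usual form p +/- z * sqrt(p * q / n); no truncation to
    the 0-1 range is applied, so the defects of the method stay visible.
    """
    p = successes / n
    q = 1 - p
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # about 1.96 for 95%
    se = sqrt(p * q / n)                          # estimated standard error
    return p - z * se, p + z * se

lower, upper = wald_interval(81, 263)   # illustrative counts
collapsed = wald_interval(0, 10)        # zero successes: interval has zero width
```

The second call illustrates one of the defects of this interval: with no successes the estimated standard error is zero, so the interval collapses to a single point.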
Some authorities (for example Cochran
Simple normal intervals are commonly known as 'simple asymptotic' or 'Wald' intervals. Although popular and simple to calculate, they suffer from several important defects.
Continuity-corrected Wald interval
If the sample size lies between about 20 and 100, it was usual to apply a continuity correction by adding a half divided by the sample size to the upper limit, and subtracting a half divided by the sample size from the lower limit. In other words, this correction expands the interval by 1/n. However, some statisticians argue that, although this correction makes no worthwhile difference for large samples, it may cause overcoverage for small samples unless p is close to 0.5. To prevent this, where intervals for proportions are estimated by testing the difference between a parameter and an observed proportion, it is recommended that this correction be omitted if their difference is less than the continuity correction.
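The correction described above can be sketched as follows. This assumes the Wald form p ± z·sqrt(pq/n) with 1/(2n) added to the upper limit and subtracted from the lower limit; the function name and example counts are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

def wald_cc_interval(successes, n, conf=0.95):
    """Wald interval with a continuity correction of 1/(2n) at each end."""
    p = successes / n
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    half_width = z * sqrt(p * (1 - p) / n)
    cc = 1 / (2 * n)          # 'a half divided by the sample size'
    return p - half_width - cc, p + half_width + cc

lower, upper = wald_cc_interval(81, 263)   # illustrative counts
```

The corrected interval is wider than the uncorrected one by exactly 1/n in total, which is why the correction matters little for large samples.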
Whilst these intervals share most of the defects of 'uncorrected'
Adjusted (modified) Wald interval
The adjusted Wald interval was proposed by Agresti & Coull (1998). The estimate of the proportion is first modified to give the Wilson point estimator (p_{W}) thus:
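A minimal sketch of the calculation, assuming the general form of the Agresti & Coull adjustment: p_W = (x + z²/2)/(n + z²), with the Wald formula then applied using p_W and an effective sample size of n + z². For a 95% interval this is close to the familiar 'add two successes and two failures' rule. The function name and example counts are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def adjusted_wald_interval(successes, n, conf=0.95):
    """Adjusted (Agresti-Coull) Wald interval, truncated to the 0-1 range."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    n_adj = n + z**2                          # effective sample size
    p_w = (successes + z**2 / 2) / n_adj      # Wilson point estimator
    half_width = z * sqrt(p_w * (1 - p_w) / n_adj)
    # Truncate limits that overshoot the 0 to 1 range
    return max(0.0, p_w - half_width), min(1.0, p_w + half_width)

lower, upper = adjusted_wald_interval(81, 263)   # illustrative counts
small = adjusted_wald_interval(0, 10)            # lower limit truncated at 0
```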
The modified Wald interval has the same point estimator as the Wilson score interval (see below), and its formulation is simply an approximation to that interval. Not surprisingly it has similar properties to the Wilson score interval (and has even been claimed to have improved properties in some respects), and has been recommended by several authorities when n > 40. It is always preferable to the simple Wald interval, and is very simple to calculate. Note, however, that it does have some undesirable properties when p is close to zero or 1, and limits falling outside the 0 to 1 range must be truncated.
Finite population correction
The formulations above all assume that the sample size is very small compared to the total population size. Sometimes this may not be the case as, for example, when carrying out ecological studies on endangered species or breeds. If the sample comprises more than 20% of the population, it is necessary to apply the finite population correction. This is given by multiplying the variance by the correction factor (1 minus the proportion of the population sampled), which is equivalent to multiplying the standard error by the square root of that factor.
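The finite population correction can be sketched for the Wald interval as follows. We assume the common form in which the variance is scaled by (1 − n/N), so that the standard error is scaled by sqrt(1 − n/N); some texts use (N − n)/(N − 1) instead. The function name and the example figures are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def wald_fpc_interval(successes, n, population_size, conf=0.95):
    """Wald interval with a finite population correction.

    Assumes the variance is multiplied by (1 - n/N), i.e. the standard
    error is multiplied by sqrt(1 - n/N).
    """
    p = successes / n
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    sampled_fraction = n / population_size
    se = sqrt(p * (1 - p) / n) * sqrt(1 - sampled_fraction)
    return p - z * se, p + z * se

# Hypothetical survey: 25 of 50 animals sampled from a population of 100
lower, upper = wald_fpc_interval(25, 50, 100)
```

Because half of the population has been sampled in this example, the interval is appreciably narrower than the uncorrected Wald interval for the same counts.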
This correction is usually applied to the Wald interval, so not surprisingly these intervals share the properties of simple normal
Score method binomial intervals
These are known as score methods because the central component of calculating the interval involves carrying out a score test. A score test is a particular type of parametric test that can be formulated in situations where the variability is difficult to estimate. Score methods are appropriate for any proportion providing n is large or, more precisely, providing PQn is greater than five. They are equivalent to an unequal variance normal approximation test inversion, without a t-correction. The limits are obtained by a quadratic method, not graphically.
Wilson score interval
Fleiss score interval
This is the same as the Wilson score interval but includes a continuity correction.
Given their rather complex formulation, these methods have been little used in the past, although in recent years they have come into favour as they are much more accurate than the simple normal approximation, providing the sample size is large. Although not immediately obvious from the formula, this improved accuracy arises because the score method allows for the fact that the variance (PQ/n) is not homogeneous. For large n and non-extreme P the properties of Wilson score intervals approach those of mid-P exact intervals, and Fleiss intervals approach those of Clopper-Pearson intervals (see below). These score intervals do not collapse or overshoot, are located reasonably, and have good coverage properties, whichever of an average or a minimum 95% coverage you prefer. Fleiss intervals seldom have coverage below 95%.
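Both score intervals can be sketched in closed form. The expressions below are the quadratic solutions as they are usually written in the literature (for example in Newcombe's comparison of methods), which we assume match the intended formulation; the function names and example counts are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def wilson_interval(x, n, conf=0.95):
    """Wilson score interval (no continuity correction)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    p = x / n
    q = 1 - p
    centre = 2 * n * p + z**2
    spread = z * sqrt(z**2 + 4 * n * p * q)
    denom = 2 * (n + z**2)
    return (centre - spread) / denom, (centre + spread) / denom

def fleiss_interval(x, n, conf=0.95):
    """Score interval with continuity correction (Fleiss)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    p = x / n
    q = 1 - p
    denom = 2 * (n + z**2)
    if x == 0:
        lower = 0.0          # by convention when no successes are observed
    else:
        lower = (2 * n * p + z**2 - 1
                 - z * sqrt(z**2 - 2 - 1 / n + 4 * p * (n * q + 1))) / denom
    if x == n:
        upper = 1.0          # by convention when no failures are observed
    else:
        upper = (2 * n * p + z**2 + 1
                 + z * sqrt(z**2 + 2 - 1 / n + 4 * p * (n * q - 1))) / denom
    return max(0.0, lower), min(1.0, upper)

wilson = wilson_interval(81, 263)   # illustrative counts
fleiss = fleiss_interval(81, 263)   # slightly wider than the Wilson interval
```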
Exact binomial intervals
Exact Clopper-Pearson interval (conventional P)
Exact binomial intervals were originally obtained by inversion of the equal-tail binomial test, as suggested by Clopper & Pearson (1934), using conventional P-values. Until fairly recently, this meant most people had to use tables generated by mathematicians, which gave the confidence intervals for all proportions for small sample sizes. These tables can be found in a number of statistical texts, for example table A4 in Conover.
Alternatively there are formulaic methods derived from a mathematical relationship between the binomial distribution and various continuous distributions, including the beta distribution and the F-distribution. Despite their usefulness, they are still only given in a few statistical textbooks. We give the formulation provided by Bart et al.
Exact Clopper-Pearson binomial intervals using conventional P-values have a minimum coverage close to 95% and very good location. Be aware that, although this provides an exact interval, it overcovers: if variation is purely binomial its coverage is at least 95%, but its maximum coverage can be excessively high. However, exact intervals neither overshoot the 0 to 1 range nor are liable to produce intervals of zero width (collapse).
Exact mid-P interval
Although many statisticians still consider the Clopper-Pearson interval as providing the 'gold standard' for the binomial proportion interval, many now advocate that the mid-P exact interval should take over this role. However, bear in mind that, because we are dealing with discrete variables, no interval gives precisely the correct (nominal) coverage.
As far as we know, a mid-P exact interval can only be obtained by test inversion, not by any formulaic method. Although virtually any confidence interval can be obtained by test inversion, because it is relatively computer-intensive, this method is generally confined to intervals that cannot be obtained any other way, or for which there is no 'suitable approximation'. Test inversion intervals work under the definition that a confidence interval about an observed statistic encloses the range of parameter values which, when tested against that observed statistic, would not be rejected.
Mid-P exact binomial intervals also have good location properties, and provide close to a mean coverage of 95%, although there can be considerable variation where P approaches 0 or 1. Unfortunately, being less conventional, few packages calculate mid-P exact binomial intervals. We explain how test inversion is carried out in the worked example below.
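To make the idea of test inversion concrete, here is a minimal sketch that finds both the Clopper-Pearson and the mid-P limits by bisection on the binomial tail probabilities. It is our own illustration in Python rather than the R approach mentioned earlier, and the function names are hypothetical. The only difference between the two intervals is whether the probability of the observed count is included in full (conventional P) or by half (mid-P).

```python
from math import comb

def _pmf(k, n, p):
    """Binomial probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def _tail_ge(x, n, p, mid):
    """P(X >= x); only half of P(X = x) is counted when mid is True."""
    tail = sum(_pmf(k, n, p) for k in range(x, n + 1))
    return tail - 0.5 * _pmf(x, n, p) if mid else tail

def _tail_le(x, n, p, mid):
    """P(X <= x); only half of P(X = x) is counted when mid is True."""
    tail = sum(_pmf(k, n, p) for k in range(0, x + 1))
    return tail - 0.5 * _pmf(x, n, p) if mid else tail

def _root(h, steps=60):
    """Root on [0, 1] of a function h that increases with p (bisection)."""
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        m = (lo + hi) / 2
        if h(m) < 0:
            lo = m
        else:
            hi = m
    return (lo + hi) / 2

def exact_interval(x, n, conf=0.95, mid=False):
    """Equal-tail exact interval by test inversion.

    mid=False gives the Clopper-Pearson (conventional P) interval,
    mid=True gives the mid-P exact interval.
    """
    half_alpha = (1 - conf) / 2
    # Lower limit: smallest P for which the upper tail is not below alpha/2
    lower = 0.0 if x == 0 else _root(
        lambda p: _tail_ge(x, n, p, mid) - half_alpha)
    # Upper limit: largest P for which the lower tail is not below alpha/2
    upper = 1.0 if x == n else _root(
        lambda p: half_alpha - _tail_le(x, n, p, mid))
    return lower, upper

clopper_pearson = exact_interval(81, 263)        # illustrative counts
mid_p = exact_interval(81, 263, mid=True)        # lies inside the interval above
```

Summing the tail directly is adequate for the modest sample sizes at which exact intervals matter most; for very large n a beta or F-distribution quantile would be the more efficient route to the Clopper-Pearson limits.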
Poisson methods for counts and rates
Normal approximation Poisson interval for count
The normal approximation interval for a count (f) is obtained simply by using its square root as an estimate of the standard error:
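In code this amounts to f ± z·sqrt(f). The sketch below assumes that form; the function name and the example count are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def poisson_normal_interval(count, conf=0.95):
    """Normal approximation interval for a Poisson count f: f +/- z*sqrt(f)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    se = sqrt(count)      # the standard error of a Poisson count is sqrt(f)
    return count - z * se, count + z * se

lower, upper = poisson_normal_interval(100)   # illustrative count
```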
Normal approximation Poisson intervals share many of the properties of simple normal intervals for a proportion.
Exact Poisson intervals for counts
Again, until relatively recently, the most frequently used method was to use tables generated by mathematicians which gave exact Poisson confidence intervals for all small counts. However, the exact confidence interval for a count (Y) is readily obtained from the relationship between the chi-square distribution and the Poisson distribution. The appropriate degrees of freedom must be calculated separately for the upper and lower limits (remember we use the same system as R, so χ^{2}_{0.025} is the chi-square quantile for an upper tail probability of 0.025; this is the opposite way to that in which statistical tables are usually done). Like Clopper-Pearson intervals, this gives a conventional-P interval, and is therefore conservative.
Poisson intervals for incidence rates
An incidence rate is obtained by dividing the number of events by person-time-at-risk. A 95% CI for the incidence rate in a cohort study can therefore be obtained by treating the numerator as a Poisson variate, working out the upper and lower confidence limits for the number of events, and then dividing each by the person-time-at-risk, which is regarded as fixed and measured without error. The same approach is sometimes used in descriptive studies, substituting mid-year population size for person-time-at-risk. However, the interval is likely to be unreliable because mid-year population size is certainly not measured without error. Note that if Poisson intervals are fitted to a proportion (rather than a rate) they are unnecessarily conservative, since they are wider by a factor of approximately 1/q than those based on the binomial distribution.
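The exact count limits and their conversion to an incidence rate can be sketched as follows. To keep the example free of external libraries, the limits are found by bisection on the Poisson tail probabilities rather than from chi-square quantiles; the two routes give the same conventional-P limits. The function names and the example figures (10 events in 2000 person-years) are hypothetical, and the direct summation is only intended for small to moderate counts.

```python
from math import exp, sqrt

def _poisson_cdf(y, mu):
    """P(X <= y) for a Poisson variate with mean mu (y < 0 gives 0)."""
    if y < 0:
        return 0.0
    term = total = exp(-mu)
    for k in range(1, y + 1):
        term *= mu / k
        total += term
    return total

def _root(h, lo, hi, steps=80):
    """Root on [lo, hi] of a function h that increases with mu (bisection)."""
    for _ in range(steps):
        m = (lo + hi) / 2
        if h(m) < 0:
            lo = m
        else:
            hi = m
    return (lo + hi) / 2

def exact_poisson_count_limits(y, conf=0.95):
    """Exact (conventional-P) confidence limits for a Poisson count y."""
    half_alpha = (1 - conf) / 2
    top = y + 10 * sqrt(y + 1) + 10     # generous upper bracket for the search
    # Lower limit: mean at which P(X >= y) equals alpha/2
    lower = 0.0 if y == 0 else _root(
        lambda mu: (1 - _poisson_cdf(y - 1, mu)) - half_alpha, 0.0, top)
    # Upper limit: mean at which P(X <= y) equals alpha/2
    upper = _root(lambda mu: half_alpha - _poisson_cdf(y, mu), 0.0, top)
    return lower, upper

def incidence_rate_interval(events, person_time, conf=0.95):
    """Rate and its limits, treating person-time as fixed and error-free."""
    lower, upper = exact_poisson_count_limits(events, conf)
    return events / person_time, lower / person_time, upper / person_time

count_limits = exact_poisson_count_limits(10)
rate, rate_lower, rate_upper = incidence_rate_interval(10, 2000)
```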
Assumptions
The key assumption here is that outcomes are independent. In a survey this assumption is taken to be met if the sample is obtained by simple random sampling, in other words if members of the sample have been drawn independently with equal probabilities. In an experimental situation, units should be allocated randomly to treatment and observations must be independent. The latter can be difficult to ensure and we return to this point in the next
If you have used stratified random sampling to obtain your overall proportion, a normal approximation confidence interval is estimated in a similar way as for a single random sample, except that the standard error is weighted according to the proportion each stratum makes up of the total.
If you have used cluster sampling to obtain your overall proportion, the approach is quite different. The standard error is estimated from the variability between sample proportions in the same way as the standard error of a mean is estimated. If the proportions lie outside the range 0.3 to 0.7 an arcsin transformation should be used. These methods are detailed in the More Information page on Sampling Methods in
Use of any of the normal approximation methods assumes that the proportion is distributed normally. This is commonly (but wrongly) taken to be the case if pqn is greater Cochran Newcombe
