"It has long been an axiom of mine that the little things are infinitely the most important" (Sherlock Holmes)




Capture-recapture methodology

for estimating total number of cases


If you have two sources of cases, and only some of the cases appear on both lists, you can estimate the total number of cases (including those missed by both sources) by using the capture-recapture methodology.

The principle of the method is simple. Say your two sources for assessing the number of cases are (a) hospital records, listing n1 cases, and (b) the records of a charity organization, listing n2 cases. The method below assumes these are two random samples of the same population of cases. Provided this assumption is reasonable, we then look at the number of cases common to both lists. The total 'population size' of cases, N, is then given by:

Estimated total number of cases (N) = (n1 × n2) / m

or after correcting for bias:

Estimated total number of cases (N) = {(n1 + 1) × (n2 + 1)} / (m + 1) - 1

  • n1 is the number of cases from source 1
  • n2 is the number of cases from source 2
  • m is the number of cases that appear on both lists
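
As an illustration, the two estimates can be computed directly from a pair of case lists. The sketch below is a minimal example rather than a definitive implementation: the function name `capture_recapture` and the case identifiers are hypothetical, and it assumes each case carries an identifier that matches exactly across the two sources. Here m denotes the number of cases found on both lists, so the simple estimate is n1 × n2 / m and the bias-corrected estimate is (n1 + 1)(n2 + 1) / (m + 1) - 1.

```python
def capture_recapture(source_1, source_2):
    """Estimate the total number of cases from two lists of case identifiers.

    Returns (simple_estimate, bias_corrected_estimate).
    Assumes identifiers match exactly between the two sources.
    """
    cases_1 = set(source_1)       # drop duplicate entries within a source
    cases_2 = set(source_2)
    n1 = len(cases_1)             # cases from source 1
    n2 = len(cases_2)             # cases from source 2
    m = len(cases_1 & cases_2)    # cases that appear on both lists

    # The simple estimate is undefined when the lists share no cases.
    simple = n1 * n2 / m if m > 0 else None
    # The bias-corrected estimate stays finite even when m is zero.
    corrected = (n1 + 1) * (n2 + 1) / (m + 1) - 1
    return simple, corrected


# Hypothetical identifiers, invented purely for illustration.
hospital = ["A", "B", "C", "D", "E", "F"]
charity = ["D", "E", "F", "G"]

simple, corrected = capture_recapture(hospital, charity)
print(simple, corrected)  # 8.0 7.75
```

With six hospital cases, four charity cases and three cases in common, the simple estimate is 8 and the corrected estimate is 7.75, against the 7 distinct cases actually observed across the two lists.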

Notice that, even where samples are random, this estimate needs to be corrected for bias (we consider the problems of biased estimators at many points in this course). Unfortunately, it is highly debatable whether the 'random' assumption made above is ever valid. However, on rare occasions the lists may approximate to random samples, and the method will then give you a better estimate than can be obtained any other way.

The rationale behind this approach is covered in the Related Topic on Mark-release-recapture for estimating population size.
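
In brief, and only as a sketch under the random-sampling assumption: if source 1 is a random sample of all N cases, the proportion of source 2's cases that also appear in source 1 should be roughly the proportion of all cases that source 1 captured. Writing m (a symbol introduced here for illustration) for the number of cases on both lists:

```latex
\frac{m}{n_2} \approx \frac{n_1}{N}
\quad\Longrightarrow\quad
\hat{N} = \frac{n_1 \, n_2}{m}
```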