Biology, images, analysis, design...
"It has long been an axiom of mine that the little things are infinitely the most important" (Sherlock Holmes)




Number of cases

The measures we look at in this section are all measures of morbidity, the amount of disease in the population. The simplest way to measure disease frequency is to count the individuals that are infected or diseased. If you count all the cases present at a given point in time, they are known as prevalent cases. If you only count the number of new cases, they are known as incident cases. The progression of an epidemic is most commonly followed by recording the number of incident cases of the disease in question. A plot of the number of incident cases against time is known as an epidemic curve.

Obtaining an accurate count of the number of cases can sometimes be very difficult because of the problem of measurement error. If you are using a diagnostic test to identify cases, the accuracy of the test is commonly expressed as its sensitivity and specificity. The sensitivity of a test is the proportion of true positive cases that are correctly identified as positive. The specificity of a test is the proportion of true negative cases that are correctly identified as negative. We look at specificity and sensitivity in more depth in Unit 2. These criteria apply whatever method you use to identify events - whether with a test, by clinical symptoms or by government records. Very often an apparent change in the number of cases results from a change in the proportion of cases reported (that is, the sensitivity of the reporting), rather than from a change in the actual number of cases.
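As a rough illustration of these definitions, the sketch below computes sensitivity and specificity from counts of correctly and incorrectly classified individuals, and then shows how the number of apparent cases depends on both. The function names and all the counts are hypothetical, chosen only for the example.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Proportion of true positive cases correctly identified as positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Proportion of true negative cases correctly identified as negative."""
    return true_neg / (true_neg + false_pos)


def apparent_cases(true_cases: int, non_cases: int, sens: float, spec: float) -> float:
    """Expected number of individuals classified as cases.

    True cases are detected with probability `sens`; non-cases are
    wrongly counted as cases with probability (1 - spec).
    """
    return true_cases * sens + non_cases * (1 - spec)


# Hypothetical validation data: 90 of 100 diseased individuals test positive,
# and 950 of 1000 healthy individuals test negative.
sens = sensitivity(true_pos=90, false_neg=10)
spec = specificity(true_neg=950, false_pos=50)
print(sens, spec)

# The same 100 true cases look like a different count if reporting
# sensitivity changes, even though nothing has changed in the population.
print(apparent_cases(100, 0, sens=0.5, spec=1.0))
print(apparent_cases(100, 0, sens=0.8, spec=1.0))
```

The last two lines mimic a reporting system that improves from catching half of the cases to catching four fifths of them: the apparent count rises although the true number of cases is unchanged.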

Sometimes you may get more than one source providing information on the number of cases. If you have two sources, you would expect them to record the same cases. But what if only some of the cases appear on both lists? You can get an estimate of the total number of cases (including those missed by both sources) by using the capture-recapture methodology that is used by ecologists for estimating the population size of mobile organisms. See the related topic on Capture-recapture methodology for more details.
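For two lists, the simplest capture-recapture estimate is the Lincoln-Petersen estimator, with Chapman's version often preferred when the overlap is small. The sketch below assumes that this two-source estimator is the one intended here, that the two sources record cases independently, and that cases can be matched reliably between lists. The function names and the counts are hypothetical.

```python
def lincoln_petersen(n1: int, n2: int, both: int) -> float:
    """Estimate total cases from two overlapping lists.

    n1, n2: number of cases recorded by source 1 and source 2
    both:   number of cases that appear on both lists
    """
    if both <= 0:
        raise ValueError("the estimator is undefined when no cases overlap")
    return n1 * n2 / both


def chapman(n1: int, n2: int, both: int) -> float:
    """Chapman's modification, which is less biased for small overlaps."""
    return (n1 + 1) * (n2 + 1) / (both + 1) - 1


# Hypothetical example: source 1 lists 60 cases, source 2 lists 50,
# and 30 cases appear on both lists.
print(lincoln_petersen(60, 50, 30))  # 100.0
print(chapman(60, 50, 30))
```

In this example 80 distinct cases were actually recorded (60 + 50 - 30), so the estimate suggests that about 20 cases were missed by both sources.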

Although information on the number of cases is useful for some purposes, we are often more interested in expressing the counts as a fraction of the total number of individuals capable of becoming infected. This is done by calculating either the prevalence or incidence.




Prevalence

The proportion of a population affected by a disease at a given point in time is known as the prevalence.

$$\text{Prevalence} = \frac{\text{Number of cases at a point in time}}{\text{Total population size}}$$

We can view this as the probability of an individual in that population having the disease at that point in time. By specifying a single point in time, we have strictly speaking given the definition for the point prevalence. If we use the number of cases over a period of time as the numerator, it is termed a period prevalence. If we use the number of cases experienced over a lifetime, it is called a lifetime prevalence. Prevalence can range from 0 to 1, or 0-100 if expressed as a percentage. Note that the number of cases used to work out prevalence includes both old and new cases.

What we have defined above is the individual-level prevalence. But, especially in veterinary and wildlife research, we can also calculate a herd-level prevalence. This is defined as the number of infected herds divided by the total number of herds. Usually a herd is taken as 'infected' if any animal in the herd is affected, although sometimes a threshold of say 1% or 5% is set for a herd to be considered infected.
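The difference between individual-level and herd-level prevalence can be sketched as follows. The herd data are invented for illustration, and the rule that a herd counts as infected when its within-herd prevalence is at or above the threshold (rather than strictly above it) is an assumption.

```python
def prevalence(cases: int, population: int) -> float:
    """Individual-level prevalence: cases divided by total population size."""
    return cases / population


def herd_level_prevalence(herds, threshold: float = 0.0) -> float:
    """Proportion of herds classed as infected.

    herds:     list of (infected_animals, total_animals) pairs
    threshold: minimum within-herd prevalence for a herd to count as
               infected; 0.0 means any infected animal is enough.
    """
    infected_herds = sum(
        1
        for infected, total in herds
        if infected > 0 and infected / total >= threshold
    )
    return infected_herds / len(herds)


# Hypothetical survey of four herds: (infected animals, herd size).
herds = [(0, 40), (1, 50), (3, 30), (0, 25)]

total_infected = sum(infected for infected, _ in herds)
total_animals = sum(total for _, total in herds)

print(prevalence(total_infected, total_animals))      # individual level, 4/145
print(herd_level_prevalence(herds))                   # any infected animal: 0.5
print(herd_level_prevalence(herds, threshold=0.05))   # 5% threshold: 0.25
```

Note how the herd-level figure depends strongly on the definition of an infected herd, while the individual-level prevalence here is under 3%.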

Although the term prevalence is usually taken to mean the proportion infected with a disease, it can be used to mean the proportion with any other characteristic. Seroprevalence is the proportion of individuals seropositive for a particular antigen - this is not usually the same as 'infected', since animals may remain seropositive long after active infection has cleared up.

Several other terms are commonly used to describe prevalence, the most common of which are infection rate and percentage parasitism. Infection rate may be used to indicate the proportion of infected insect disease vectors in a sample - such as the sporozoite infection rate in mosquitoes. Percentage parasitism, or parasitism rate, is often used to indicate the proportion of parasitized individuals in a sample. Neither of these is a 'rate' in the true sense of the word (in other words, a change per unit time), and prevalence is the preferred term. A term sometimes used to describe the proportion of insects testing positive for a virus, at a point in time, is infection prevalence.

These measures work reasonably well for rare diseases. More sophisticated models allow for the fact that not all the population are susceptible - either because they are immune, or have already been infected.



Cumulative incidence

The number of new cases of a disease arising over a period, expressed as a proportion of the population at the start of that period, is known as the cumulative incidence. It is also termed the incidence risk (or incidence proportion), since it can be viewed as the risk of an individual developing the disease over that period.

$$\text{Cumulative incidence} = \frac{\text{Number of new cases over a period of time}}{\text{Population size at start of period}}$$

For example, if 2 out of 5 previously uninfected individuals become infected in ten days, the cumulative incidence for that period is 0.4 (or 40%). It is assumed that all individuals are disease-free at the start of the period. As with prevalence, the cumulative incidence is a proportion - and can therefore only range from 0 to 1 (or between 0 and 100%).

Provided the risk stays constant, we can work out what the cumulative incidence would be over longer periods. However, it is not just a matter of multiplying up. If it were, you would rapidly get cumulative incidences greater than one, which is impossible. Once an animal is infected, it is no longer at risk, so the pool of susceptible animals shrinks. Hence, over time, the cumulative incidence increases in a curvilinear way, rising steeply at first and then levelling off, but never quite reaching one. A formula which yields this pattern of cumulative incidence, for a longer (or shorter) time period, is given below:

$$C_1 = 1 - (1 - C_2)^{t_1 / t_2}$$
  • C1 is the cumulative incidence you wish to estimate
  • C2 is the cumulative incidence you have measured
  • t1 is the time period for which you wish to estimate the incidence,
  • t2 is the time period over which you measured the incidence.

If, for example, we wanted to know the cumulative incidence for 60 days, based on an incidence of 0.4 per 10 days, it works out at 0.953 per 60 days - in other words most of the individuals would become infected. The cumulative incidence for only one day would of course be much smaller - it works out at approximately 0.050 per day.
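The rescaling formula is easy to check numerically. The sketch below is a direct transcription of it, assuming (as the text does) that the risk stays constant over time; the function name is our own.

```python
def rescale_cumulative_incidence(c2: float, t2: float, t1: float) -> float:
    """Convert a cumulative incidence measured over t2 to one over t1.

    Assumes constant risk: the proportion remaining uninfected over t2,
    (1 - c2), is compounded over t1 / t2 periods.
    """
    return 1 - (1 - c2) ** (t1 / t2)


measured = 0.4  # cumulative incidence over 10 days, from the example above

# Naive multiplication gives an impossible value for 60 days.
print(measured * 60 / 10)  # 2.4, greater than one

# The formula stays within the range 0 to 1.
print(round(rescale_cumulative_incidence(measured, t2=10, t1=60), 3))  # 0.953
print(rescale_cumulative_incidence(measured, t2=10, t1=1))  # just under 0.05
```

Rescaling to the same period returns the measured value unchanged, which is a useful sanity check on any implementation.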

Rather than working out the proportion of the population that becomes infected, we can also work out the proportion that remains uninfected. This is termed the cumulative survival - a term that is also used when the event of interest is not infection, but death - as you can see in the More Information Page on Measures of Population Change.



Incidence rate

The incidence rate or incidence density measures the rate at which new cases of disease develop over time. Instead of dividing the number of new cases by the number at risk, we divide by the total time at risk for all individuals.

$$\text{Incidence rate} = \frac{\text{Number of new cases over a period of time}}{\text{Total time at risk for all individuals}}$$

The total time at risk is best understood with a simple example. Say we start with five animals - one acquires disease after two days, another after five days, and the remaining three are disease free - up to the end of the study (say 10 days). To get the total time at risk we add together the number of days each individual is at risk - namely 2+5+10+10+10 = 37. The incidence rate is then two cases during a total of 37 days at risk, which is 2/37, or 0.054 per day. This is the exact method to calculate time at risk.

An approximate method of estimating the incidence rate is to divide the number of cases by the average number at risk multiplied by the time period. This is done when the exact time of infection is unknown.

$$\text{Incidence rate} = \frac{\text{Number of new cases over a period of time}}{\text{Average number at risk} \times \text{time period}}$$
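Both methods can be compared using the five-animal example above. In this sketch the average number at risk is taken as the mean of the numbers at risk at the start and end of the period, which is one common choice but an assumption here; the function names are hypothetical.

```python
def incidence_rate_exact(times_at_risk, new_cases: int) -> float:
    """New cases divided by the summed time at risk of every individual."""
    return new_cases / sum(times_at_risk)


def incidence_rate_approx(new_cases: int, at_risk_start: int,
                          at_risk_end: int, period: float) -> float:
    """New cases divided by (average number at risk x time period).

    The average is assumed to be the mean of the start and end counts.
    """
    average_at_risk = (at_risk_start + at_risk_end) / 2
    return new_cases / (average_at_risk * period)


# Five animals followed for 10 days: two become diseased (after 2 and 5 days),
# three remain disease free throughout.
days_at_risk = [2, 5, 10, 10, 10]

print(round(incidence_rate_exact(days_at_risk, new_cases=2), 3))  # 0.054 per day

# Approximate method: 5 at risk at the start, 3 at the end, over 10 days.
print(incidence_rate_approx(2, at_risk_start=5, at_risk_end=3, period=10))  # 0.05 per day
```

The two estimates differ because the approximate method effectively assumes that cases occur, on average, halfway through the period.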

Since the incidence rate is a rate (that is per unit time) and not a proportion, it can exceed one. Note that the incidence rate is always somewhat larger than cumulative incidence - considerably so if the rate is high. This is because the denominator is always smaller, being the average number of animals at risk during the period, rather than the number alive at the start of the period. However, as the rate decreases, the difference between cumulative incidence and the incidence rate also decreases. So for values of the incidence rate less than 0.1 the two measures will be very similar.

When you calculate the incidence rate you are estimating what is known as the instantaneous incidence rate. This is a theoretical measure of the risk of occurrence of disease - which can be viewed as the potential of disease occurrence per unit of time. You can estimate the cumulative incidence from the instantaneous incidence rate, and vice versa, using the relationships below:

$$C_i = 1 - e^{-I_i}$$

$$I_i = -\log_e(1 - C_i)$$
  • Ci is the Cumulative incidence,
  • Ii is the Instantaneous incidence rate, and
  • e is the 'base' of 'natural logarithms', or approximately 2.718
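A minimal sketch of the two conversions, assuming both quantities refer to the same unit period (the function names are our own):

```python
import math


def cumulative_from_rate(rate: float) -> float:
    """Cumulative incidence over one unit period, given the instantaneous rate."""
    return 1 - math.exp(-rate)


def rate_from_cumulative(cumulative: float) -> float:
    """Instantaneous incidence rate, given the cumulative incidence (< 1)."""
    return -math.log(1 - cumulative)


# A cumulative incidence of 0.4 per period corresponds to a rate of about 0.511.
rate = rate_from_cumulative(0.4)
print(round(rate, 3))  # 0.511

# Converting back recovers the original cumulative incidence.
print(cumulative_from_rate(rate))

# For small rates the two measures nearly coincide; for large rates they do not.
print(round(cumulative_from_rate(0.05), 4))  # 0.0488
print(round(cumulative_from_rate(1.0), 3))   # 0.632
```

The rate is always the larger of the two, and it can exceed one, whereas the cumulative incidence cannot.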

We look at instantaneous rates in more depth in the Related topic on finite and instantaneous rates in the More Information Page on Measures of Population Change.

A term used in relation to disease incidence, especially for vector-borne diseases, is challenge. This is usually defined as an index of the probable disease incidence - which is derived from the vector density (or biting rate), the prevalence of the disease in the vector, and other relevant factors. Ideally it should be possible to show a direct relationship between challenge and incidence of the disease. This has been done, for example, between trypanosomosis challenge (apparent density of tsetse flies × trypanosome prevalence) and the daily probability of infection of cattle (as measured by the number of drug treatments required).



Duration of infection

Prevalence and incidence rate are related to each other through the duration of infection of a disease - defined as the mean time until death, or recovery. Say you observe an increase in point prevalence of the disease. This could have come about in two ways.

  1. The incidence rate may have increased, so there are more new cases occurring per unit time;

  2. The duration of infection may have increased, so that infected individuals remain in the population for longer.

Consider the current situation with HIV/AIDS. When there were no effective treatments available, the duration of infection was relatively short because cases died. In those countries where antiretroviral drugs are readily available, the duration of infection has now increased. This will tend to increase the prevalence of HIV/AIDS, regardless of trends in the incidence of the disease.

If we assume a stable population, together with a constant prevalence and constant duration of infection, we can estimate the incidence rate from the other two parameters:

$$I_r = \frac{P_r}{(1 - P_r)\, D_i}$$

If prevalence is small (<0.1), this simplifies to:

$$I_r \approx \frac{P_r}{D_i}$$

  • Ir is the Incidence rate,
  • Pr is the Prevalence, and
  • Di is the Duration of infection.
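The sketch below compares the full relationship with the small-prevalence approximation (incidence rate roughly equal to prevalence divided by duration). It assumes the stable population, constant prevalence and constant duration stated above; the function names and the numerical values are hypothetical.

```python
def incidence_rate_from_prevalence(prevalence: float, duration: float) -> float:
    """Incidence rate from prevalence and mean duration of infection.

    The rate is per unit of whatever time unit `duration` is measured in.
    """
    return prevalence / ((1 - prevalence) * duration)


def incidence_rate_small_prevalence(prevalence: float, duration: float) -> float:
    """Approximation for small prevalence: prevalence / duration."""
    return prevalence / duration


duration_days = 20  # hypothetical mean duration of infection

# At low prevalence the two versions agree closely.
print(incidence_rate_from_prevalence(0.05, duration_days))
print(incidence_rate_small_prevalence(0.05, duration_days))

# At high prevalence the approximation underestimates the rate badly.
print(incidence_rate_from_prevalence(0.5, duration_days))
print(incidence_rate_small_prevalence(0.5, duration_days))
```

With a prevalence of 0.5 the approximation gives only half the value of the full formula, which is why it is restricted to small prevalences.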

However, such assumptions are rarely justified. If we assume that incidence rates are stable over time and that the prevalence in young individuals increases linearly with age, we can estimate the incidence from the slope of the regression line of prevalence against age. Adjustments can be made to this simple model to allow for differential mortality of infected and uninfected individuals. Further adjustments have also been proposed which allow for a changing prevalence.


Intensity/severity of infection

The term mean intensity is usually defined as the mean number of parasites per infected host. The more general term severity is used for any measure of how serious the infection is. In terms of disease frequency, and monitoring the course of an epidemic, severity affects the number of cases that are noticed - whereas intensity may affect the number diagnosed. Non-severe, low intensity cases can be important in predicting the course of an epidemic - either as carriers, or because they become immune.

Related topics:

Standardized Event Rates

Capture-recapture methodology