 
Sensitivity, specificity and predictive values
Worked example
Optical density  True status
                 Infected  Uninfected
06–10            0         3
11–15            1         9
16–20            4         15
21–25            8         7
26–30            14        2
31–35            9         1
36–40            7         0
41–45            2         0
Totals           45        37
The data given here show the ELISA optical density readings for a total of 82 individuals. Of these, the gold standard test indicated that 45 were infected whilst 37 were uninfected. The manufacturer of the ELISA test recommended a cutoff value of 20 optical density units: in other words, any individual with a reading below or equal to 20 should be taken as negative; any individual with a reading above 20 should be taken as positive.
We will first follow the recommendations of the manufacturer and use a cutoff value of 20. This would give the following values for specificity, sensitivity and predictive values:

Test result  True status (from gold standard)
             Infected  Uninfected  Total
+            40        10          50
−            5         27          32
Totals       45        37          82
Sensitivity = 40/45 = 0.89

Specificity = 27/37 = 0.73

Positive predictive value = 40/50 = 0.80

Negative predictive value = 27/32 = 0.84
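For readers who wish to check the arithmetic, the four measures above can be computed from the 2×2 table in a few lines of Python (a sketch for illustration; the counts are those from the worked example with the cutoff at 20):

```python
# Counts from the worked example (cutoff = 20 optical density units)
tp, fp = 40, 10   # test positive: infected, uninfected
fn, tn = 5, 27    # test negative: infected, uninfected

sensitivity = tp / (tp + fn)   # 40/45: proportion of infected detected
specificity = tn / (tn + fp)   # 27/37: proportion of uninfected cleared
ppv = tp / (tp + fp)           # 40/50: P(infected | test positive)
npv = tn / (tn + fn)           # 27/32: P(uninfected | test negative)

print(round(sensitivity, 2), round(specificity, 2),
      round(ppv, 2), round(npv, 2))   # 0.89 0.73 0.8 0.84
```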


Overall accuracy, Cohen's kappa and likelihood ratios
Worked example
Using the same data as above, we now calculate the overall accuracy for this diagnostic test:
Test result  True status (from gold standard)  
Infected  Uninfected  Total 
+  40  10  50 
  5  27  32 
Totals  45  37  82 
Overall accuracy = (40 + 27)/82 = 0.82
This overstates the test's performance, however, because we have not corrected for the agreement expected by chance.
In order to correct the estimate, we first work out expected values for this table assuming there is no association between the test result and the true status. This is done by multiplying the proportions obtained from the column totals by the row totals as below:
Expected values for no association
Test result  True status (from gold standard)
             Infected           Uninfected         Total
+            27.4 (50/82 × 45)  22.6 (50/82 × 37)  50
−            17.6 (32/82 × 45)  14.4 (32/82 × 37)  32
Totals       45                 37                 82
Chance level of agreement (p_{E}) = (27.4 + 14.4)/82 = 0.51
The corrected measure of agreement (kappa) is then given by
κ = (0.82 − 0.51)/(1 − 0.51) = 0.63
Hence in this case, with a kappa value of 0.63, we still have a good level of agreement even after correcting for chance agreement.
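The whole chain above, observed agreement, expected cell counts under no association, and kappa, can be reproduced in a short Python sketch (for illustration, using the worked-example counts):

```python
# Counts from the worked example (cutoff = 20)
tp, fp, fn, tn = 40, 10, 5, 27
n = tp + fp + fn + tn                  # 82 individuals in total

p_obs = (tp + tn) / n                  # observed agreement, (40 + 27)/82

# Expected cell counts under no association: row total x column total / n
e_pos = (tp + fp) * (tp + fn) / n      # expected true positives, 50/82 x 45
e_neg = (fn + tn) * (fp + tn) / n      # expected true negatives, 32/82 x 37
p_exp = (e_pos + e_neg) / n            # chance level of agreement

kappa = (p_obs - p_exp) / (1 - p_exp)  # chance-corrected agreement
print(round(p_obs, 2), round(p_exp, 2), round(kappa, 2))   # 0.82 0.51 0.63
```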
Another useful measure of overall accuracy is the likelihood ratio. We can calculate the positive and negative likelihood ratios from the values for sensitivity and specificity:
Sensitivity  =  0.89 

Specificity  =  0.73 

Positive likelihood ratio = 0.89/(1 − 0.73) = 3.3
Negative likelihood ratio = (1 − 0.89)/0.73 = 0.15
The positive likelihood ratio is well above 1, and the negative likelihood ratio is close to 0, indicating that the test has reasonably good discriminatory power.
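The two ratios follow directly from sensitivity and specificity, as this short Python sketch (for illustration) shows:

```python
# From the worked example above
sensitivity, specificity = 40 / 45, 27 / 37

# LR+ = P(test + | infected) / P(test + | uninfected)
lr_pos = sensitivity / (1 - specificity)
# LR- = P(test - | infected) / P(test - | uninfected)
lr_neg = (1 - sensitivity) / specificity

print(round(lr_pos, 1), round(lr_neg, 2))   # 3.3 0.15
```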


Cutoff values and receiver operating characteristic curves
Worked example
In the example above we used the recommended cutoff value of 20 optical units.
But was this the best value to use?
There are a number of different ways to select the cutoff value for a diagnostic test. We first briefly consider two of the older, more arbitrary methods of doing this. They are based only on the distribution of test values in healthy uninfected individuals.
{Fig. 2}
The distribution of values in healthy individuals is assessed, for example with a histogram.
Any value greater than the 95th percentile of the uninfected individuals is considered abnormal; that percentile is therefore taken as the cutoff value. A related criterion assumes the distribution of known negatives is normal and sets the cutoff at the mean plus twice the standard deviation (note that this is not quite the same thing: the 95th percentile of a normal distribution lies at the mean plus 1.64 standard deviations). In this example this criterion, quite arbitrarily, sets the cutoff at 33.5.
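As an illustration only, the Python sketch below applies both criteria to the uninfected group, using the interval midpoints as a stand-in for the raw readings (which are not given here); because of this grouping the results are only approximate and need not reproduce the 33.5 quoted above, which would have come from the raw data.

```python
# Illustrative only: interval midpoints stand in for the raw readings
# (3 uninfected readings taken as 8, 9 as 13, 15 as 18, and so on)
from statistics import mean, stdev

midpoints = [8] * 3 + [13] * 9 + [18] * 15 + [23] * 7 + [28] * 2 + [33] * 1

# Percentile criterion: cutoff at the 95th percentile of the negatives
ranked = sorted(midpoints)
cut_percentile = ranked[int(0.95 * (len(ranked) - 1))]

# Normal-theory criterion: cutoff at the mean plus two standard deviations
cut_normal = mean(midpoints) + 2 * stdev(midpoints)

print(cut_percentile, round(cut_normal, 1))   # grouped data give about 28 and 29
```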
Whilst these methods are simple, there is no biological basis for defining cutoffs in this way. We would instead recommend the use of ROC plots, as detailed below.



Worked example
{Fig. 3}
A much better way to select the cutoff value is to use the ROC curve. It also provides an excellent measure of overall accuracy: the area under the curve (AUC). The ROC curve for our example is given here. If false negatives and false positives are equally undesirable, the optimal cutoff is the point closest to the upper left-hand corner of the graph. In this graph that point lies at a cutoff of 20 to 25 units.
If you wish to maximise sensitivity at the expense of specificity, a cutoff further to the right on the ROC curve should be selected. Moving the cutoff further to the left would maximise specificity at the expense of sensitivity.

Estimating the area under the curve (AUC) can be done manually, but is very time-consuming. Fortunately there are a number of software packages available that can do this for you; we give details of two such packages below. In this case the AUC is estimated at 0.89, indicating a good level of diagnostic accuracy.
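As a rough check on the quoted figures, the empirical ROC curve, the closest-to-corner cutoff, and the trapezoidal AUC can all be computed from the grouped data. The Python sketch below treats the interval upper bounds as the candidate cutoffs (a simplification, since the raw readings are not given):

```python
# Grouped data from the worked example: interval upper bounds and counts
cutoffs    = [10, 15, 20, 25, 30, 35, 40, 45]
infected   = [0, 1, 4, 8, 14, 9, 7, 2]    # n = 45
uninfected = [3, 9, 15, 7, 2, 1, 0, 0]    # n = 37
n_inf, n_uninf = sum(infected), sum(uninfected)

# One (FPR, TPR) point per candidate cutoff; "positive" = reading above it
roc = []
for i, c in enumerate(cutoffs):
    tpr = sum(infected[i + 1:]) / n_inf       # sensitivity at this cutoff
    fpr = sum(uninfected[i + 1:]) / n_uninf   # 1 - specificity
    roc.append((fpr, tpr, c))

# Cutoff closest to the top-left corner (0, 1), i.e. treating false
# negatives and false positives as equally undesirable
best = min(roc, key=lambda p: p[0] ** 2 + (1 - p[1]) ** 2)
print(best[2])                            # 20

# Trapezoidal AUC, with the (0,0) and (1,1) end points added
pts = sorted([(0.0, 0.0), (1.0, 1.0)] + [(f, t) for f, t, _ in roc])
auc = sum((x2 - x1) * (y1 + y2) / 2
          for (x1, y1), (x2, y2) in zip(pts, pts[1:]))
print(round(auc, 2))                      # 0.89
```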
In recent years a different form of the ROC curve has become popular, especially in veterinary research. This is the two-graph receiver operating characteristic curve (or two-graph ROC curve). Here both sensitivity and specificity are plotted against the cutoff value. The cutoff (d_{0}) at which the lines cross (and hence sensitivity equals specificity) optimises the accuracy of the diagnostic test. If the distributions are symmetrical, this point also maximises the mean value of sensitivity and specificity.
{Fig. 4}
In this plot, the optimal cutoff is identified as 22 optical units, where both sensitivity and specificity approach 0.8. Here the line plots of estimated sensitivity and specificity values are fairly smooth. However, this is often not the case, and there are various methods used to provide smoothed curves. We consider these briefly below.
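The crossing point can also be located numerically from the grouped data. This Python sketch (for illustration) interpolates linearly between the adjacent cutoffs at which the sensitivity and specificity curves cross; with the worked-example data it lands close to the 22 units quoted above:

```python
# Grouped data from the worked example: interval upper bounds and counts
cutoffs    = [10, 15, 20, 25, 30, 35, 40, 45]
infected   = [0, 1, 4, 8, 14, 9, 7, 2]    # n = 45
uninfected = [3, 9, 15, 7, 2, 1, 0, 0]    # n = 37
n_inf, n_uninf = sum(infected), sum(uninfected)

# Sensitivity and specificity at each candidate cutoff
sens = [sum(infected[i + 1:]) / n_inf for i in range(len(cutoffs))]
spec = [sum(uninfected[:i + 1]) / n_uninf for i in range(len(cutoffs))]

# Find where (sens - spec) changes sign and interpolate to get d0
for i in range(len(cutoffs) - 1):
    d1 = sens[i] - spec[i]
    d2 = sens[i + 1] - spec[i + 1]
    if d1 >= 0 > d2:                      # the two curves cross here
        d0 = cutoffs[i] + (cutoffs[i + 1] - cutoffs[i]) * d1 / (d1 - d2)
        print(round(d0, 1))               # about 22 with these data
        break
```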


Software packages
Worked examples
Various software packages are available free of charge for carrying out ROC analysis. For example:
- WinEpiscope 2.0 may still be available from www.clive.ed.ac.uk
Enquiries concerning the legacy catalogue should be addressed to: clive@ed.ac.uk
- CMDT 1.0 (and two other software packages) were available from:
http://www.vetschools.co.uk
WinEpiscope
In the 'Cutoff value' module, the number of animals that are truly positive or negative (true status) is entered for each value of antibody titre. The package displays the distributions of titres for infected and uninfected animals, as seen in the first figure below. The cutoff value can then be scrolled to see the effect on sensitivity, specificity and predictive values. Confidence intervals are also given. The module then provides the ROC curve and calculates the area under the curve (AUC). This is shown in the second figure below.
{Fig. 5}
This package has its strengths and weaknesses, namely:
- it is very easy to use, and data can be entered directly into the package; but:
- it will only accept a rather small number of values (20) for evaluating the cutoff value;
- it does not show the cutoff values on the ROC curve (needed for rational selection of the cutoff);
- it does not do two-graph ROC curves.


CMDT
CMDT was developed to aid selection of cutoff values and evaluation of quantitative diagnostic tests. Data are most readily entered into the program as Microsoft Excel spreadsheets. Both standard and two-graph ROC analyses can be done. The figure below shows the output for the two-graph ROC using the same data as for WinEpiscope.
{Fig. 6}

The cutoff value is estimated using two methods:
- In the non-parametric method, the range of observations is divided into intervals by 250 equidistant points. For each of these points the corresponding values of sensitivity and specificity are calculated, and the cutoff point, d_{0}, is determined from these values.
- In the parametric method, the relationships between sensitivity or specificity and optical density are assumed to follow cumulative normal distribution functions, rather than being estimated from percentiles. The cutoff point lies at the intersection of these two fitted curves.
Since the distributions here are approximately normal, we have given the d_{0} for the parametric method.
The vertical lines represent the intermediate range for the parametric and non-parametric estimates of the cutoff point. Any readings within the intermediate range marked on the graph can be regarded as borderline for clinical interpretation of the test result. Its position is set by the user, who selects the desired levels of sensitivity and specificity.
As with WinEpiscope, this package has its strengths and weaknesses, namely:
- it will accept large data sets and provides a comprehensive set of analyses; but:
- it will only accept data in raw form, unlike the frequency-table data accepted by WinEpiscope;
- it is not as easy to use (not least because the version we downloaded did not include help files!).


