InfluentialPoints.com Biology, images, analysis, design... 

"It has long been an axiom of mine that the little things are infinitely the most important" 

Validating nominal data

Sensitivity, specificity and related measures

On this page: Sensitivity, specificity & predictive values; Overall accuracy, Cohen's kappa & likelihood ratios; Cutoff values & receiver operator characteristic, ROC, curves; Software packages

Sensitivity, specificity and predictive values

Worked example

The data given here show the ELISA optical density readings for a total of 82 individuals. Of these, the gold standard test indicated that 45 were infected whilst 37 were uninfected.
The manufacturer of the ELISA test recommended a cutoff value of 20 optical density units. In other words, any individual with a reading of 20 or below should be taken as negative, and any individual with a reading above 20 should be taken as positive. We will first follow the manufacturer's recommendation and use a cutoff value of 20. This would give the following values for specificity, sensitivity and predictive values:
Overall accuracy, Cohen's kappa and likelihood ratios

Worked example

Using the same data as above, we now calculate the overall accuracy for this diagnostic test:
Overall accuracy = (true positives + true negatives) / total number of individuals tested

This gives a biased estimate of accuracy because we have not corrected for chance agreement. In order to correct the estimate, we first work out expected values for this table assuming there is no association between the test result and the true status. This is done by multiplying the proportions obtained from the column totals by the row totals as below:
The corrected measure of agreement (kappa) is then given by

κ = (observed agreement − expected agreement) / (1 − expected agreement)

Hence in this case, with a kappa value of 0.78, we still have a very good level of agreement even after correcting for chance agreement.
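The chance correction can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reconstruction of the worked example: the cell counts are hypothetical, and the function name `cohen_kappa` is our own.

```python
def cohen_kappa(tp, fp, fn, tn):
    """Overall accuracy, chance-expected agreement and Cohen's kappa
    for a 2x2 table of test result versus true status."""
    n = tp + fp + fn + tn

    # Observed agreement: the uncorrected overall accuracy
    observed = (tp + tn) / n

    # Expected count in each agreement cell if test result and true
    # status were unrelated: row total * column total / n
    expected_tp = (tp + fp) * (tp + fn) / n
    expected_tn = (fn + tn) * (fp + tn) / n
    expected = (expected_tp + expected_tn) / n

    kappa = (observed - expected) / (1 - expected)
    return observed, expected, kappa


# Hypothetical counts (NOT the data from the worked example)
observed, expected, kappa = cohen_kappa(tp=40, fp=5, fn=10, tn=45)
print(observed, expected, kappa)
```

With these hypothetical counts the observed agreement is 0.85, half of which would be expected by chance alone, so kappa comes out at about 0.7.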
Another useful measure of overall accuracy is the likelihood ratio. We can calculate the positive and negative likelihood ratios from the values for sensitivity and specificity:
The positive likelihood ratio is well above 1, and the negative likelihood ratio is close to 0, indicating that the test has reasonably good discriminatory power.
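For reference, the standard definitions are: positive likelihood ratio = sensitivity / (1 − specificity), and negative likelihood ratio = (1 − sensitivity) / specificity. A minimal Python sketch follows; the input values are hypothetical rather than those of the worked example, and the function name `likelihood_ratios` is our own.

```python
def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios of a diagnostic test."""
    # LR+ : how much more likely a positive result is in an infected
    # individual than in an uninfected one
    if specificity == 1:
        lr_positive = float("inf")  # no false positives at all
    else:
        lr_positive = sensitivity / (1 - specificity)

    # LR- : how much more likely a negative result is in an infected
    # individual than in an uninfected one
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


# Hypothetical values (NOT those of the worked example)
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.8, specificity=0.9)
print(round(lr_pos, 2), round(lr_neg, 2))
```

With a sensitivity of 0.8 and a specificity of 0.9, the positive ratio is about 8 and the negative ratio about 0.22.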
Cutoff values and receiver operator characteristic curves

Worked example

In the example above we used the recommended cutoff value of 20 optical units. There are a number of different ways to select the cutoff value for a diagnostic test. We first briefly consider two of the older, more arbitrary methods of doing this. They are based only on the distribution of test values in healthy uninfected individuals.

The distribution of values in healthy individuals is assessed, for example with a histogram. Any value greater than the 95th percentile of uninfected individuals is considered abnormal; that value is therefore taken as the cutoff value. If however the distribution of known negatives was taken to be ...

Whilst these methods are simple, there is no biological basis for defining cutoffs in this way. We would instead recommend use of ROC plots as detailed below.

Worked example

A much better way to select the cutoff value is to use the ROC curve. It also provides an excellent measure of overall accuracy: the area under the curve (AUC). The ROC curve for our example is given here. If false negatives and false positives are equally undesirable, the optimal cutoff is the point closest to the upper left-hand corner of the graph. In this graph that point lies at a cutoff of 20 to 25 units. If you wish to maximise sensitivity at the expense of specificity, a cutoff further to the right on the ROC curve should be selected. Moving the cutoff further to the left would maximise specificity at the expense of sensitivity.

Estimating the area under the curve (AUC) can be done manually, but is very time consuming. Fortunately there are a number of software packages available that can do this for you; we give details of two such packages below. In this case the AUC is estimated at 0.89, indicating a good level of diagnostic accuracy.

In recent years a different form of the ROC curve has become popular, especially in veterinary research.
This is the two-graph receiver operator characteristic curve (or two-graph ROC curve). Here both sensitivity and specificity are plotted against the cutoff value. The cutoff (d₀) at which the lines cross (and hence sensitivity equals specificity) optimises the accuracy of the diagnostic test. If distributions are symmetrical, this point also maximises the mean value of sensitivity and specificity. In this plot, the optimal cutoff is identified as 22 optical units, where both sensitivity and specificity approach 0.8. Here the line plots of estimated sensitivity and specificity values are fairly smooth. However, this is often not the case, and there are various methods used to provide smoothed curves. We consider these briefly below.
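To make the ROC calculations above concrete, here is a minimal Python sketch. It computes sensitivity and specificity at each candidate cutoff, estimates the AUC by the trapezoidal rule, and picks a cutoff both by the closest-to-corner rule and by the two-graph rule. The crossing point is approximated by the candidate cutoff with the smallest difference between sensitivity and specificity. The readings are invented and the function names are our own; this is not the method used by any particular package.

```python
import math


def roc_points(readings, infected, cutoffs):
    """(cutoff, sensitivity, specificity) for each candidate cutoff.
    A reading above the cutoff counts as test-positive."""
    n_pos = sum(infected)
    n_neg = len(infected) - n_pos
    points = []
    for c in cutoffs:
        tp = sum(1 for r, d in zip(readings, infected) if d and r > c)
        tn = sum(1 for r, d in zip(readings, infected) if not d and r <= c)
        points.append((c, tp / n_pos, tn / n_neg))
    return points


def auc_trapezoid(points):
    """Area under the ROC curve (x = 1 - specificity, y = sensitivity)."""
    xy = {(1 - sp, se) for _, se, sp in points}
    xy |= {(0.0, 0.0), (1.0, 1.0)}  # end points of every ROC curve
    xy = sorted(xy)
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(xy, xy[1:]))


def closest_to_corner(points):
    """Cutoff whose ROC point lies nearest the upper left-hand corner."""
    return min(points, key=lambda p: math.hypot(1 - p[2], 1 - p[1]))[0]


def two_graph_cutoff(points):
    """Cutoff at which sensitivity and specificity are most nearly equal."""
    return min(points, key=lambda p: abs(p[1] - p[2]))[0]


# Hypothetical optical density readings (NOT the data from the example)
readings = [35, 28, 22, 18, 41, 12, 20, 25, 9, 15]
infected = [True, True, True, True, True,
            False, False, False, False, False]

points = roc_points(readings, infected, sorted(set(readings)))
print(round(auc_trapezoid(points), 2))
print(closest_to_corner(points), two_graph_cutoff(points))
```

With these invented readings both rules happen to select the same cutoff. On real data they can differ, particularly when the distributions of infected and uninfected readings are asymmetrical.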
Software packages

Worked examples

Various software packages are available free for carrying out ROC analysis. For example:
This package has its strengths and weaknesses, namely:
The cutoff value is estimated using two methods:

The vertical lines represent the intermediate range for parametric and nonparametric estimates of the cutoff point. Any readings within the intermediate range marked on the graph can be regarded as borderline for clinical interpretation of the test result. Their position is set by the user, who selects desired levels of sensitivity and specificity.

As with WinEpiscope, this package has its strengths and weaknesses, namely:
