Our second worked example is from a study by Greenwood & Yule (1920) that we first looked at in Unit 4. The table gives the observed frequency distribution of accidents per individual, together with expected distributions assuming either a Poisson or negative binomial distribution.
Accidents experienced by 414 machinists over 3 months

No. accidents                                 0    1    2    3    4    5    6    7    8
Observed f_{i}                              296   74   26    8    4    4    1    0    1
Expected \hat{f}_{i} (Poisson)              256  122   30    5    1    0    0    0    0
Expected \hat{f}_{i} (negative binomial)    299   69   26   11    5    2    1    1    0
 
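Number of accidents is a discrete variable, so either Pearson's chi-square test or the G likelihood ratio test would be appropriate to assess goodness of fit. To avoid expected frequencies less than 5, we pool the higher categories as appropriate: for the Poisson fit the classes for 3 or more accidents are combined (observed 18, expected 6), and for the negative binomial fit the classes for 4 or more accidents are combined (observed 10, expected 9).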
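The pooling step can be sketched in R as follows. This is only an illustration: the function name `pool_upper_tail` and the rule it applies (merge classes from the top down until the pooled expected frequency is at least 5) are our assumptions, chosen because they give the pooled classes that appear in the calculations for this example. The function only looks at the upper tail and does not check the remaining classes.

```r
# Hypothetical helper: pool the upper-tail classes until the pooled
# expected frequency reaches min_expected (assumed rule, see text).
pool_upper_tail <- function(observed, expected, min_expected = 5) {
  stopifnot(length(observed) == length(expected))
  n <- length(expected)
  k <- n
  # Move the cut point down while the tail's combined expectation is too small
  while (k > 1 && sum(expected[k:n]) < min_expected) {
    k <- k - 1
  }
  list(
    observed = c(observed[seq_len(k - 1)], sum(observed[k:n])),
    expected = c(expected[seq_len(k - 1)], sum(expected[k:n]))
  )
}

# Frequencies from the table, for 0 to 8 accidents
obs     <- c(296, 74, 26, 8, 4, 4, 1, 0, 1)
exp_poi <- c(256, 122, 30, 5, 1, 0, 0, 0, 0)
exp_nb  <- c(299, 69, 26, 11, 5, 2, 1, 1, 0)

pooled_poi <- pool_upper_tail(obs, exp_poi)  # classes 0, 1, 2, 3+
pooled_nb  <- pool_upper_tail(obs, exp_nb)   # classes 0, 1, 2, 3, 4+
```

Note that the two fitted distributions end up with different numbers of classes, which matters when the degrees of freedom are counted.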
Poisson distribution
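Pearson's X^{2} can then be calculated using the general formula, where f_{i} is the observed and \hat{f}_{i} the expected frequency in class i:
X^{2} = \sum_{i} \frac{(f_{i} - \hat{f}_{i})^{2}}{\hat{f}_{i}}
Using the pooled frequencies:
X^{2} = \frac{(296 - 256)^{2}}{256} + \dots + \frac{(18 - 6)^{2}}{6} = 49.67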
 
The number of degrees of freedom is given by [number of classes − 1 − number of parameters estimated]. After pooling there are four classes (0, 1, 2, and 3 or more accidents), and a Poisson distribution has just one parameter, so in this case the degrees of freedom are [4 − 1 − 1] = 2. Using R's chi-squared distribution function, pchisq(49.67, 2, lower.tail=FALSE), gives an upper-tail P-value of approximately 1.6e-11. We can therefore conclude that the observed distribution deviates significantly from a Poisson distribution.
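As a check on the arithmetic, the Poisson calculation can be reproduced in a few lines of R. This is a minimal sketch: the variable names are ours, the pooled frequencies are typed in directly, and the degrees of freedom are derived from the number of classes in the vectors rather than entered by hand.

```r
# Pooled classes: 0, 1, 2, and 3 or more accidents
observed <- c(296, 74, 26, 18)
expected <- c(256, 122, 30, 6)    # Poisson expectations from the table

# Pearson's X^2: sum of (observed - expected)^2 / expected
x2 <- sum((observed - expected)^2 / expected)

# Classes - 1 - parameters estimated (one for the Poisson: the mean)
df <- length(observed) - 1 - 1

# Upper-tail probability from the chi-squared distribution
p_value <- pchisq(x2, df, lower.tail = FALSE)

round(x2, 2)   # 49.67
```

Writing `lower.tail = FALSE` in full avoids relying on R's partial matching of argument names.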
Negative binomial distribution
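Pearson's X^{2} can then be calculated using the general formula, where f_{i} is the observed and \hat{f}_{i} the expected frequency in class i:
X^{2} = \sum_{i} \frac{(f_{i} - \hat{f}_{i})^{2}}{\hat{f}_{i}}
Using the pooled frequencies:
X^{2} = \frac{(296 - 299)^{2}}{299} + \dots + \frac{(10 - 9)^{2}}{9} = 1.32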
 
The number of degrees of freedom is given by [number of classes − 1 − number of parameters estimated]. After pooling there are five classes (0, 1, 2, 3, and 4 or more accidents), and the negative binomial requires two parameters to be estimated (m and k, or p and q), so in this case the degrees of freedom are [5 − 1 − 2] = 2. Referring this value to the chi-squared distribution in R gives a P-value of 0.516. We can therefore conclude that the observed distribution does not deviate significantly from a negative binomial distribution. Note the phrasing here: we have not proved that the negative binomial is a 'significantly good fit' to the data, as we cannot prove the null hypothesis.
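The same steps can be wrapped in a small function so that the number of estimated parameters is stated explicitly. This is a sketch under our own naming: `pearson_gof` is a hypothetical helper, not a function supplied by R. R's built-in `chisq.test` is not used here because it takes the degrees of freedom as the number of classes minus one, with no allowance for parameters estimated from the data.

```r
# Hypothetical helper: Pearson goodness-of-fit test with the degrees of
# freedom reduced by the number of parameters estimated from the data.
pearson_gof <- function(observed, expected, n_params) {
  stopifnot(length(observed) == length(expected))
  x2 <- sum((observed - expected)^2 / expected)
  df <- length(observed) - 1 - n_params
  list(statistic = x2,
       df = df,
       p_value = pchisq(x2, df, lower.tail = FALSE))
}

# Negative binomial fit, pooled classes: 0, 1, 2, 3, and 4 or more accidents
nb <- pearson_gof(observed = c(296, 74, 26, 8, 10),
                  expected = c(299, 69, 26, 11, 9),
                  n_params = 2)   # two parameters estimated (m and k)

round(nb$statistic, 2)   # 1.32
round(nb$p_value, 3)     # 0.516
```

A large P-value here only means the test found no evidence against the negative binomial; it does not establish that the fit is good.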