Biology, images, analysis, design... |
|
"It has long been an axiom of mine that the little things are infinitely the most important" |
|
z-test for independent proportions: Use & misuse
(independent proportions, risk difference, confidence interval of difference, critical ratio test, chi square test)

Statistics courses, especially for biologists, assume formulae = understanding and teach how to do ...

Use and Misuse

The purpose of the z-test for independent proportions is to compare two independent proportions. It is also known as the t-test for independent proportions, and as the critical ratio test. In medical research the difference between proportions is commonly referred to as the risk difference. The test statistic is the standardized normal deviate (z).

The standard test uses the common pooled proportion to estimate the variance of the difference between two proportions. It is identical to the chi square test, except that we estimate the standard normal deviate (z): the square of the test statistic (z²) is identical to Pearson's chi square statistic (X²). The z-test is sometimes preferred to the chi square test if the interest is in the size of the difference between the two proportions. A confidence interval can be attached to that difference using either the normal approximation or a variety of exact or small sample methods.

Because the test and the interval use different estimates of the variance, the result of the test may not be consistent with the confidence interval. In other words, the confidence interval of the difference may overlap zero (indicating no significant difference), yet the test indicates a significant difference. As a result an alternative critical ratio test was devised that gives identical results to the confidence interval. This estimates the standard error of the difference as the square root of the sum of the individual variances.
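The two variance estimates, and the way they can disagree, can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: the counts (11 of 40 versus 60 of 400) are hypothetical, the function names are our own, and the interval is the simple normal approximation (Wald) interval rather than any of the exact or small sample methods.

```python
import math


def normal_upper_tail(x):
    """Upper-tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def z_test_two_proportions(x1, n1, x2, n2, pooled=True):
    """Return (difference, z, two-tailed P) for two independent proportions.

    pooled=True  : variance from the common pooled proportion (standard test)
    pooled=False : variance as the sum of the two individual variances
                   (the alternative critical ratio test)
    """
    p1, p2 = x1 / n1, x2 / n2
    if pooled:
        p = (x1 + x2) / (n1 + n2)
        se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    else:
        se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = (p1 - p2) / se
    return p1 - p2, z, 2 * normal_upper_tail(abs(z))


def normal_approx_ci(x1, n1, x2, n2, z_crit=1.959964):
    """Normal approximation 95% interval for the difference p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_crit * se, diff + z_crit * se


def pearson_chi_square(x1, n1, x2, n2):
    """Pearson's X² for the 2 x 2 table, without continuity correction."""
    a, b, c, d = x1, n1 - x1, x2, n2 - x2
    n = n1 + n2
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical counts: 11 'successes' out of 40, versus 60 out of 400
counts = (11, 40, 60, 400)

diff, z_pooled, p_pooled = z_test_two_proportions(*counts, pooled=True)
_, z_unpooled, p_unpooled = z_test_two_proportions(*counts, pooled=False)
ci_low, ci_high = normal_approx_ci(*counts)
chi2 = pearson_chi_square(*counts)

print(f"difference       = {diff:.3f}")
print(f"pooled test      : z = {z_pooled:.3f}, P = {p_pooled:.4f}")
print(f"alternative test : z = {z_unpooled:.3f}, P = {p_unpooled:.4f}")
print(f"95% CI           = ({ci_low:.3f}, {ci_high:.3f})")
print(f"z squared = {z_pooled ** 2:.4f}, Pearson X2 = {chi2:.4f}")
```

With these hypothetical counts the pooled test gives z of about 2.05 (P about 0.04), yet the interval runs from roughly -0.02 to 0.27 and so overlaps zero. The alternative test (z of about 1.72) agrees with the interval, and the square of the pooled z equals Pearson's X².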
When the test is used, it should therefore always be specified whether the variance of the difference is based on the pooled estimate of the common proportion (identical to Pearson's chi square test) or on the sum of the two individual variances (the more liberal alternative critical ratio test).

Not surprisingly the most common misuses of the z-test are the same as for Pearson's chi square test. Use of paired samples also conflicts with the independence assumption - we give examples of before and after studies where the same sampling units are assessed, and of paired studies where different diagnostic tests are tested on the same samples. In some of these cases McNemar's test for significance of change would have been more appropriate. Pooling results from five different experiments may also invalidate the independence assumption, especially if the data are heterogeneous. Certainly pooling across factors to reduce data to a 2 × 2 table (such as in a study on the conception rate of water buffaloes) is very unwise.

Despite the fact that the test lends itself to estimating a confidence interval of the difference, it is rare to see the interval calculated - this is a pity as it is a much more informative approach than just quoting a P-value. Use of small samples is not uncommon, in which case exact tests based on the distribution of X² would be more appropriate. Another misuse is to use multiple z-tests to compare proportions in repeated measures designs. Lastly we note there is a strange predilection for always using one-tailed tests in survival studies, whether of rabbits, foxes or wild birds. The reason for this is debatable, but it remains true that one should always justify a one-tailed test a priori.

What the statisticians say

Lui (2004); Santner et al. (2007). Wikipedia uses the term absolute risk reduction.
|