InfluentialPoints.com
Biology, images, analysis, design...
"It has long been an axiom of mine that the little things are infinitely the most important" (Sherlock Holmes)

 

 

Models & study designs

This test is used to assess whether paired observations on two (usually nominal) variables are independent of each other. It thus enables us to determine if there is a significant difference between two independent proportions. The frequencies in each category are arranged in a contingency table. The test statistic is Pearson's chi square statistic (X²) as defined below. Its precise distribution depends on the sampling model.

Multinomial model

The original Pearson's chi square statistic assumes a multinomial model with only the total number of observations fixed. This can arise from two possible sampling designs:

                                     Characteristic A
                                     Present    Absent     Totals
    Characteristic B    Present      a          b          a + b
                        Absent       c          d          c + d
    Totals                           a + c      b + d      n = a + b + c + d

    A single random sample is taken (analytical survey) and individuals are classified according to two characteristics. For example we might take a random sample of 2000 adult men aged 18-25 and determine whether each is married or single, and whether each is positive or negative for the HIV virus. We then compare the proportion of married men with the virus with the proportion of single men with the virus.

                        Characteristic A
                        Present    Absent     Totals
    Treatment 1         a          b          a + b
    Treatment 2         c          d          c + d
    Totals              a + c      b + d      n = a + b + c + d

    Individuals are randomly allocated to two treatment groups (completely randomized experimental design) and in each group the frequencies with and without a particular characteristic are recorded. For example, individuals with malaria are randomly allocated to two treatment groups in which patients are given either drug A or drug B. The proportion of patients suffering neuropsychiatric side effects is compared between drug A and drug B. Note that in practice most experiments use some form of restricted randomization so that numbers in each treatment group are (more or less) fixed (see below).

Independent binomial model

In the second model either row or column totals are fixed (giving a double binomial model), but the other marginal totals are free to vary.

                        Characteristic A
                        Present    Absent     Totals
    Sample 1            a          b          a + b
    Sample 2            c          d          c + d
    Totals              a + c      b + d      n = a + b + c + d

    Two random samples are taken (comparative area observational design) and in each sample the frequencies with and without a particular characteristic are recorded. For example, we take two random samples, one from a rural area and the other from an urban area, each of 1000 adult men. We then compare the proportion of infected men from rural areas with the proportion of infected men from urban areas. The same model applies for cohort or case-control designs, and randomized trials where restricted randomization is used to equalize group sizes for each treatment.

The exact distributions of X² obtained under the two different models differ somewhat. However, the asymptotic distribution of the statistic under both models is chi square with (r − 1)(c − 1) degrees of freedom, so this is used for the large sample test.
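The difference between the two sampling models can be illustrated by simulation. The Python sketch below is not part of the original analysis: the function names, cell probabilities and sample sizes are illustrative assumptions. It draws one 2 × 2 table under each model, returned as the cells (a, b, c, d).

```python
import random

def sample_multinomial_table(n, cell_probs, rng):
    """Multinomial model: only the grand total n is fixed.
    Each individual falls into one of the cells a, b, c, d."""
    counts = [0, 0, 0, 0]
    for cell in rng.choices(range(4), weights=cell_probs, k=n):
        counts[cell] += 1
    return tuple(counts)

def sample_binomial_table(n1, n2, p1, p2, rng):
    """Independent binomial model: row totals n1 and n2 are fixed,
    column totals are free to vary."""
    a = sum(rng.random() < p1 for _ in range(n1))
    c = sum(rng.random() < p2 for _ in range(n2))
    return (a, n1 - a, c, n2 - c)

rng = random.Random(1)  # fixed seed so the sketch is repeatable
# Hypothetical cell probabilities for a single sample of 2000 individuals
multi = sample_multinomial_table(2000, [0.05, 0.35, 0.10, 0.50], rng)
# Hypothetical common proportion of 0.2 in two samples of 1000
binom = sample_binomial_table(1000, 1000, 0.2, 0.2, rng)
print(multi, binom)
```

In the multinomial table both the row and column totals change from draw to draw, whereas in the binomial table the row totals always equal the chosen sample sizes.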

Important point

In Unit 8 we analysed proportions in the situation where we had taken replicated samples from a population, and calculated percentages from each sample. That is the correct approach for handling replicated proportions (no transformation was used since p was close to .5).

We are dealing here with a quite different situation - namely where proportions are calculated either from a single random sample or from two independent samples. Under these circumstances, variability cannot be measured, but can only be estimated using the binomial distribution. Pearson's chi square test should not be used for analysing replicated proportions.

 

 

Large sample tests

General formula

The test statistic X², known as Pearson's chi square, can be calculated from the following general formula:

Algebraically speaking -

$$X^2 = \sum_i \frac{(f_i - \hat{f}_i)^2}{\hat{f}_i}$$
Where:
  • $X^2$ is Pearson's chi square statistic,
  • $f_i$ is the observed frequency in cells a to d,
  • $\hat{f}_i$ is its expected frequency in cells a to d, calculated from the marginal totals.
    For example, $\hat{a} = ((a + b)/n)(a + c)$

This formula can also be used for goodness of fit tests and for contingency tables with more than two rows or columns.
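As a rough illustration of the general formula, the following Python sketch computes the expected frequencies from the marginal totals and then sums the squared deviations. The function names are our own, and the example table uses the counts quoted later on this page from Ludbrook (2008): 14 dead and 9 alive in group 1, 17 dead and 2 alive in group 2.

```python
def expected_frequencies(table):
    """Expected cell frequencies: (row total x column total) / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def pearson_chi_square(table):
    """Sum over all cells of (observed - expected)^2 / expected."""
    expected = expected_frequencies(table)
    return sum((f - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for f, e in zip(obs_row, exp_row))

table = [[14, 9], [17, 2]]
x2 = pearson_chi_square(table)
df = (len(table) - 1) * (len(table[0]) - 1)  # (r - 1)(c - 1)
print(round(x2, 4), df)
```

Because the functions work on any list of rows, the same sketch applies to tables with more than two rows or columns.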

 

For 2 × 2 contingency tables there is an alternative computational formula that is preferred as it is less subject to rounding errors:

Algebraically speaking -

$$X^2 = \frac{n\,(ad - bc)^2}{(a+b)(a+c)(b+d)(c+d)}$$
Where:
  • $X^2$ is Pearson's chi square statistic,
  • a, b, c, and d are the frequencies in each cell of the table as shown above,
  • n is the total number of observations.

Note that for the 2 × 2 table the formulations given above are mathematically identical to the square of the statistic obtained in the z-test for independent proportions.
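The equivalence with the z-test can be checked numerically. In this Python sketch (function names are our own; rows are taken to be the two groups and the first column the 'present' category) the z statistic uses the pooled proportion for its standard error, and its square is compared with the computational formula for X².

```python
import math

def chi_square_2x2(a, b, c, d):
    """Computational formula: n(ad - bc)^2 / product of the four margins."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (a + c) * (b + d) * (c + d))

def z_two_proportions(a, b, c, d):
    """z-test for two independent proportions with a pooled standard error."""
    n1, n2 = a + b, c + d
    p1, p2 = a / n1, c / n2
    pooled = (a + c) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

x2 = chi_square_2x2(14, 9, 17, 2)
z = z_two_proportions(14, 9, 17, 2)
print(x2, z ** 2)  # the two values agree apart from floating point rounding
```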

The value of X² is referred to the probability calculator on your software package, or to a table of χ² values at (r − 1)(c − 1) degrees of freedom, where r is the number of rows in the contingency table, and c is the number of columns. Thus for a two by two contingency table there is always just one degree of freedom.

When the chi square test is used as a test of association it is naturally two sided, since the null hypothesis is of no association versus the alternative of some association. However, when it is being used to compare two proportions (in other words for a 2 × 2 table), a one-sided test might be required. This is obtained by simply halving the P-value given by Pearson's chi square statistic, provided the observed difference lies in the direction specified by the alternative hypothesis.
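For one degree of freedom the P-value can be obtained without tables, because a chi square variate with one degree of freedom is the square of a standard normal variate. A minimal Python sketch, with function names of our own choosing, is:

```python
import math

def chi_square_p_value_1df(x2):
    """Upper tail probability of chi square with 1 df.
    P(chi2 > x2) = P(|Z| > sqrt(x2)) = erfc(sqrt(x2 / 2))."""
    return math.erfc(math.sqrt(x2 / 2))

def one_sided_p_value_1df(x2):
    """Half the two-sided P-value; only valid when the observed
    difference is in the direction of the alternative hypothesis."""
    return chi_square_p_value_1df(x2) / 2

p_two_sided = chi_square_p_value_1df(3.841459)  # conventional 5% critical value
p_one_sided = one_sided_p_value_1df(3.841459)
print(p_two_sided, p_one_sided)
```

This shortcut applies only to 2 × 2 tables; larger tables need the general chi square distribution function from a statistical package.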

Correction for continuity

For small sample sizes, many - but not all - statisticians feel that a correction for continuity should be applied. This is because a continuous distribution (chi square) is being used to represent the discrete distribution of sample frequencies. The Yates correction to the general formula is achieved by subtracting 0.5 from the modulus of each difference between observed and expected values:

Algebraically speaking -

$$X^2_c = \sum_i \frac{(|f_i - \hat{f}_i| - 0.5)^2}{\hat{f}_i}$$
Where:
  • $X^2_c$ is the corrected Pearson's chi square statistic,
  • $f_i$ is the observed frequency in cells a to d,
  • $\hat{f}_i$ is its expected frequency in cells a to d, calculated as above.

The Yates correction for the computational formula is shown below:

Algebraically speaking -

$$X^2_c = \frac{n\,(|ad - bc| - n/2)^2}{(a+b)(a+c)(b+d)(c+d)}$$
Where:
  • $X^2_c$ is the corrected Pearson's chi square statistic,
  • a, b, c, and d are the frequencies in each cell of the table as shown above,
  • n is the total number of observations.

The correction is usually only recommended if the smallest expected frequency is less than 5. Note that the correction should not be applied if |ad − bc| is less than n/2.
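The corrected computational formula and the two rules of thumb above can be sketched in Python as follows. The function names are our own, and the decision to return the uncorrected statistic when |ad − bc| is less than n/2 is our reading of the note above rather than a universal convention.

```python
def yates_chi_square_2x2(a, b, c, d):
    """Continuity-corrected X2 for a 2 x 2 table."""
    n = a + b + c + d
    denom = (a + b) * (a + c) * (b + d) * (c + d)
    diff = abs(a * d - b * c)
    if diff >= n / 2:      # correction is skipped when |ad - bc| < n/2
        diff -= n / 2
    return n * diff ** 2 / denom

def smallest_expected_2x2(a, b, c, d):
    """Smallest expected frequency: smaller row total x smaller column total / n."""
    n = a + b + c + d
    return min(a + b, c + d) * min(a + c, b + d) / n

# Counts quoted later on this page from Ludbrook (2008)
x2c = yates_chi_square_2x2(14, 9, 17, 2)
min_expected = smallest_expected_2x2(14, 9, 17, 2)
print(round(x2c, 4), round(min_expected, 3))
```

For these counts the smallest expected frequency is just under 5, so the usual recommendation would be to apply the correction.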

 

The 'n − 1' chi-square test

In 1947 E. S. Pearson recommended a third version of the chi-square test where n in the computational formula for the 2 × 2 table is replaced by n − 1.

Algebraically speaking -

$$X^2 = \frac{(n - 1)\,(ad - bc)^2}{(a+b)(a+c)(b+d)(c+d)}$$
Where:
  • $X^2$ is Pearson's chi square statistic,
  • a, b, c, and d are the frequencies in each cell of the table as shown above,
  • n is the total number of observations.
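A short Python sketch of the 'n − 1' version is given here, using a function name of our own and the Ludbrook (2008) counts quoted later on this page. It differs from the ordinary computational formula only by the factor (n − 1)/n.

```python
def n_minus_1_chi_square_2x2(a, b, c, d):
    """'n - 1' chi-square: (n - 1)(ad - bc)^2 / product of the four margins."""
    n = a + b + c + d
    denom = (a + b) * (a + c) * (b + d) * (c + d)
    return (n - 1) * (a * d - b * c) ** 2 / denom

x2_n1 = n_minus_1_chi_square_2x2(14, 9, 17, 2)
print(round(x2_n1, 4))
```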

We note that one statistical package (EpiInfo) describes this as the Mantel-Haenszel chi-square test although this usage of the term is not recommended.

 

 

Exact tests using the X² statistic

Multinomial
For the model where neither row nor column totals are fixed, the exact distribution is obtained from the multinomial distribution. However it is computationally demanding because of the many possible tables. For example even with n = 4 there are 35 different contingency tables. Nonetheless it is possible to carry out an exact test for very small tables (see Conover (1999) pp. 206 - 209) and a few statistical packages provide this.
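The count of possible tables under the multinomial model can be checked by brute force. This Python sketch (the function name is our own) lists every way of distributing n observations over the four cells a, b, c and d.

```python
from itertools import product

def multinomial_tables(n):
    """All 2 x 2 tables (a, b, c, d) of non-negative counts summing to n."""
    return [(a, b, c, n - a - b - c)
            for a, b, c in product(range(n + 1), repeat=3)
            if a + b + c <= n]

tables = multinomial_tables(4)
print(len(tables))
```

The number of tables grows roughly with the cube of n, which is why the exact multinomial test is practical only for very small samples.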

Independent binomial
For a model with either row or column totals fixed, the position is somewhat easier since there are fewer possible tables. For example with 2 observations in each population (again n = 4) there are only 9 different contingency tables. The exact distribution of X2 is given by multiplying together the probabilities for each population obtained from the binomial distribution (see Conover (1999) pp. 185 - 187).
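A minimal Python sketch of an exact test under the independent binomial model follows. It enumerates every possible table, weights each by the product of two binomial probabilities, and sums the weights of tables whose X² is at least as large as the observed value. Two points are our own assumptions rather than details given in the text: the unknown common proportion is set to the pooled estimate, and X² is taken as zero for tables in which a column total is zero. The function names are also ours.

```python
from math import comb

def chi_square_2x2(a, b, c, d):
    """X2 by the computational formula; 0 if a marginal total is zero."""
    n = a + b + c + d
    denom = (a + b) * (a + c) * (b + d) * (c + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom

def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def exact_binomial_x2_test(a, b, c, d):
    n1, n2 = a + b, c + d
    p = (a + c) / (n1 + n2)          # assumption: pooled estimate under H0
    observed = chi_square_2x2(a, b, c, d)
    p_value = 0.0
    for x1 in range(n1 + 1):         # (n1 + 1)(n2 + 1) possible tables
        for x2 in range(n2 + 1):
            stat = chi_square_2x2(x1, n1 - x1, x2, n2 - x2)
            if stat >= observed - 1e-12:   # tolerance for rounding
                p_value += binom_pmf(x1, n1, p) * binom_pmf(x2, n2, p)
    return p_value

# Counts quoted later on this page from Ludbrook (2008)
p_exact = exact_binomial_x2_test(14, 9, 17, 2)
print(p_exact)
```

With 2 observations in each population the double loop visits 3 × 3 = 9 tables, matching the count given above.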

Monte Carlo solutions
An alternative is to use Monte Carlo simulation to estimate the distribution of the statistic under each of the two models. Base R's chisq.test function only provides a Monte Carlo simulation test for X² under the third model, where both row and column totals are fixed. This is not very useful, as that model is rarely used and is in any case covered by Fisher's exact test. Hence we have provided a set of 4 functions below which enable R to give you P-values of X² tests, where the distribution of the X² statistic is estimated from random samples from either one or two null populations.


Doing an exact X² test with R

We used this approach to compare our result with that given by Ludbrook (2008) for the independent binomial model (termed by Ludbrook the comparative trial or singly conditioned 2×2 table). Group 1 had 14 dead and 9 alive, so p1 = 0.6087. Group 2 had 17 dead and 2 alive so p2 = 0.8947. Ludbrook considered that the P-value of 0.044 obtained by the package Testimate for the single-conditioned option for exact tests on an odds ratio of 1, a risk ratio of 1 and a risk difference of 0 was acceptable. The P-value of 0.0391 obtained by StatXact for exact tests on a risk ratio of 1 and a risk difference of 0 was also deemed acceptable. Our exact Monte Carlo X² test for these data with one million replicates gave a similar P-value of 0.0381.
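The general idea of a Monte Carlo X² test for the independent binomial model can be sketched in Python. This is not a reproduction of the R functions referred to above: the function names, the use of the pooled proportion as the null value, the treatment of X² as zero when a column total is zero, the seed and the modest number of replicates are all our own assumptions.

```python
import random

def chi_square_2x2(a, b, c, d):
    """X2 by the computational formula; 0 if a marginal total is zero."""
    n = a + b + c + d
    denom = (a + b) * (a + c) * (b + d) * (c + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom

def monte_carlo_binomial_x2_test(a, b, c, d, replicates=20000, seed=1):
    """Estimate the P-value by drawing both groups from one null population."""
    rng = random.Random(seed)
    n1, n2 = a + b, c + d
    p = (a + c) / (n1 + n2)          # assumption: pooled estimate under H0
    observed = chi_square_2x2(a, b, c, d)
    hits = 0
    for _ in range(replicates):
        x1 = sum(rng.random() < p for _ in range(n1))  # binomial(n1, p) draw
        x2 = sum(rng.random() < p for _ in range(n2))  # binomial(n2, p) draw
        if chi_square_2x2(x1, n1 - x1, x2, n2 - x2) >= observed - 1e-12:
            hits += 1
    return hits / replicates

p_mc = monte_carlo_binomial_x2_test(14, 9, 17, 2)
print(p_mc)
```

With 20000 replicates the estimate carries a standard error of roughly 0.001, so it should be read only to about two decimal places. More replicates narrow this at the cost of run time.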

 

 

Assumptions

Sampling or allocation is random

For model 1 (multinomial):
  • A simple random sample is taken and each observation is classified into one of two different categories for each of two characteristics, or
  • Individuals in a group are (completely) randomly allocated to receive either treatment A1 or treatment A2, and are then classified into one of two different categories for characteristic B.

For model 2 (double binomial):
  • Two independent samples are taken randomly and each observation is classified into one of two different categories for characteristic A, or
  • Individuals in a group are allocated to receive either treatment A1 or treatment A2 using restricted randomization to equalize group sizes, and are then classified into one of two different categories for characteristic B.

Observations are independent

Observations are assumed to be independent of each other. This assumption is not met if (for example) samples are obtained from clusters, or cluster randomization is used, and the test is then used to analyze results at an individual level. However, there is an approximate correction which can be applied to the chi square test for use with cluster samples, which we cover in Unit 10. Nor is the test appropriate for analysing contingency tables derived from pooled samples. The analysis of sets of 2 × 2 contingency tables is dealt with in the related topic on multiple 2 × 2 tables.

Errors are normally distributed

Both models assume errors are normally distributed. Provided the cell frequencies are reasonably large, cell values in a 2 × 2 table will be distributed normally about their expected values. If any expected frequency is less than 5 and you want a conventional P-value, the continuity correction should be applied. Omission of the continuity correction will give you a mid-P-value. For very small sample sizes the conventional wisdom has been to use Fisher's exact test, although use of an exact test based on the correct model is now preferred.

Mutual exclusivity

A given case may fall only in one class.

Related topics:

G likelihood ratio test

r × c tables & partitioning

Chi square test for trend

Comparing survival rates

Multiple 2×2 tables

Measuring agreement