InfluentialPoints.com
This More Information page describes several closely-related tests, namely the binomial test, the sign test, McNemar's test and the Cox & Stuart test for trend. All these tests are essentially variants of the binomial test, although they are used for different types of data.

The binomial test

The binomial test is a one-sample test used to assess whether an observed proportion, derived from a single random sample, differs from an expected parametric proportion. It can be used to assess the outcomes of encounters in behavioural studies - for example, to assess whether the probability that a male already resident in a territory wins a fight with a non-resident male can be accepted as 0.5. Another example comes from choice experiments. Odour may be presented in one arm of an olfactometer and not in the other. The probability of an insect choosing one arm or the other can be tested against a null hypothesis of P = 0.5 to establish whether the odour is an attractant, a repellent, or has no effect.

  Large-sample approximation

To carry out the large-sample version of the test (n > 25), we first calculate the difference between our sample proportion and the parametric value. We then convert this difference to a standard normal deviate by dividing it by the estimated standard error of the proportion under the null hypothesis. The standard error of a proportion is given from the binomial distribution as √(PQ/n), where P is the true proportion, Q is 1 − P and n is the number of observations. Thus

Algebraically speaking -

z  =  (p − P) / √(PQ/n)
Where:
  • z is a quantile of the standard normal distribution,
  • p is your sample proportion,
  • P is the true proportion under the null hypothesis,
  • Q is equal to 1 − P,
  • n is the number of observations in your sample.

The z-test assumes that the distribution of the estimate is continuous. However the estimate of a proportion can only take a finite number of values depending on the sample size, and so the distribution is discrete. Some statisticians argue that you should therefore use a continuity correction, namely:

Algebraically speaking -

z  =  (|p − P| − 1/(2n)) / √(PQ/n)
Where all variables are defined as above.
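As an illustration, the two large-sample formulae can be sketched in a few lines of Python (the function name and example counts here are ours, not from any particular package):

```python
from math import sqrt

def binomial_z(x, n, P, continuity=False):
    """Large-sample (n > 25) z-test of an observed proportion x/n
    against a hypothesised parametric proportion P."""
    p = x / n
    Q = 1 - P
    se = sqrt(P * Q / n)              # SE of the proportion under H0
    if continuity:
        return (abs(p - P) - 1 / (2 * n)) / se
    return (p - P) / se

# Hypothetical example: resident males win 36 of 50 fights; is P = 0.5 tenable?
print(round(binomial_z(36, 50, 0.5), 3))                   # 3.111
print(round(binomial_z(36, 50, 0.5, continuity=True), 3))  # 2.970
```

Note that the continuity correction always pulls the statistic towards zero, so the corrected test is slightly more conservative.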

  'Exact' test

For small samples you need to work out probabilities directly using the probability mass function of the binomial distribution. Under conventional inference, the probabilities of the observed result and of all results more extreme are summed to give the required one-tailed probability. Alternatively, if n is 25 or smaller and the parametric proportion (P) is 0.5, these cumulative probabilities can be found in the appropriate table. To convert a conventional P-value to a mid-P-value, subtract half the probability of obtaining the observed result. In either case, for a two-tailed test the P-value is doubled.
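For small n the summation is easy to do directly; a minimal sketch in Python (standard library only; the function names are ours) for an observed count at or above expectation:

```python
from math import comb

def binomial_exact_p(x, n, P=0.5):
    """One-tailed exact P-value: probability of x or more successes
    out of n when the true proportion is P. (For an observed count
    below expectation, sum the lower tail instead.)"""
    return sum(comb(n, k) * P**k * (1 - P)**(n - k) for k in range(x, n + 1))

def binomial_mid_p(x, n, P=0.5):
    """Mid-P value: subtract half the probability of the observed result."""
    return binomial_exact_p(x, n, P) - 0.5 * comb(n, x) * P**x * (1 - P)**(n - x)

# Hypothetical example: 9 of 10 insects choose the odour arm
print(round(binomial_exact_p(9, 10), 5))      # 0.01074 (one-tailed)
print(round(2 * binomial_exact_p(9, 10), 5))  # 0.02148 (two-tailed)
```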

The assumptions of the binomial test are:

  1. The n observations are mutually independent.
  2. The probability of a given outcome is the same for each observation.
  3. The only source of variation is simple binomial sampling variation.

 

 

The Sign test

The sign test is used for paired data where quantitative measurements are not possible, but where it is possible to rank the members of each pair (or the same individual before and after a treatment) for some characteristic with respect to each other. In other words, measurement is only possible on the ordinal scale. It also has an important use for testing paired data where quantitative measurements are possible, but where the distribution of differences is neither normal (in which case you could use the paired t-test) nor symmetrical (in which case you could use Wilcoxon's matched-pairs signed-ranks test).

We might for example compare preliminary results on species richness between two types of habitat. We do not feel we have sufficient data to precisely quantify species richness in each habitat type - but we do have sufficient data to say which habitat has the greater species richness. Those cases where habitat type 1 has a greater species richness than habitat type 2 are classified as +, those where the converse is true as −, and those where there is no difference are dropped from the analysis.

The sign test tests the null hypothesis (H0) that the number of pairs where habitat type 1 has a greater species richness than habitat type 2 is the same as the number of pairs where the converse is true. In other words the null hypothesis is that the probability of each is equal to 0.5. The sign test is thus equivalent to the binomial test where the parametric proportion is equal to 0.5.

Again for small samples you need to work out probabilities directly using the mass probability function for the binomial distribution. The probabilities of the observed result and of results more extreme are summed to give the required probability. The large sample test is given below:

Algebraically speaking -

z  =  (x − mp) / √(mpq)  =  (x − m/2) / (½√m)

Where:
  • z is the z-statistic which is compared to the standard normal deviate,
  • x is the frequency of the larger number of signed observations,
  • m is the number of paired scores where the difference has a sign (that is, the total number of pairs (n) less the number of tied or unsigned pairs),
  • p is the null probability of each sign (here p = q = ½).

 

Again you may wish to apply a continuity correction to obtain conventional P-values:

Algebraically speaking -

z  =  ((x − 0.5) − m/2) / (½√m)

Where all variables are defined as above.
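Both large-sample forms of the sign test can be sketched as follows (Python; the function name and counts are illustrative only):

```python
from math import sqrt

def sign_test_z(x, m, continuity=False):
    """Large-sample sign test: x is the frequency of the commoner sign,
    m the number of pairs whose difference has a sign (ties excluded)."""
    if continuity:
        return ((x - 0.5) - m / 2) / (0.5 * sqrt(m))
    return (x - m / 2) / (0.5 * sqrt(m))

# Hypothetical example: habitat 1 is the richer in 16 of 20 untied pairs
print(round(sign_test_z(16, 20), 3))                   # 2.683
print(round(sign_test_z(16, 20, continuity=True), 3))  # 2.460
```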

 

The assumptions of the sign test are:

  1. Each pair can be designated as a 'plus', a 'minus' or an unsigned 'tie'.
  2. The bivariate random variables are mutually independent.
  3. The probability of a given outcome is the same for each pair.

 

 

McNemar's test (for significance of change or for matched pairs)

McNemar's test is essentially the sign test under another name, applied when the response variable is binary. It is therefore used to compare paired proportions. This arises in several types of study design, such as 'before-and-after' studies and matched cohort or case-control designs. Say we are carrying out a 'before-and-after' study on 10 individuals which gives the following result ('+' = infected, '−' = uninfected):

Subject no.    Infection status
               Before    After
 1               +         −
 2               +         −
 3               −         −
 4               −         −
 5               +         −
 6               +         +
 7               +         −
 8               −         +
 9               +         −
10               +         −

Such data are often wrongly presented in a 2 by 2 contingency table thus:

How not to do it for paired observations!

Time period    Infected    Uninfected
Before             7            3
After              2            8

Correct tabulation for before-and-after study

                               After
                      Infected    Uninfected
Before  Infected          1            6
        Uninfected        1            2

Such a presentation loses the information on the individual pairs. Instead, data from matched-pairs studies should be presented so that you can see the number of concordant pairs (both members with the same characteristic) and the number of discordant pairs (members with different characteristics). To distinguish these from unpaired data we use the letters e, f, g, h rather than a, b, c, d:

Correct tabulation for matched case-control study

                              Control exposed to risk factor
                                   Yes         No
Case exposed        Yes             e           f
to risk factor      No              g           h

McNemar's test is also used for observational studies where each case is matched to one control. Here a 2 × 2 table is used to display whether each member of each pair is exposed or not to the risk factor.

The same simplified version of the binomial test formula for P=0.5 is used as in the sign test:

Algebraically speaking -

z  =  (f − m/2) / (½√m)  =  (f − g) / √(f + g)
Where:
  • z is the z-statistic which is compared to the standard normal deviate,
  • f is the number of individuals changing from present to absent
  • g is the number of individuals changing from absent to present
  • m is f+g, the total number of individuals who either changed from absent to present or present to absent.

 

Again we can apply a continuity correction if desired, namely:

Algebraically speaking -

z  =  (|f − g| − 1) / √(f + g)

We have used the standardised normal deviate as the test statistic for McNemar's test to demonstrate that it is derived from the sign test. However, you will also come across the test where X2 is used as the test statistic and tested against χ2 with one degree of freedom. For this we simply square the expression used to obtain 'z'.

Again for small samples you need to work out probabilities directly using the mass probability function for the binomial distribution. The probabilities of the observed result and of results more extreme are summed to give the required probability.
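Using the discordant counts from the before-and-after example above (f = 6, g = 1), these calculations can be sketched in Python (the function names are ours):

```python
from math import sqrt, comb

def mcnemar_z(f, g, continuity=False):
    """McNemar's test from the two discordant-pair counts f and g."""
    if continuity:
        return (abs(f - g) - 1) / sqrt(f + g)
    return (f - g) / sqrt(f + g)

def mcnemar_exact_p(f, g):
    """Two-tailed exact P-value: binomial tail probabilities with P = 0.5."""
    m, x = f + g, max(f, g)
    one_tail = sum(comb(m, k) for k in range(x, m + 1)) / 2**m
    return min(1.0, 2 * one_tail)

print(round(mcnemar_z(6, 1), 3))        # 1.890
print(round(mcnemar_exact_p(6, 1), 4))  # 0.125
```

Squaring z (here 1.890² ≈ 3.571) gives the X² form tested against χ² with one degree of freedom.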

The significance test is based entirely on the number of changes and does not take account of total sample size (e+f+g+h). However, to attach a confidence interval to the magnitude of the change, we must estimate the standard error to the difference between the proportions - and this does take account of total sample size:

Algebraically speaking -

SEp1−p2  =  (1/n) √( f + g − (f − g)²/n )
Where:
  • SEp1−p2 is the standard error of the difference between the proportions p1 − p2 or a/n − c/n or f/n − g/n,
  • f is the number of individuals changing from present to absent,
  • g is the number of individuals changing from absent to present,
  • n is e+f+g+h, the total number of individuals, or the number of paired observations.

To apply a continuity correction add 1/n to that standard error.
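As a sketch (Python; the z = 1.96 multiplier gives an approximate 95% interval, and the counts follow the before-and-after example above):

```python
from math import sqrt

def paired_diff_ci(f, g, n, z=1.96, continuity=False):
    """Approximate confidence interval for the difference between
    paired proportions, using SE = sqrt(f + g - (f - g)**2 / n) / n."""
    diff = (f - g) / n
    se = sqrt(f + g - (f - g) ** 2 / n) / n
    if continuity:
        se += 1 / n                    # continuity correction
    return diff - z * se, diff + z * se

# f = 6, g = 1 discordant pairs out of n = 10 subjects
low, high = paired_diff_ci(6, 1, 10)
print(round(low, 3), round(high, 3))   # 0.084 0.916
```

With so small a sample the interval is, as expected, extremely wide; the large-sample approximation should not be trusted here.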

This method is adequate for large samples, but otherwise performs poorly. The conventional exact conditional interval is prone to artefacts and has poor coverage. Exact unconditional intervals, profile likelihood intervals and score intervals have better properties but are seldom used. See Newcombe (1998) for more on estimating a confidence interval for the difference between paired proportions.

The assumptions of the McNemar test are:

  1. The measurement scale is nominal with only two categories for each member of the pairs
  2. The pairs of observations are mutually independent.

 

 

Cox and Stuart test for trend

An interesting, if little used, application of the sign test is to test for trend. Under certain circumstances it can be used to determine if there is a trend in observations on a sequence of random variables.

One possible application would be to test for a trend in mean maximum temperature over a period of, say, 100 years. The data are divided into pairs (Y1, Y1+c), (Y2, Y2+c) and so on, where the subscript offset c = n/2 if n is even and c = (n + 1)/2 if n is odd (the middle observation being eliminated). For example, with c = 100/2 = 50 years, the year 1 observation is paired with the year 51 observation, the year 2 observation with the year 52 observation, and so on. If the reading is greater for the second observation in the pair, the pair is denoted '+'; if less, '−'. Ties are eliminated from the data.

The formula used is the same as for the sign test. The upper-tailed test is used to detect an upward trend, and the lower-tailed test to detect a downward trend; a two-tailed test is used if the alternative hypothesis is that there is a trend of either type. The assumptions are as follows:

  1. The measurement scale is at least ordinal.
  2. The random variables are mutually independent.
  3. Values are either identically distributed or there is a trend.
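The pairing step is the only new mechanics; it can be sketched in Python (the function name and series are ours), after which the '+' and '−' counts go into the sign test:

```python
def cox_stuart_signs(y):
    """Pair each observation with the one half a series later and count
    rises ('+') and falls ('-'); ties and any middle value are dropped."""
    n = len(y)
    c = n // 2                     # with odd n the middle value is unused
    first, second = y[:c], y[n - c:]
    plus = sum(b > a for a, b in zip(first, second))
    minus = sum(b < a for a, b in zip(first, second))
    return plus, minus

# Hypothetical short series of mean maximum temperatures
print(cox_stuart_signs([14.1, 14.3, 13.9, 14.6, 14.8, 15.0, 14.9, 15.2]))  # (4, 0)
```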