InfluentialPoints.com
Biology, images, analysis, design...
"It has long been an axiom of mine that the little things are infinitely the most important" (Sherlock Holmes)

 

 
  1. Equal variance t-test

    Worked example

    Time (hours) from treatment to lambing

    Group     Observations                                  Mean   s2
    Control   45, 87, 123, 120, 70                          89.0   1104.5
    Treated   51, 71, 42, 37, 51, 78, 51, 49, 56, 47, 58    53.7   141.8

    This example uses hypothetical data on the effect of drug treatment on the length of time from treatment to lambing. It is based on a trial of the efficacy of dexamethasone to induce parturition in sheep. The design of this experiment is not ideal, with fewer animals in the control group than in the treatment group. Nevertheless, such designs are not uncommon in veterinary studies.

    We first look at the distributions for each group of observations, shown in the plots below. The distributions deviate from normality both for raw and transformed data. We will therefore postpone a decision on whether to transform until we have tested for equality of variances - but remember that the F-ratio test will not be very reliable for this, since it is sensitive to departures from normality.

    [Figure Figmb1.gif: distributions of the raw and log transformed observations for each group]

    We first carry out an F-ratio test on the variances of the untransformed data:

    F = 1104.500/141.818 = 7.79 with 4 and 10 degrees of freedom (P = 0.0081).

    Since the above result is highly significant, we carry out the same test on the variances of the log transformed data:

    F = 0.03284/0.00856 = 3.84 with 4 and 10 degrees of freedom (P = 0.077).

    Using R
            F test to compare two variances
    
    data:  untreated and treated
    F = 7.7881, num df = 4, denom df = 10, p-value = 0.008108
    alternative hypothesis: true ratio of variances is not equal to 1
    95 percent confidence interval:
      1.74296 68.87739
    sample estimates:
    ratio of variances
              7.788141
    
            F test to compare two variances
    
    data:  log10(untreated) and log10(treated)
    F = 3.8382, num df = 4, denom df = 10, p-value = 0.07691
    alternative hypothesis: true ratio of variances is not equal to 1
    95 percent confidence interval:
      0.858972 33.944408
    sample estimates:
    ratio of variances
               3.83818
    

    The test on the transformed data gives a borderline but non-significant P-value. Since with skewed distributions the F-ratio test is too liberal in reporting differences, we are justified in assuming equality of variances for the transformed data.
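    The two variance ratios can be re-derived from the raw data with a few lines of Python. This is only a sketch using the standard library: the variable names are ours, and the P-values are not computed because that needs the F distribution, which the standard library does not provide.

```python
import math
import statistics

# Lambing times (hours) from the table above
control = [45, 87, 123, 120, 70]
treated = [51, 71, 42, 37, 51, 78, 51, 49, 56, 47, 58]

def variance_ratio(a, b):
    """Ratio of the sample variances of a and b (each uses an n - 1 denominator)."""
    return statistics.variance(a) / statistics.variance(b)

f_raw = variance_ratio(control, treated)
f_log = variance_ratio([math.log10(x) for x in control],
                       [math.log10(x) for x in treated])

df_num, df_den = len(control) - 1, len(treated) - 1
print(f"raw data:   F = {f_raw:.2f} on {df_num} and {df_den} df")
print(f"log10 data: F = {f_log:.2f} on {df_num} and {df_den} df")
```

    The larger variance is placed in the numerator, as in the worked figures above.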

    Log transformed data
    Group       Mean     Variance
    Untreated   1.9214   0.03284
    Treated     1.7210   0.00856

    Hence we proceed with an equal variance t-test on the log transformed data. Since our sample sizes are neither equal nor large, we use the general formula to obtain the best estimate of the t-statistic. We will use a null hypothesis of no difference.

    \[ t_{df=14} = \frac{(1.9214 - 1.7210) - 0}{\sqrt{\dfrac{[(4 \times 0.03284) + (10 \times 0.00856)]\,(5 + 11)}{(5 + 11 - 2) \times 5 \times 11}}} = 2.984 \]

    For a two-tailed test with a t-statistic of 2.984 with 14 degrees of freedom, P = 0.00986. This suggests that the treatment is providing a significant reduction in lambing time.

    The 95% normal approximation confidence interval around the (transformed) treatment effect is obtained as follows, and then detransformed to give the ratio (control/treated) of the geometric means:

    95% CI = 0.200 ± (2.145 × 0.0672) = 0.056 to 0.344 (transformed scale)

    Ratio = 1.58 (95% CI 1.138 to 2.208) (detransformed scale)
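    As a cross-check, the pooled t-statistic, the confidence interval and the detransformed ratio can be recomputed from the raw data. The sketch below uses only the Python standard library. The critical value 2.145 is copied from the calculation above rather than computed, and all variable names are ours.

```python
import math
import statistics

control = [45, 87, 123, 120, 70]
treated = [51, 71, 42, 37, 51, 78, 51, 49, 56, 47, 58]
log_c = [math.log10(x) for x in control]
log_t = [math.log10(x) for x in treated]

n1, n2 = len(log_c), len(log_t)
diff = statistics.mean(log_c) - statistics.mean(log_t)

# Pooled variance: within-group sums of squares divided by (n1 + n2 - 2)
pooled_var = ((n1 - 1) * statistics.variance(log_c)
              + (n2 - 1) * statistics.variance(log_t)) / (n1 + n2 - 2)
se = math.sqrt(pooled_var * (n1 + n2) / (n1 * n2))
t_stat = diff / se

T_CRIT = 2.145  # two-tailed 5% point of t on 14 df, taken from the text
lower, upper = diff - T_CRIT * se, diff + T_CRIT * se

print(f"t = {t_stat:.3f} on {n1 + n2 - 2} df")
print(f"difference (log10 scale) = {diff:.3f}, 95% CI {lower:.3f} to {upper:.3f}")
print(f"ratio of geometric means = {10 ** diff:.2f}, "
      f"95% CI {10 ** lower:.2f} to {10 ** upper:.2f}")
```

    Raising 10 to the power of each limit is what turns the interval for the difference in log10 means into an interval for the ratio of geometric means.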

    Using R
    Two Sample t-test
    
    data:  log10(untreated) and log10(treated)
    t = 2.984, df = 14, p-value = 0.009858
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     0.05634678 0.34434251
    sample estimates:
    mean of x mean of y
     1.921383  1.721039
    

    We can conclude that the parturition time of untreated animals was on average 1.6 times (95% CI: 1.1 to 2.2) that of treated animals. However, the small number of animals in the control group and the uncertainty over whether allocation was random mean we cannot have much confidence in this result. The experiment should ideally be repeated with random allocation to similarly sized treatment groups.

     

     

  2. Unequal variance t-test

    Worked example

    Rather than transform the data above on the effect of drug treatment on the length of time from treatment to lambing, one might have been tempted to use the unequal variance t-test. Again we use a null hypothesis of no difference:

    \[ t' = \frac{89 - 53.727}{\sqrt{\dfrac{1104.500}{5} + \dfrac{141.818}{11}}} = 2.3069 \]

    Then we use method 1 to estimate the corrected degrees of freedom for t':
    \[ df = \frac{[(1104.5/5) + (141.818/11)]^2}{\dfrac{(1104.5/5)^2}{4} + \dfrac{(141.818/11)^2}{10}} = 4.47 \]

    Published statistical tables reveal the P-value for t' = 2.3069 at 4 degrees of freedom is 0.0823. Hence the difference is not significant at the 5% level. We would get the same answer using Method 2, where the critical value is estimated as 2.7462.
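    The unequal variance statistic and its corrected degrees of freedom can be reproduced as follows. This is a sketch using the standard library only. The degrees of freedom formula coded here is the Welch-Satterthwaite approximation, which matches the arithmetic shown for method 1. The P-value is not computed.

```python
import math
import statistics

control = [45, 87, 123, 120, 70]
treated = [51, 71, 42, 37, 51, 78, 51, 49, 56, 47, 58]

# Squared standard error of each group mean: s^2 / n
v1 = statistics.variance(control) / len(control)
v2 = statistics.variance(treated) / len(treated)

# The variances are not pooled; each group contributes its own term
t_welch = (statistics.mean(control) - statistics.mean(treated)) / math.sqrt(v1 + v2)

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch = (v1 + v2) ** 2 / (v1 ** 2 / (len(control) - 1)
                             + v2 ** 2 / (len(treated) - 1))

print(f"t' = {t_welch:.4f}, approximate df = {df_welch:.2f}")
```

    The result is well below the 14 degrees of freedom of the equal variance test, because the small control group contributes almost all of the variance of the difference.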

    The 95% normal approximation confidence interval around the (non-significant) treatment effect is obtained as follows:

    95% CI = 35.273 ± (2.665 × 15.2903) = −5.46 to 76.01

    Using R
            Welch Two Sample t-test
    
    data:  untreated and treated
    t = 2.3069, df = 4.474, p-value = 0.07533
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     −5.462216 76.007671
    sample estimates:
    mean of x mean of y
     89.00000  53.72727
    

    We would therefore conclude from the unequal variance test that we have no evidence of a difference between treatments in the time from treatment to lambing. This is the opposite of the conclusion we reached when we used an equal variance t-test on log transformed data. The main reason for the apparent contradiction is that the unequal variance t-test can be very conservative if sample sizes in each group differ greatly. Transformation followed by an equal variance test is usually preferable, provided a suitable variance stabilising transform can be found.

     

     

  3. Use of t-tests for a cross-over design

    Worked example

    Cross-over design

    Sequence group   Indiv   Period 1                      Period 2                      Diff
    1                1       43 (A1)                       15 (A2)                       +28
    1                2       41 (A1)                       19 (A2)                       +22
    1                3       44 (A1)                       27 (A2)                       +17
    1                4       47 (A1)                       26 (A2)                       +21
    1                Mean    $\bar{y}_{1,1}$ = 43.75       $\bar{y}_{1,2}$ = 21.75       $d_1$ = 22
    2                5       21 (A2)                       33 (A1)                       −12
    2                6       25 (A2)                       39 (A1)                       −14
    2                7       23 (A2)                       41 (A1)                       −18
    2                8       30 (A2)                       40 (A1)                       −10
    2                Mean    $\bar{y}_{2,1}$ = 24.75       $\bar{y}_{2,2}$ = 38.25       $d_2$ = −13.5

    We looked at the cross-over design in Unit 7. It is a within units design where one group of units receives treatment A1 followed by treatment A2, and the other group of units receives treatment A2 followed by treatment A1. One might be tempted to ignore the cross-over aspect of the design and just analyse such data with a paired t-test for all values of A1 - A2. However, this would only be valid if there were no period effect. If there is a period effect (say both treatments are more effective in the second period), then a simple paired t-test would give a misleading result. Instead one carries out a series of two-sample t tests.

    This example uses data gathered in a trial of two drugs used for pain relief for arthritis patients. Subjects were randomly allocated to two sequence groups. A sequence group is characterized by the order in which treatments are given. All subjects in sequence group (1) received treatment A1 followed later by treatment A2. All subjects in sequence group (2) received treatment A2 followed by treatment A1. The response variable was the level of pain experienced.

    We first display the results graphically. The two top figures below follow what happens to individual subjects through the trial. Irrespective of which order the drugs are given in, the pain score is lower when patients are on treatment A2 than when on A1.

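    [Figure Figmb2.gif: top panels, pain scores of individual subjects in each sequence group across the two periods; lower panel, mean responses by period]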

    The lower figure shows mean responses. These suggest a slight period effect - pain levels decrease from period 1 to period 2 - but it appears to affect both drugs similarly. In other words there is probably no period × treatment interaction. If there were an interaction, then the lines in the last figure would no longer be parallel and we could not analyse the data as a single cross-over experiment. To understand why this is so, and how to test these results, let us rearrange these means and their differences as follows:

    If we exclude random variation, the differences between these 4 means are due to a combination of the treatment effect, T, and period effect P.
    • T is the difference between treatment A1 and treatment A2
    • P is the difference between period 1 and period 2.
     
    Means and differences

    Effect       Period 1             difference   Period 2
    Group 1      $\bar{y}_{1,1}$      +P +T        $\bar{y}_{1,2}$
    difference   +T                                −T
    Group 2      $\bar{y}_{2,1}$      +P −T        $\bar{y}_{2,2}$

    In which case we would expect that $\bar{y}_{1,1} - P = \bar{y}_{2,2}$ and $\bar{y}_{2,1} - P = \bar{y}_{1,2}$, which explains why the bottom graph's lines ought to be parallel. Of course that simple relationship assumes T is the same in both periods, and that P and T are the same in both groups - in other words that these effects are both additive and independent.

    • If that assumption is correct we would expect no difference between group means. So $\bar{y}_{1,1} + \bar{y}_{1,2} = \bar{y}_{2,1} + \bar{y}_{2,2}$ and any observed difference between them is due to simple chance.
    • Alternatively, if these effects do interact, then we would expect a nonzero difference between group means - and can use a t-test of their difference to check for that interaction.

    Let us check for any interaction between time period and treatment (in other words if the relative efficacy of the two treatments is dependent on time period) by comparing the mean of the totals (A1+A2) of the two sequence groups.

    The mean of the subject totals for Group 1 = [58 + 60 + 71 + 73]/4 = 65.5
    The mean of the subject totals for Group 2 = [54 + 64 + 64 + 70]/4 = 63.0
    Comparing these two means with the equal variance t-test gives t = 0.496, df = 6, P = 0.638. Hence there is no significant interaction between period and treatment, and our analysis is valid.
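    The interaction check amounts to an ordinary equal variance t-test on the subject totals. A minimal sketch using the standard library follows. The helper name `pooled_t` is ours, and no P-value is computed.

```python
import math
import statistics

# (period 1, period 2) pain scores for each subject, from the table above
group1 = [(43, 15), (41, 19), (44, 27), (47, 26)]  # A1 then A2
group2 = [(21, 33), (25, 39), (23, 41), (30, 40)]  # A2 then A1

def pooled_t(a, b):
    """Equal variance two-sample t-test: returns (t, df)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / df
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se, df

# Each subject's total over both periods contains A1 and A2 once each
totals1 = [p1 + p2 for p1, p2 in group1]
totals2 = [p1 + p2 for p1, p2 in group2]

t_int, df_int = pooled_t(totals1, totals2)
print(f"group means of totals: {statistics.mean(totals1)} and {statistics.mean(totals2)}")
print(f"interaction test: t = {t_int:.3f} on {df_int} df")
```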

    Using R
    Two Sample t-test
    data:  y1 and y2
    t = 0.4959, df = 6, p-value = 0.6376
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -9.836093 14.836093
    sample estimates:
    mean of x mean of y
         65.5      63.0 

      Note that this test for interaction cannot distinguish between a period x treatment interaction and a difference between sequence groups. So it is very important that subjects are initially assigned at random to the two groups. In addition, the test for an interaction has low power because we are comparing means of totals with correspondingly fewer degrees of freedom. Hence it is advisable to use a 10% rather than 5% significance level.

    Again provided P and T do not interact, and excluding random variation, we would expect that:
    • since $\bar{y}_{1,1} - \bar{y}_{1,2} = +P + T = d_1$
    • then $\bar{y}_{2,1} - \bar{y}_{2,2} = +P - T = d_2$
    • and $d_1 - d_2 = P + T - (P - T) = P + T - P + T = 2T$

    • Also $d_1 + d_2 = P + T + P - T = 2P = d_1 - (-d_2)$

    Therefore, the estimated treatment effect, T, is (d1 − d2)/2.

     

    The mean difference for period 1 to period 2 for group 1 (d1) is 22
    The mean difference for period 1 to period 2 for group 2 (d2) is −13.5
    An F-ratio test shows that variances can be assumed equal, so we assess d1−d2 with the equal variance t-test. This gives t = 12.48 with df = 6 for which P = 0.000016. This indicates a highly significant treatment effect with a mean difference between treatments of (d1 − d2)/2 = 17.8 units.

    The 95% normal approximation confidence interval around the treatment effect is obtained as follows:

    95% CI = 17.8 ± (0.5 × 2.447 × 2.843) = 14.3 to 21.3

    We check for a period effect, P, by assessing (d1 + d2)/2, that is by comparing d1 with −d2, using an equal variance t-test.
    In this case t = 2.99, df=6, P = 0.024. Hence we have a significant period effect.
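    Both tests on the within-subject differences can be reproduced with the same pooled t-test. The sketch below uses the standard library only. The helper name `pooled_t` is ours, and the critical value 2.447 is copied from the confidence interval calculation above. Working with the unrounded estimate of 17.75 gives an upper limit of 21.2 rather than 21.3.

```python
import math
import statistics

# (period 1, period 2) pain scores for each subject
group1 = [(43, 15), (41, 19), (44, 27), (47, 26)]  # A1 then A2
group2 = [(21, 33), (25, 39), (23, 41), (30, 40)]  # A2 then A1

def pooled_t(a, b):
    """Equal variance two-sample t-test: returns (t, df, SE of the difference)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / df
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se, df, se

# Period 1 minus period 2 for every subject
d1 = [p1 - p2 for p1, p2 in group1]
d2 = [p1 - p2 for p1, p2 in group2]

# Treatment effect: compare d1 with d2; the means differ by 2T
t_treat, df, se = pooled_t(d1, d2)
T_hat = (statistics.mean(d1) - statistics.mean(d2)) / 2

# Period effect: compare d1 with -d2; the means differ by 2P
t_period, _, _ = pooled_t(d1, [-d for d in d2])
P_hat = (statistics.mean(d1) + statistics.mean(d2)) / 2

T_CRIT = 2.447  # two-tailed 5% point of t on 6 df, taken from the text
half_width = 0.5 * T_CRIT * se
print(f"treatment: T = {T_hat}, t = {t_treat:.2f} on {df} df, "
      f"95% CI {T_hat - half_width:.1f} to {T_hat + half_width:.1f}")
print(f"period:    P = {P_hat}, t = {t_period:.2f} on {df} df")
```

    Negating d2 leaves its variance unchanged, so both tests share the same standard error and differ only in the numerator.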

     

     

  4. The weighted t-test

    Worked example

    Malaria prevalence in intervention and control villages

                      Intervention group (1)    Control group (2)
    Village           No.     Prev (%)          No.     Prev (%)
    1                 38      31                27      45
    2                 65      35                21      31
    3                 32      41                94      51
    4                 75      34                31      41
    5                 122     22                102     61
    Mean                      32.6                      45.8
    Weighted mean             30.1                      51.5

    Trials of vector control methods are often carried out using cluster randomization because the effects of treatment tend to act at the group rather than individual level. We will take a hypothetical example of a trial of insecticide impregnated bed nets targeted at Anopheles mosquitoes for control of malaria. Ten villages are included in the trial. Five are allocated at random to receive a bed net intervention. The outcome variable is the prevalence of malaria parasites in children aged 1-5 years. The number of eligible children differs between villages, so a weighted analysis is required.

    We first estimate the weighted mean prevalence of each treatment group:

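    \[ \bar{y}_{w(1)} = \frac{(38 \times 31) + (65 \times 35) + \dots + (122 \times 22)}{38 + 65 + \dots + 122} = 30.1 \]
    \[ \bar{y}_{w(2)} = \frac{(27 \times 45) + (21 \times 31) + \dots + (102 \times 61)}{27 + 21 + \dots + 102} = 51.5 \]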

    The weighted means are markedly different from the unweighted means, with a larger apparent treatment effect. This is mainly because the two largest villages in the control group had higher prevalences than the others. Note that the weighted mean prevalences can also be obtained by dividing the total number of infections in each group by the total number of children.
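    The weighted and unweighted means can be compared side by side with a short sketch. The function names are ours, and each village is weighted by its number of children.

```python
# (number of children, prevalence %) for each village, from the table above
intervention = [(38, 31), (65, 35), (32, 41), (75, 34), (122, 22)]
control = [(27, 45), (21, 31), (94, 51), (31, 41), (102, 61)]

def weighted_mean(villages):
    """Mean prevalence with each village weighted by its number of children."""
    total_children = sum(n for n, _ in villages)
    return sum(n * prev for n, prev in villages) / total_children

def unweighted_mean(villages):
    """Simple mean of the village prevalences, ignoring village size."""
    return sum(prev for _, prev in villages) / len(villages)

for name, villages in (("intervention", intervention), ("control", control)):
    print(f"{name}: unweighted {unweighted_mean(villages):.1f}, "
          f"weighted {weighted_mean(villages):.1f}")
```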

    We then calculate the weighted variance of each treatment group:

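    \[ s^2_{w(1)} = \frac{\dfrac{(38 \times 31^2) + \dots + (122 \times 22^2)}{66.4} - (5 \times 30.1^2)}{5 - 1} = 56.053 \]

    \[ s^2_{w(2)} = \frac{\dfrac{(27 \times 45^2) + \dots + (102 \times 61^2)}{55.0} - (5 \times 51.5^2)}{5 - 1} = 98.338 \]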

    Since the F-ratio for these two variances is not significant, we can use the equal variance t-test. This gives a t-value of 3.851 with 8 degrees of freedom (P = 0.005). The mean difference between prevalences is 21.4% (95% CI 8.6 to 34.2).
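    The weighted variances and the resulting t-statistic can be checked as below. Two points are our reading of the arithmetic rather than anything stated in the text: the divisors 66.4 and 55.0 are the mean number of children per village (332/5 and 275/5), and the worked figures are only reproduced when the weighted means are first rounded to one decimal place (30.1 and 51.5). The critical value 2.306 is the standard tabulated two-tailed 5% point of t on 8 degrees of freedom.

```python
import math

# (number of children, prevalence %) for each village
intervention = [(38, 31), (65, 35), (32, 41), (75, 34), (122, 22)]
control = [(27, 45), (21, 31), (94, 51), (31, 41), (102, 61)]

def weighted_mean_and_variance(villages, decimals=1):
    """Weighted mean (rounded as in the worked example) and weighted variance."""
    k = len(villages)
    sizes = [n for n, _ in villages]
    mean_size = sum(sizes) / k
    # Assumption: round the mean before squaring, to match the worked figures
    w_mean = round(sum(n * p for n, p in villages) / sum(sizes), decimals)
    w_var = (sum(n * p ** 2 for n, p in villages) / mean_size
             - k * w_mean ** 2) / (k - 1)
    return w_mean, w_var

m1, v1 = weighted_mean_and_variance(intervention)
m2, v2 = weighted_mean_and_variance(control)
k1, k2 = len(intervention), len(control)

# Equal variance t-test with villages as the units of analysis
pooled_var = ((k1 - 1) * v1 + (k2 - 1) * v2) / (k1 + k2 - 2)
se = math.sqrt(pooled_var * (1 / k1 + 1 / k2))
diff = m2 - m1
t_stat = diff / se

T_CRIT = 2.306  # two-tailed 5% point of t on 8 df (table value, not from the text)
lower, upper = diff - T_CRIT * se, diff + T_CRIT * se

print(f"weighted variances: {v1:.3f} and {v2:.3f} (F = {v2 / v1:.2f})")
print(f"t = {t_stat:.3f} on {k1 + k2 - 2} df")
print(f"difference = {diff:.1f}%, 95% CI {lower:.1f} to {upper:.1f}")
```

    Without the rounding step the variances come out slightly different (about 54.7 and 102.8), so the rounding should be treated as a feature of this worked example, not of the method.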

    If we had ignored the different sample sizes we would still have found no significant difference between the variances, and proceeded with an unweighted equal variance t-test. This would have given a t-value of 2.241 for which P = 0.055, in other words not quite significant at the conventional P = 0.05 level.
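    For comparison, the unweighted analysis treats the five village prevalences in each group as ordinary observations. A minimal sketch using the standard library, with variable names of our choosing:

```python
import math
import statistics

# Village prevalences (%), ignoring the number of children per village
prev_intervention = [31, 35, 41, 34, 22]
prev_control = [45, 31, 51, 41, 61]

n1, n2 = len(prev_intervention), len(prev_control)
pooled_var = ((n1 - 1) * statistics.variance(prev_intervention)
              + (n2 - 1) * statistics.variance(prev_control)) / (n1 + n2 - 2)
se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_unweighted = (statistics.mean(prev_control)
                - statistics.mean(prev_intervention)) / se

print(f"unweighted t = {t_unweighted:.3f} on {n1 + n2 - 2} df")
```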