"It has long been an axiom of mine that the little things are infinitely the most important"
Cohort designs: Use & misuse
(fixed cohorts, dynamic cohorts, prospective, retrospective, exposure misclassification, observer bias)
Statistics courses, especially for biologists, assume formulae = understanding and teach how to do statistics, but largely ignore what those procedures assume, and how their results mislead when those assumptions are unreasonable. The resulting misuse is, shall we say, predictable...
Use and Misuse

The cohort design is a prospective or (less commonly) retrospective observational design in which the groups (cohorts) are determined by the level of the explanatory variable. Cohorts may be fixed (every individual in a cohort starts at the same time and is followed up for a similar period of time) or dynamic (individuals are recruited to, or leave, the cohort at different times). Individuals within cohorts are followed up over time, usually to determine the incidence of the condition under study.
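The two incidence measures implied here differ in their denominator: cumulative incidence (risk) divides cases by the number of individuals starting a fixed cohort, whereas the incidence rate divides cases by person-time at risk, which is the natural measure for a dynamic cohort with unequal follow-up. A minimal sketch, with entirely hypothetical follow-up data:

```python
# Hypothetical cohort of 8 individuals: years observed, and whether
# each developed the condition (1) or not (0) during follow-up.
followup_years = [2.0, 5.0, 5.0, 1.5, 5.0, 3.0, 5.0, 4.0]
developed      = [1,   0,   0,   1,   0,   1,   0,   0]

cases = sum(developed)

# Cumulative incidence (risk): cases / individuals at the start,
# appropriate for a fixed cohort with (near-)complete follow-up.
risk = cases / len(developed)

# Incidence rate: cases / total person-time at risk, appropriate
# for a dynamic cohort where follow-up periods differ.
person_years = sum(followup_years)
rate = cases / person_years

print(f"cumulative incidence = {risk:.3f}")            # 3/8 = 0.375
print(f"incidence rate = {rate:.3f} per person-year")  # 3/30.5 = 0.098
```

Note that the rate treats the 1.5 years contributed by an early case the same as 1.5 years contributed by anyone else; it assumes risk is roughly constant over the follow-up period.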
Cohort studies are widely used over a range of disciplines, most heavily in medical and veterinary epidemiology. The unit of study is usually the individual person or animal, although in wildlife studies other entities (such as the nest) can be an appropriate unit. Follow-up is usually straightforward in human and veterinary studies, but in wildlife studies is only possible using radio telemetry or mark-release-recapture. The cohort design is sometimes confused with other designs, in particular (in medical research) with the case-control design. In addition there is sometimes no control (or comparison) group, even when such a group is sorely needed.
Since there is no random allocation to 'treatment', the risk of confounding by other factors is present in all studies. Hence data should always be gathered on all (known) possible confounding factors that might affect outcome, so that estimates can be corrected at the analysis stage. We give several examples where this is done - and rather more where it is not. The problem is especially serious if two independent groups are selected without ensuring that the groups differ only with respect to the study variable. Surprisingly, some studies could have been regarded as experiments had the researcher bothered to randomly allocate individuals to treatment.
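The classic symptom of confounding is a crude (pooled) estimate that disagrees with the stratum-specific estimates. A minimal sketch with made-up numbers, in which a confounder (say, age group) is associated with both exposure and outcome, so the crude risk ratio overstates the association seen within every stratum:

```python
# Hypothetical cohort data stratified by a confounder (age group).
# Each stratum: (exposed cases, exposed total, unexposed cases, unexposed total)
strata = {
    "young": (10, 100, 20, 400),
    "old":   (80, 400, 10, 100),
}

def risk_ratio(a, n1, c, n0):
    """Risk ratio: (cases/total) in exposed over (cases/total) in unexposed."""
    return (a / n1) / (c / n0)

# Crude analysis: pool the strata, ignoring the confounder.
a  = sum(s[0] for s in strata.values())
n1 = sum(s[1] for s in strata.values())
c  = sum(s[2] for s in strata.values())
n0 = sum(s[3] for s in strata.values())
print(f"crude RR = {risk_ratio(a, n1, c, n0):.1f}")   # 3.0

# Stratified analysis: the association within each age group.
for name, s in strata.items():
    print(f"{name}: RR = {risk_ratio(*s):.1f}")       # 2.0 in both strata
```

Here the exposed are mostly old and the old have higher risk regardless of exposure, so pooling inflates the crude risk ratio from 2.0 to 3.0. This is why data on known confounders must be collected: without the stratifying variable, only the misleading crude estimate is available.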
Exposure misclassification is common, through both measurement error and bias. It can result from using too broad a definition of exposure, which dilutes the effect - a study on Gulf War syndrome is a good example of this. Misclassification is especially common in retrospective studies, where one is reliant upon inadequate written records to assess exposure. Recall bias is a serious problem if one is dependent on interviewing survivors. Misclassification of outcome can occur if, for example, disappearances are equated to mortality, or if there is observer bias. Whenever possible those recording outcome measures should be blinded to exposure - we give several examples where the lack of blinding may have biased the measurement of highly subjective variables.
Sample size is often a problem, especially in wildlife studies using telemetry. Transmitters are expensive and there is always a temptation to reduce the number of animals to reduce cost. A high loss to follow-up also reduces the sample size, but in addition can introduce bias if losses are not 'missing at random'. With mark-recapture one obviously gets a much larger loss to follow-up, and one has to assume that recaptured individuals are typical of the whole population.
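A quick sketch, with made-up numbers, of why losses that are not 'missing at random' bias the estimate rather than merely shrink it. Suppose animals that develop the condition are more likely to disappear before the outcome is recorded:

```python
# Hypothetical fixed cohort: 100 animals, of which 20 truly develop
# the condition, so true cumulative incidence = 0.20.
n, cases = 100, 20

# Informative loss to follow-up: half the cases are lost before the
# outcome is recorded (e.g. sick animals disperse or die undetected),
# while no non-cases are lost.
lost_cases = 10
observed_incidence = (cases - lost_cases) / (n - lost_cases)

print(f"true incidence = {cases / n:.2f}, "
      f"observed incidence = {observed_incidence:.3f}")
# true incidence = 0.20, observed incidence = 0.111
```

Had the same 10 losses been spread at random over cases and non-cases, the estimate would stay (on average) at 0.20 with just a wider confidence interval; because losses are concentrated among cases, the incidence is badly underestimated. Equating disappearances to mortality makes the opposite error.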
What the statisticians say

Woodward (2004) and Rothman & Greenland (1997) both provide detailed accounts of the cohort design for medical epidemiologists. Armitage & Berry (2002) and Streiner & Norman (1998) provide briefer descriptions of the design. Thrusfield (2005) gives a tabular comparison of several observational designs for veterinarians, including the cohort design. More extensive coverage is given by Dohoo et al. (2003). Wobeser (2008) covers cohort studies on wildlife in Chapter 6. Priede & Swift (eds.) (1992) provide extensive coverage of wildlife telemetry, including a section on design and analysis of cohort studies. White & Garrott (1990) look at the analysis of wildlife radio-tracking data.
Lu (2009) reviews observational designs and strategies to reduce confounding. Vandenbroucke et al. (2008) compare observational research and randomised trials in medical research. von Elm et al. (2007) provide guidelines for strengthening the reporting of cohort, case-control and cross-sectional studies in epidemiology. Blackmore & Cummings (2007) describe the use of cohort and case-control designs in radiology research. Euser et al. (2009) look at the strengths and weaknesses of prospective versus retrospective cohort studies. In a series of three articles, Gurwitz et al. (2005), Mamdani et al. (2005) and Normand et al. (2005) look at the design of cohort studies, assess the potential for confounding and review analytical strategies. Grimes & Schulz (2002) provide a review of cohort designs, along with an excellent section on 'what to look for in a cohort study'. Szklo (1998) reviews population-based cohort studies. Cummings et al. (2003) and Cummings & McKnight (2004) describe the use and analysis of matched cohort methods in injury research. Costanza (1995), Greenland & Morgenstern (1990), Kupper et al. (1981) and Rubin (1973) discuss matching and efficiency in cohort studies.
Zou (2004) proposes using a Poisson regression approach with robust error variance to analyze prospective studies with binary data. Robbins et al. (2002) argue that for cohort studies the use of logistic regression should be sharply curtailed, and instead propose using a generalized linear model with a binomial distribution and a log link. A biased estimator of the risk ratio from the odds ratio given by logistic regression in a cohort study is rediscovered by Zhang (1998) and by McNutt et al. (2003); the correct method is given by Greenland & Holland (1991).
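The reason these authors caution against logistic regression for cohort data is that in a cohort study the risk ratio can be estimated directly, whereas logistic regression yields an odds ratio, which only approximates the risk ratio when the outcome is rare. A sketch of the two measures from a hypothetical 2 × 2 table in which the outcome is fairly common:

```python
# Hypothetical cohort 2x2 table:
#               cases   non-cases
# exposed         a         b
# unexposed       c         d
a, b = 60, 140
c, d = 30, 170

risk_exposed   = a / (a + b)               # 0.30
risk_unexposed = c / (c + d)               # 0.15

# Risk ratio: directly estimable in a cohort study (what a
# log-binomial GLM or Zou's robust-Poisson approach targets).
rr = risk_exposed / risk_unexposed         # 2.00

# Odds ratio: what logistic regression estimates.
odds_ratio = (a * d) / (b * c)             # 2.43

print(f"RR = {rr:.2f}, OR = {odds_ratio:.2f}")
```

With an overall case proportion above 20%, the odds ratio (2.43) noticeably overstates the risk ratio (2.00); as the outcome becomes rarer, b approaches a + b and d approaches c + d, and the two measures converge. This is why reporting a logistic-regression odds ratio as if it were a risk ratio misleads in cohort studies of common outcomes.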