Biology, images, analysis, design...

"It has long been an axiom of mine that the little things are infinitely the most important"

Measures of relationship between variables: Use and misuse

(Risk ratio, odds ratio, rate ratio, scatterplots, correlation, regression)

Statistics courses, especially for biologists, assume formulae = understanding and teach how to do ...

Use and Misuse

We first consider the use of summary measures to describe association between nominal variables. Odds and risk ratios are very heavily used in medical and veterinary applications, although less so by other applied biologists. Statisticians differ in their attitudes to odds ratios: some consider them the best thing since sliced bread, others hold that they should be used only if there is absolutely no alternative. For case control studies ... We give ...

For measurement variables, scatterplots are widely used to display the association. Further analysis is done using regression or correlation. These make certain assumptions about the distribution of each variable, and also assume that the relationship between them is truly linear. We give several examples where these assumptions are not met, and where the analyses have been applied to non-linear relationships. Examples of 'influential' points in scatterplots abound in the literature, arguably more so than outliers, and we have several examples of this. Other common problems are extending regression lines beyond the limits of the observations, and giving only the regression line without the data points. Correlation and regression are sometimes wrongly used to assess agreement between variables rather than association, although the example we give does this correctly using a line of equality.

Irrespective of the measure of association, the commonest misuse in analysing relationships is to assume that a close association between two variables proves that changes in one variable cause changes in the other.
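
One way to see why a close association need not be causal is a small simulation. The sketch below is illustrative only: the variables x, y and z, the noise levels and the random seed are all assumptions invented for this example, not data from any study discussed here. A hidden variable z drives both x and y, neither of which affects the other. Even so, x and y come out strongly correlated, and the correlation all but disappears once z is allowed for.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def residuals(y, z):
    """Residuals of y after a least-squares straight-line fit on z."""
    n = len(y)
    mean_y = sum(y) / n
    mean_z = sum(z) / n
    slope = (sum((a - mean_z) * (b - mean_y) for a, b in zip(z, y))
             / sum((a - mean_z) ** 2 for a in z))
    return [b - mean_y - slope * (a - mean_z) for a, b in zip(z, y)]


rng = random.Random(1)  # fixed seed (arbitrary) so the sketch is repeatable
n = 2000
z = [rng.gauss(0, 1) for _ in range(n)]    # hypothetical hidden variable
x = [zi + rng.gauss(0, 0.5) for zi in z]   # x depends on z only, not on y
y = [zi + rng.gauss(0, 0.5) for zi in z]   # y depends on z only, not on x

r_xy = pearson_r(x, y)
# Correlation between x and y after removing the linear effect of z from both
r_xy_given_z = pearson_r(residuals(x, z), residuals(y, z))

print(f"r(x, y)              = {r_xy:.2f}")
print(f"r(x, y) allowing for z = {r_xy_given_z:.2f}")
```

With these settings the population correlation between x and y is 0.8 (the shared variance of 1 divided by the total variance of 1.25), while the correlation between the two sets of residuals should be close to zero.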
Unfortunately, association alone can never prove causation, because there are many ways in which a spurious association can arise, including simple random selection. For each example we look at whether sufficient consideration has been given to the possibility of bias (often caused by non-random sampling) or to the presence of confounding variables. The converse error is to assume that there is no relationship just because the association is not significant: there may really be no relationship, but the sample size may equally well be too small to detect one. We also give examples where it is not clear whether X is causing Y, or vice versa.

What the statisticians say

Woodward (1999)
Sistrom & Garvan (2004)
Kuo (2002)
Wikipedia provides sections on response and explanatory variables.
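
The relationship between the odds ratio and the risk ratio mentioned under Use and Misuse can be made concrete with a short calculation. The counts below are hypothetical, invented purely for illustration, and the function names risk_ratio and odds_ratio are our own. The sketch suggests one reason why attitudes to the odds ratio differ: when the outcome is rare the two measures nearly coincide, but when the outcome is common the odds ratio lies much further from 1 than the risk ratio.

```python
def risk_ratio(a, b, c, d):
    """Risk ratio from a 2x2 table.

    a, b: exposed individuals with and without the outcome
    c, d: unexposed individuals with and without the outcome
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed


def odds_ratio(a, b, c, d):
    """Odds ratio from the same 2x2 table (cross-product ratio)."""
    return (a * d) / (b * c)


# Hypothetical counts (a, b, c, d), made up for this illustration
rare_outcome = (10, 990, 5, 995)       # risks of 1% and 0.5%
common_outcome = (600, 400, 300, 700)  # risks of 60% and 30%

for label, table in (("rare", rare_outcome), ("common", common_outcome)):
    rr = risk_ratio(*table)
    orr = odds_ratio(*table)
    print(f"{label:>6} outcome: risk ratio = {rr:.2f}, odds ratio = {orr:.2f}")
```

In both tables the risk in the exposed group is twice that in the unexposed group, so the risk ratio is 2.00 each time. The odds ratio is about 2.01 for the rare outcome but 3.50 for the common one, so reading an odds ratio as if it were a risk ratio overstates the association when the outcome is common.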