"It has long been an axiom of mine that the little things are infinitely the most important" (Sherlock Holmes)




Conditional Logistic Regression

Conditional logistic regression is appropriate for individually matched case-control data. It is usually not appropriate for frequency-matched case-control data, which should be analyzed using ordinary logistic regression with stratum as a covariate.

The conditional logistic regression model can then be specified as below:

Algebraically speaking -

logit(p)   =   β1X1 + β2X2 + ... + βkXk + αstratum(i)
  • p is the probability of being a case
  • β1 to βk are the regression coefficients, which represent log odds ratios; they are more interpretable in exponentiated form (exp(β) or e^β), which converts them to odds ratios
  • α1 to αs are the stratum constants, one for each of the s strata

Note that we now have stratum effects (one for each case-control pair or matched set) but no intercept (β0). We are also making one assumption beyond those required for ordinary logistic regression, namely that the odds ratio for each explanatory variable is the same in all strata.

The usual unconditional maximum likelihood estimation methods should not (and often cannot) be used here because there are too many parameters: one for each stratum. For the commonest situation (1:1 matching) with a single explanatory variable, unconditional analysis of matched-pair data results in an estimate of the odds ratio that is the square of the correct conditional one, so an odds ratio of 2 will be reported as 4. This bias is greatest for matched pairs but persists with up to 10 cases and 10 controls in each set (see Breslow & Day).
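The squaring effect for 1:1 matching can be checked numerically. The following is a minimal sketch, not a general-purpose fitting routine: it assumes a single binary exposure, the pair counts are hypothetical, and all function names are ours. The unconditional fit gives every pair its own α (found by bisection for each trial β), while the conditional fit uses only the within-pair comparison, from which α has dropped out.

```python
import math

def log_sigmoid(z):
    # Numerically stable log of the logistic function 1 / (1 + exp(-z)).
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))

def sigmoid(z):
    return math.exp(log_sigmoid(z))

def pair_unconditional_loglik(beta, x_case, x_control):
    # Log-likelihood of one pair under logit(p) = beta*x + alpha, with the
    # pair's own alpha set to its maximum likelihood value for this beta.
    # The score in alpha, (1 - p_case) - p_control, decreases in alpha,
    # so bisection finds its root.
    lo, hi = -30.0, 30.0
    for _ in range(60):
        alpha = (lo + hi) / 2
        score = 1 - sigmoid(alpha + beta * x_case) - sigmoid(alpha + beta * x_control)
        if score > 0:
            lo = alpha
        else:
            hi = alpha
    alpha = (lo + hi) / 2
    return log_sigmoid(alpha + beta * x_case) + log_sigmoid(-(alpha + beta * x_control))

def pair_conditional_loglik(beta, x_case, x_control):
    # Log of exp(beta*x_case) / (exp(beta*x_case) + exp(beta*x_control));
    # the stratum constant alpha does not appear.
    return log_sigmoid(beta * (x_case - x_control))

def maximise(loglik, pairs, lo=-5.0, hi=5.0):
    # Ternary search over beta; both log-likelihoods are concave in beta.
    def total(beta):
        return sum(n * loglik(beta, xc, xk) for (xc, xk), n in pairs.items())
    for _ in range(80):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if total(m1) < total(m2):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2

# Hypothetical data: (case exposure, control exposure) -> number of pairs.
pairs = {(1, 0): 20, (0, 1): 10, (1, 1): 5, (0, 0): 15}

or_conditional = math.exp(maximise(pair_conditional_loglik, pairs))
or_unconditional = math.exp(maximise(pair_unconditional_loglik, pairs))
print(f"conditional OR   = {or_conditional:.2f}")
print(f"unconditional OR = {or_unconditional:.2f}")
```

With these counts the conditional estimate is the ratio of discordant pairs, 20/10 = 2, and the unconditional estimate comes out at approximately 4. The concordant pairs shift neither estimate, since they add only a constant to each log-likelihood.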

If, however, we use conditional maximum likelihood estimation, the nuisance stratum parameters are eliminated from the likelihood and we can estimate the odds ratios directly.
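For 1:1 matching the elimination can be written out directly. As a sketch, let x_{i1} and x_{i0} be the explanatory variables of the case and the control in pair i (this notation is assumed here, not taken from the model statement above). Conditioning on the fact that exactly one member of the pair is a case, the contribution of pair i to the likelihood is:

```latex
\begin{aligned}
L_i(\beta)
  &= \frac{\exp(\alpha_i + \beta^\top x_{i1})}
          {\exp(\alpha_i + \beta^\top x_{i1}) + \exp(\alpha_i + \beta^\top x_{i0})} \\
  &= \frac{\exp(\beta^\top x_{i1})}
          {\exp(\beta^\top x_{i1}) + \exp(\beta^\top x_{i0})}
\end{aligned}
```

The factor exp(α_i) cancels between numerator and denominator, so the product of the L_i over all pairs depends on β alone. A pair in which case and control have identical explanatory variables contributes a constant 1/2 and carries no information about β.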