When asked about the advantages of a case-control design, students frequently respond that such studies are “cheap and easy.” This assertion typifies the view that case-control studies are inferior, yet it is puzzling in light of their major contribution to identifying important risk factors for disease for more than 50 years.
Case-control studies were the earliest type of epidemiologic study formally to use comparison. In 1926, a study of breast cancer, acknowledged as the earliest case-control study, reported being single and having few births as risk factors, a finding since replicated in numerous studies. Under the Third Reich, 2 case-control studies reported an association between lung cancer and smoking, but it was not until 1950, with the publication of 4 case-control studies in the United Kingdom and the United States, that the link between smoking and cancer reached a wider medical audience, one initially skeptical of both the association and the method. Subsequent decades saw refinements to case-control design and analysis (including the odds ratio) and a widening application across diseases and questions, from evaluation of screening to outbreak investigation.
The use of case-control designs in ophthalmology was relatively slow to develop. Although in 1941 Gregg described an increased prevalence of maternal rubella among infants with congenital cataract compared with a control series, it was not until the mid-1970s and the 1980s that case-control designs started to be adopted. Even now, the case-control design remains underused in ophthalmology, perhaps because of perceived methodologic difficulties.
Dirty Hands and Clean Minds
Geoffrey Rose, Professor of Epidemiology at the London School of Hygiene & Tropical Medicine, used this simple maxim to teach students. Epidemiologists must be aware of the biases inherent in observational studies and must assess these carefully when interpreting results. Aspects of case-control studies that give the most concern are the choice of controls and the reliability of participants’ memories (recall of exposure before disease onset). The control group should represent the population from which the cases derive. Hospital controls frequently have been used for hospital cases. They are easy to identify and often are happy to participate. The drawback is that hospital controls may be biased with respect to the exposure. For example, hospital patients are more likely to be smokers than the general population. Awareness of the potential biases in hospital controls led researchers to prefer population controls, exclusively or in addition to hospital controls. Population controls also may be unrepresentative if participation rates are low and factors associated with participation are biased with respect to the exposure. Other biased control groups are spouses of cases or volunteers; spouses in particular may share the same lifestyle. A preferable design for common conditions is a population-based study with cases and controls selected from the same population.
Despite these concerns, results from studies using hospital controls are not necessarily biased compared with those using population controls. Meta-analyses of smoking and age-related macular degeneration (AMD) and of iris color and uveal melanoma report no differences in results according to whether the controls were hospital based or population based. In the smoking and AMD meta-analysis, the results were also comparable between the 2 types of study design: for prospective studies, the risk estimate was 1.61 (95% confidence interval, 1.01 to 2.57), and for case-control studies, the risk estimate was 1.76 (95% confidence interval, 1.56 to 1.99).
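The comparability of the two pooled estimates can be checked with a rough calculation. The sketch below is an illustration only, not the method used in the meta-analysis: it assumes the two estimates are independent and approximately log-normal, recovers each standard error from the reported 95% confidence interval, and tests the difference on the log scale. The function names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def log_se_from_ci(lower, upper, level=0.95):
    """Approximate standard error of a log ratio from its confidence limits."""
    z = STD_NORMAL.inv_cdf(0.5 + level / 2)
    return (math.log(upper) - math.log(lower)) / (2 * z)

def compare_ratios(est_a, ci_a, est_b, ci_b):
    """Ratio of two ratio estimates with a two-sided z test (assumes independence)."""
    se_a = log_se_from_ci(*ci_a)
    se_b = log_se_from_ci(*ci_b)
    diff = math.log(est_a) - math.log(est_b)
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = diff / se_diff
    p_value = 2 * (1 - STD_NORMAL.cdf(abs(z)))
    return math.exp(diff), z, p_value

# Estimates quoted in the text: case-control versus prospective studies.
ratio, z, p_value = compare_ratios(1.76, (1.56, 1.99), 1.61, (1.01, 2.57))
print(f"ratio of estimates = {ratio:.2f}, z = {z:.2f}, p = {p_value:.2f}")
```

Under these assumptions the ratio of the two estimates is about 1.09, and the wide interval around the prospective estimate means the difference is far from conventional statistical significance.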
Errors in the reporting of exposures are inevitable in all epidemiologic study designs. If reporting errors occur in a similar proportion of both cases and controls, the odds ratios will usually, but not invariably, be biased toward the null. Recall bias occurs when reporting errors are more likely in cases than in controls or vice versa. Although researchers worry about the potential for recall bias, it is difficult to establish whether recall bias has occurred and, if it has, in which direction it has biased the results. The few studies to investigate this found that recall bias varied according to the exposure, the disease, and the level of awareness of the risk factor in the study population. An elegant study comparing responses on sunlight exposure given before and after a diagnosis of melanoma found recall bias in some but not all of the exposures. The estimates of risk based on retrospectively and prospectively reported exposures were very similar, apart from solarium use.
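The effect of reporting errors on the odds ratio can be illustrated with expected proportions. The sketch below is a simplified illustration for a single binary exposure: the true exposure prevalences, sensitivities, and specificity are invented values chosen for the example, not data from any study, and the function names are hypothetical. It ignores sampling variation and exposures with more than two levels, which are among the reasons the attenuation is usual but not invariable.

```python
def odds_ratio(prev_cases, prev_controls):
    """Odds ratio from the exposure prevalence in cases and in controls."""
    odds_cases = prev_cases / (1 - prev_cases)
    odds_controls = prev_controls / (1 - prev_controls)
    return odds_cases / odds_controls

def reported_prevalence(true_prev, sensitivity, specificity):
    """Expected proportion reporting exposure, given imperfect reporting."""
    return true_prev * sensitivity + (1 - true_prev) * (1 - specificity)

# Assumed (illustrative) true exposure prevalences.
TRUE_CASES, TRUE_CONTROLS = 0.40, 0.25
true_or = odds_ratio(TRUE_CASES, TRUE_CONTROLS)

# Same error rates in cases and controls (nondifferential errors).
controls_reported = reported_prevalence(TRUE_CONTROLS, 0.80, 0.90)
nondifferential_or = odds_ratio(
    reported_prevalence(TRUE_CASES, 0.80, 0.90), controls_reported)

# Recall bias: cases report true exposure more completely than controls.
over_recall_or = odds_ratio(
    reported_prevalence(TRUE_CASES, 0.95, 0.90), controls_reported)

# Recall bias in the other direction: cases under-report true exposure.
under_recall_or = odds_ratio(
    reported_prevalence(TRUE_CASES, 0.60, 0.90), controls_reported)

for label, value in [("true", true_or),
                     ("nondifferential", nondifferential_or),
                     ("cases over-recall", over_recall_or),
                     ("cases under-recall", under_recall_or)]:
    print(f"{label:>20}: OR = {value:.2f}")
```

With these assumed values, equal error rates pull the odds ratio from 2.0 toward 1, whereas differential errors move it above or below the true value depending on which group reports more completely.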
Readers need to be aware of the potential problems in case-control study design. However, they should not be so overwhelmed by these problems that they assume all case-control studies are too flawed for their results to be valid. Confidence in case-control studies is provided by meta-analyses that show similar results for different types of study design, for example, for fish intake and AMD.
Why Use a Case-Control Design?
Case-control studies remain the design of choice for rare diseases or outcomes. In prospective studies, obtaining adequate numbers of cases may take many years of follow-up, and even then the study may be underpowered. Although often regarded as a common disease, AMD has a low prevalence, less than 2% in people 55 years of age and older. It is not until the eighth decade of life that the prevalence rises to approximately 10%. Even in a well-designed prospective study (such as The Beaver Dam Eye Study), there were only 102 cases of incident AMD after 15 years of follow-up. Selective survival over follow-up can also distort associations, for example, between smoking and AMD. The long interval between baseline and results inevitably limits the hypotheses investigated in prospective studies. In contrast, case-control studies can investigate novel hypotheses in a short time scale. Such hypotheses arise in a variety of ways: advances in other fields, clinical observation, or data from routine sources such as hospital or mortality statistics. Comparisons over time and place (ecological studies) often provide the first impetus for case-control studies. For example, in the mid-1970s cancer registration rates for ocular melanoma were reported as lower in blacks than in whites. This observation, along with emerging evidence for cutaneous melanoma, identified sunlight exposure, skin type, and iris color as possible risk factors for ocular melanoma and led to case-control studies designed to investigate these hypotheses. The outstanding discovery, using case-control studies, of the association between the Y402H allele and AMD had its origins in one of the earliest case-control studies, which reported a strong association between parental history and AMD.
Case-control studies traditionally are regarded as underpowered to investigate rare exposures. With the increasing availability of large databases, such as family practitioner records in the United Kingdom, case-control studies investigating uncommon exposures can now be undertaken. For genome-wide analysis, 2000 cases and a similar number of controls are required to detect associations with moderate genetic effects.
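Why rare exposures and genome-wide testing demand large samples can be seen from a standard normal-approximation sample size formula for an unmatched case-control study with equal numbers of cases and controls. The sketch below is illustrative only: the exposure prevalences, odds ratios, and power are assumed values, the significance level of 5e-8 is the conventional genome-wide threshold rather than a figure from the text, and treating a genetic variant as a simple binary exposure is a simplification. It is not intended to reproduce the figure of 2000 cases quoted above.

```python
import math
from statistics import NormalDist

def cases_needed(control_prev, odds_ratio, alpha=0.05, power=0.80):
    """Cases required (with an equal number of controls), two-sided test."""
    p0 = control_prev
    # Exposure prevalence among cases implied by the odds ratio.
    p1 = p0 * odds_ratio / (1 + p0 * (odds_ratio - 1))
    p_bar = (p0 + p1) / 2
    z_alpha = -NormalDist().inv_cdf(alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p0 * (1 - p0))) ** 2
    return math.ceil(numerator / (p1 - p0) ** 2)

# Required cases rise steeply as the exposure becomes rarer (odds ratio 2).
for prevalence in (0.30, 0.10, 0.01):
    print(f"exposure in controls {prevalence:.0%}: "
          f"{cases_needed(prevalence, 2.0)} cases")

# A stricter significance threshold also inflates the requirement.
print("odds ratio 1.5, alpha 5e-8:", cases_needed(0.30, 1.5, alpha=5e-8))
```

Running the sketch with other assumed prevalences or odds ratios shows how quickly the required numbers change, which is why large routine databases make studies of uncommon exposures more feasible.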