Although it is generally accepted that randomized controlled trials are the best source of evidence on treatment efficacy, the importance of taking into account the findings of previous research when designing and reporting trials is less well appreciated. Ignoring the results of other trials can lead to harm for patients and can waste resources. For example, 33 trials of intravenous streptokinase as thrombolytic therapy for acute myocardial infarction were conducted between 1959 and 1988. A cumulative metaanalysis by Lau and associates showed that a consistent, statistically significant beneficial effect on mortality would have been evident if a systematic review and metaanalysis had been carried out after the eighth trial was completed. Unfortunately, the review was not performed at that stage and a further 25 trials, involving more than 34 000 patients, were undertaken subsequently. Systematic reviews also may highlight gaps in the evidence, for example, where published trials do not provide the evidence needed to guide clinical practice.
The analysis by Lau and associates was published 15 years ago, but still very few published reports of trials refer to systematic reviews, either to justify conducting a trial or when discussing the results. In 2005, Chalmers pointed out that “academia as a whole has still not grasped that it is unscientific and unethical to embark on new research without first analysing systematically what can be learned from existing research.” Indeed, only relatively recently have some biomedical journals required that reports of clinical trials provide a summary of previous research findings and an explanation of how the reported trial affects this summary, “using direct reference to existing systematic reviews and meta-analysis.” Regularly updated systematic reviews are increasingly available. For example, there are 3826 reviews on The Cochrane Library (http://www.cochrane.org/reviews/clibintro.htm; accessed June 2, 2009), including 73 reviews on interventions for eye diseases (www.cochraneeyes.org).
The strength of randomized trials is that random allocation ensures that the treatment groups are comparable. Good-quality trials will also mask (or blind) participants and clinicians, thus ensuring that recruitment into the trial, and assessment of outcome, are unbiased. However, the results of any one trial may be wrong. The trial may not be large enough to detect the true effect, or there may be no true effect and a statistically significant result may have arisen by chance alone.
Systematic reviews aim to summarize findings from individual studies. They use systematic methods to appraise critically the quality of the studies and to select those appropriate for the summary, in order to avoid bias and to increase the likelihood that the summarized evidence is close to the truth. The term metaanalysis refers to the combined analysis of the results of the studies included in the review to produce a pooled estimate of treatment effect. The key point is that the original randomization must be preserved when pooling the data; it is not correct simply to treat all the different trials as one large study. Instead, the treatment effect is estimated within each trial, and a weighted average of the results of the individual trials is calculated, with larger trials being given more weight. The result of the metaanalysis depends on the methods used to perform the systematic review. Metaanalysis of a biased sample of poor-quality trials, for example, may be precise but will be wrong. Combining the results of multiple studies increases the power of the analysis and gives a more precise measure of effect. However, if there is considerable variation in the results of individual studies (heterogeneity), it may not be appropriate to report a pooled estimate of treatment effect. Heterogeneity arises because of differences in patient populations, interventions, or study design and should be investigated.
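The weighted average described above can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: it uses inverse-variance (fixed-effect) weighting of within-trial log odds ratios, which is one common way of giving larger trials more weight, and it summarizes heterogeneity with Cochran's Q and the I² statistic, neither of which is named in the text above. The function names and the three trials are hypothetical.

```python
import math


def log_odds_ratio(events_trt, n_trt, events_ctl, n_ctl):
    """Log odds ratio and its standard error for ONE trial.

    The comparison is made within the trial, so its randomization is
    preserved. Assumes no zero cells (otherwise a division by zero occurs).
    """
    a, b = events_trt, n_trt - events_trt
    c, d = events_ctl, n_ctl - events_ctl
    estimate = math.log((a * d) / (b * c))
    std_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return estimate, std_error


def pool_inverse_variance(estimates, std_errors):
    """Fixed-effect pooled estimate: each trial is weighted by 1 / SE^2."""
    weights = [1 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1 / total_weight)
    # Cochran's Q: weighted squared deviations of trials from the pooled value.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    # I^2: share of the variation attributed to heterogeneity, floored at 0.
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "pooled": pooled,
        "se": pooled_se,
        "ci95": (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se),
        "q": q,
        "i_squared": i_squared,
    }


# Hypothetical data: (deaths treated, n treated, deaths control, n control).
trials = [(12, 100, 20, 100), (45, 500, 60, 500), (150, 2000, 190, 2000)]
per_trial = [log_odds_ratio(*trial) for trial in trials]
result = pool_inverse_variance(
    [estimate for estimate, _ in per_trial],
    [std_error for _, std_error in per_trial],
)
print("pooled odds ratio:", round(math.exp(result["pooled"]), 3))
print("I^2 (%):", round(result["i_squared"], 1))
```

Because the weight is the reciprocal of the squared standard error, a large trial with a small standard error dominates the pooled estimate. Simply adding up the deaths and patients across trials would instead compare patients who were never randomized against each other.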
In essence, a systematic review is an observational study of the trials addressing a particular clinical or scientific question. As for primary studies, it is important to set out the methods to be used in advance. A detailed protocol should give information on the procedures for identifying relevant trials, inclusion and exclusion criteria, and the methods for assessing the quality of included trials. The protocol should set out a priori which outcomes are to be analyzed and how they are to be analyzed. Outcome measures, and length of follow-up, should be selected on the basis of relevance to patients, rather than simply summarizing all the outcomes reported in the individual trials. Consumer or patient groups can be involved in the choice of outcomes at the protocol stage. The purpose of having a detailed protocol is that the conduct of the review should be as objective as possible, especially because it is likely that the reviewers will be aware of the results of some of the published trials.
There are several well-recognized potential biases that may affect systematic reviews. Trials that have statistically significant results are more likely to be published, and within published trials, statistically significant outcomes are more likely to be reported. If studies showing no effect for a particular outcome are excluded selectively from the metaanalysis, the overall effect of treatment will be exaggerated. At present, these publication and outcome reporting biases are difficult to avoid because it is not always possible to identify and obtain access to unpublished data. Efforts to ensure that all trials are registered at inception, and better access to trial protocols, will improve future reviews. In the meantime, reviewers should try to identify unpublished studies and to contact investigators directly for data on outcomes that are not reported adequately.
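The exaggeration caused by selective publication can be illustrated with a small simulation. The sketch below is hypothetical throughout: the true effect, the standard error, the number of trials, and the publication rule (every trial showing a statistically significant benefit is published, others only with a given probability) are all assumptions chosen for illustration, not estimates from real data. Every simulated trial has the same standard error, so the inverse-variance pooled estimate reduces to a simple mean.

```python
import random
import statistics


def simulate_publication_bias(n_trials=2000, true_effect=0.1, se=0.2,
                              publish_prob_nonsig=0.0, seed=1):
    """Compare the pooled effect over all trials with that over published ones.

    A trial counts as showing significant benefit when estimate / se > 1.96.
    Such trials are always published; the rest are published with
    probability `publish_prob_nonsig` (an assumed, simplified rule).
    """
    rng = random.Random(seed)
    all_estimates = []
    published = []
    for _ in range(n_trials):
        estimate = rng.gauss(true_effect, se)  # observed effect in one trial
        all_estimates.append(estimate)
        significant_benefit = estimate / se > 1.96
        if significant_benefit or rng.random() < publish_prob_nonsig:
            published.append(estimate)
    return {
        "pooled_all": statistics.mean(all_estimates),
        "pooled_published": statistics.mean(published),
        "n_published": len(published),
    }


summary = simulate_publication_bias()
print("pooled effect, all trials:      ", round(summary["pooled_all"], 3))
print("pooled effect, published trials:", round(summary["pooled_published"], 3))
print("trials published:", summary["n_published"], "of 2000")
```

In the extreme case where only significant trials are published, every included estimate exceeds 1.96 standard errors, so the pooled result must overstate a small true effect. Raising `publish_prob_nonsig` toward 1 brings the published estimate back toward the estimate from all trials.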
The results of systematic reviews are dependent on the quality of the trials included in the review. There have been many attempts to grade study quality. In general, domain-based grading scales are clearer and more transparent than summary scores. The Cochrane Collaboration’s tool for assessing the risk of bias considers separately allocation of treatment, masking, incomplete outcome data, and selective outcome reporting. Structured assessment of individual study quality can feed into an overall assessment of the quality of the evidence for each outcome. Again, there are many published schemes for grading quality of evidence, but one particularly useful approach is that offered by GRADE (The Grading of Recommendations Assessment, Development and Evaluation, http://www.gradeworkinggroup.org/, accessed September 3, 2009; Table). The scheme takes into account study quality and publication bias as well as the consistency, precision, and directness of the evidence. This provides a transparent and systematic approach to assessing the overall quality of the evidence, and thus facilitates evidence-based conclusions regarding the effects of treatment.