We read with interest the article by He and associates comparing the clinical outcomes of wavefront-guided and wavefront-optimized laser in situ keratomileusis (LASIK). They found that both treatments corrected myopia safely and effectively; however, the wavefront-guided treatment platforms appeared to offer significant advantages in terms of residual refractive error, uncorrected distance acuity, and contrast sensitivity. The study was undoubtedly well designed and was a controlled clinical study, which could meet the high standard expected of phase III clinical trials regulated by the US Food and Drug Administration (FDA). However, our concern is that some medical researchers pay little or no attention to the concept of multiplicity in hypothesis testing, which statisticians apply rigorously.
The authors defined 7 different primary outcome measures, including adverse event profiles, at 4 different time points. In other words, they adopted a minimum of 28 primary endpoints (7 measures at each of 4 time points). A P value <.05 was considered statistically significant, which is the conventional threshold used in statistical analyses. According to the International Conference on Harmonization (ICH) and the European Medicines Agency (EMEA), there should generally be only 1 primary endpoint, namely the one that provides the most clinically relevant and convincing evidence directly related to the primary purpose of the clinical trial. If there is more than 1 primary endpoint, the effect on the type I error should be explained because of the potential for multiplicity problems, and a method of adjusting for the type I error, such as Bonferroni correction, should be included. For example, when 2 primary endpoints exist, a P value of <.025, and not <.05, should be considered significant.
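The adjustment described above can be written out as a short worked calculation. The symbols are illustrative notation rather than anything taken from the cited guidelines: α is the overall significance level and m is the number of primary endpoints. The second line additionally assumes that the m tests are independent, which is a simplification for outcomes measured in the same eyes, so the figure should be read as a rough indication only.

```latex
\begin{align*}
\alpha_{\text{adjusted}} &= \frac{\alpha}{m},
  & m = 2:&\quad \frac{0.05}{2} = 0.025 \\
\Pr(\text{at least one false positive}) &= 1 - (1 - \alpha)^{m},
  & m = 28:&\quad 1 - 0.95^{28} \approx 0.76
\end{align*}
```

Under that independence assumption, testing 28 endpoints at an unadjusted level of .05 would give roughly a 3 in 4 chance of at least 1 spurious significant result even if no true difference existed.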
If we apply this principle to this article, the number of significant results would decrease dramatically, because Bonferroni correction for the multiple testing problem posed by 28 hypotheses would treat only a P value of <.001786 (.05/28) as significant. We suggest 2 strategies to overcome this issue. First, the authors may choose only 1 primary outcome that represents the most important result of the study and assign the remaining variables to secondary or tertiary endpoints, thereby retaining a P value of <.05 as the significance threshold for all statistical analyses. In this case, the result of the primary outcome is considered confirmatory and those of the remainder exploratory. The other approach is to state clearly that the overall significance level was not controlled and that all significant endpoints (P < .05) are exploratory rather than confirmatory. The article by Stacey and associates lucidly explained the concept of multiplicity in hypothesis testing and analyzed how often it is handled correctly in ophthalmology research.
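For readers who wish to check the Bonferroni threshold for 28 endpoints, the following is a minimal Python sketch. The function names and the example P values are hypothetical and chosen only for illustration; they are not results from the study by He and associates.

```python
from typing import List


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def significant_after_bonferroni(p_values: List[float],
                                 alpha: float,
                                 n_tests: int) -> List[bool]:
    """Flag each P value that stays below the corrected threshold."""
    threshold = bonferroni_threshold(alpha, n_tests)
    return [p < threshold for p in p_values]


# 7 outcome measures at 4 time points, as counted in the letter.
N_ENDPOINTS = 7 * 4
ALPHA = 0.05

threshold_28 = bonferroni_threshold(ALPHA, N_ENDPOINTS)
print(f"Corrected threshold for {N_ENDPOINTS} tests: {threshold_28:.6f}")

# Hypothetical P values (NOT taken from the study), for illustration only.
example_p_values = [0.04, 0.01, 0.002, 0.0005]
flags = significant_after_bonferroni(example_p_values, ALPHA, N_ENDPOINTS)
for p, keep in zip(example_p_values, flags):
    print(f"P = {p}: {'significant' if keep else 'not significant'}")
```

In this illustration, all 4 hypothetical P values would count as significant at the unadjusted .05 level, but only the smallest remains significant after correction.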