## Purpose

To investigate the use of analytic approaches for eye-specific outcomes in ophthalmology publications.

## Design

A review of analytic approaches used in original research articles published in ophthalmology journals.

## Methods

All 161 research articles published in 5 ophthalmology journals in the first 2 months of 2008 were considered. Publications were categorized according to analytic approach: 1 eye selected, both eyes contribute, or per-individual outcome. Studies were considered suboptimal when criteria for eye selection were not provided or when measurements from both eyes were included without interocular correlation being considered. Visual impairment prevalence data were used to illustrate analytic approach choices.

## Results

Measurements from both eyes were included in 38% of the 112 studies that used statistical inferential techniques. In 31 (74%), there was no mention of possible correlation. Only 7% used statistical methods appropriate for correlated outcomes. In 35 studies (31%), measurements from 1 eye were selected; 31% of these did not provide selection criteria. In 67%, only univariate tests were used. A review of 47 articles published in 2011 produced similar findings. Characteristics of studies were not found to differ according to whether the studies were suboptimal. Using a test appropriate for correlated outcomes resulted in a *P* value 3.5 times that obtained ignoring the correlation.

## Conclusions

Between-eye correlation seems not to be assessed commonly in ophthalmology publications, although knowledge of it aids the choice of analytic approach when eye-specific variables are of interest. Statistical methods appropriate for correlated ocular outcome data are not being applied widely.

Measurements obtained from both eyes of an individual often are correlated, that is, measurements obtained in one eye are more likely to be similar to those of the other eye than to ocular measurements from an unrelated person. Standard statistical inferential techniques such as *t* tests, analyses of variance, confidence intervals, and linear regression are valid, however, only under the assumption that observations are independent. Simulation studies have demonstrated that including measurements from both eyes without adjusting for the correlated nature of the data may have a substantial effect on the results. Inclusion of measurements from fellow eyes without consideration of their possible correlation usually results in underestimated standard errors, and thus falsely small *P* values and falsely precise confidence intervals, with the magnitude of the problem increasing as the correlation increases. For many ocular variables, the correlation between fellow eyes has been reported to be high. Correlations of approximately 0.8 have been reported for intraocular pressure (IOP), cup-to-disc ratio, threshold sensitivity, and short-wavelength automated perimetry parameter pattern standard deviation data.
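The size of this underestimation can be illustrated for the simplest case: the mean of a continuous measurement when every subject contributes both eyes. The sketch below assumes a common standard deviation `sigma` and an exchangeable between-eye correlation `rho`; the function names are illustrative and not taken from any cited method. Under these assumptions the variance of the mean over 2n eyes is σ²(1 + ρ)/(2n), whereas treating the eyes as independent gives σ²/(2n).

```python
import math

def naive_se(sigma, n_subjects):
    """SE of the mean if all 2n eyes are (wrongly) treated as independent."""
    return sigma / math.sqrt(2 * n_subjects)

def true_se(sigma, n_subjects, rho):
    """SE of the mean when the two eyes of a subject have correlation rho."""
    return sigma * math.sqrt((1 + rho) / (2 * n_subjects))

def underestimation_factor(rho):
    """Ratio of the true SE to the naive SE; it depends only on rho."""
    return math.sqrt(1 + rho)
```

With a correlation of 0.8, `underestimation_factor(0.8)` is about 1.34, so in this simplified setting the naive standard error would be roughly a quarter too small.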

Between-eye correlation sometimes can be exploited in a paired-eyes study design. In both the United States Diabetic Retinopathy Study and the Glaucoma Laser Trial, for example, one eye was randomized to receive treatment, whereas the fellow eye served as the control. Other analytic approaches include the use of measurements from only 1 eye (selected using defined criteria) or averaging measurements over the 2 eyes. Both of these approaches are statistically valid, but are likely to be inefficient, resulting in lower power and less precise estimates than when all available measurements are incorporated. The extent of the loss of statistical information when averaging is likely to be smaller than when only 1 eye is selected. With both approaches, the loss of information is greater when the correlation is low. Correlated ophthalmic data form a subset of clustered data, having maximum cluster size of 2. Family, school, periodontal, and otolaryngologic data also are examples of clustered data. Methods for the modelling of clustered data have been developed in recent years, initially in the field of social and educational statistics, but also in medicine. Both univariate and multivariate statistical methods that account for the correlations between fellow eyes have been developed over the past 3 decades.
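The efficiency comparison between selecting 1 eye and averaging can be made concrete under simplifying assumptions: a continuous outcome, both eyes available for all n subjects, a common variance `sigma2`, and between-eye correlation `rho`. The function names below are hypothetical. The variance of the sample mean is σ²/n when 1 eye per subject is used and σ²(1 + ρ)/(2n) when the 2 eyes are averaged first.

```python
def var_one_eye(sigma2, n_subjects):
    """Variance of the sample mean using one eye per subject."""
    return sigma2 / n_subjects

def var_averaged(sigma2, n_subjects, rho):
    """Variance of the sample mean after averaging the two eyes of each subject."""
    return sigma2 * (1 + rho) / (2 * n_subjects)

def relative_efficiency(rho):
    """Variance of the one-eye estimate divided by that of the averaged estimate."""
    return 2 / (1 + rho)

for rho in (0.0, 0.5, 0.8, 1.0):
    ratio = relative_efficiency(rho)
    print(f"rho={rho:.1f}: one-eye variance is {ratio:.2f} times the averaged variance")
```

In this setting, selecting 1 eye doubles the variance when the eyes are uncorrelated, but inflates it by only about 11% at a correlation of 0.8, which matches the statement that the loss of information is greater when the correlation is low.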

A 1998 review of 79 *British Journal of Ophthalmology* publications indicated that many studies failed to use all the available data and that a substantial proportion used inappropriate statistical methods. To our knowledge, no review of analytic approaches used in ophthalmology journals has been published since then. The main aim of the present study was to summarize the approaches currently in use to account for correlated measurements between fellow eyes by reviewing publications in 5 ophthalmology journals. A secondary aim was to illustrate the application of 4 different valid analytic approaches for correlated binary outcome data.

The article selection process, the exclusion criteria, and a description of the analytic approaches are presented in the Methods section below, together with the exploratory statistical methods used. A univariate test appropriate for the comparison of correlated binary outcomes and estimation of the intraclass correlation coefficient are also described in this section. The Results section presents the results of the review and the comparison of characteristics according to whether the articles have a suboptimal design. The Results section also contains an illustration of 4 valid analytic approaches for the comparison of correlated binary outcomes (the presence of mild visual impairment) and a commonly used incorrect approach. In the Discussion section, the results are elucidated and discussed in the context of other studies.

## Methods

## Selection and Categorization of Articles

An ophthalmologist (M.T.) selected 2 general and 3 subspecialty ophthalmology journals from the top 50% of the list of 45 ophthalmology journals on the ISI Web of Knowledge (ranked according to their 2007 impact factor). Both general and subspecialty ophthalmology journals were selected to reflect the broad spectrum and the distribution of analytic approaches in these published ophthalmologic articles. The journals selected were *Acta Ophthalmologica*, the *American Journal of Ophthalmology*, the *Journal of Cataract & Refractive Surgery*, *Retina*, and the *Journal of Glaucoma*. All original articles published in the first 2 months of circulation in 2008 were reviewed. *Acta Ophthalmologica* had February as its first month of circulation.

A questionnaire consisting of 30 items was developed, tested, and completed for each published original article. The questions covered study design, study aims, numbers and types of variables considered, units of measurement and analysis, statistical methods used (univariate or multivariate), sample size, and the proportion of missing values. Study design was ascertained according to the *American Journal of Ophthalmology* guidelines. The questionnaires were completed by a doctoral student (A.K.) and a postgraduate physician (N.H.E.) and were checked by a medical statistician (J.M.). All questions related to study design were reviewed by an epidemiologist (M.V.).

Articles in which there was no use of statistical inferential techniques were excluded from further analysis. Subsequently, articles were classified according to the type of study undertaken. Animal studies and laboratory experiments were excluded. Using a scheme similar to that presented by Murdoch and associates, the remaining studies were broadly categorized into the following groups:

1. Studies with outcomes measured at the ocular level in which both eyes are eligible (for at least some subjects), but measurements from 1 eye are chosen for inclusion in the statistical analysis, for example, right or left eye, random selection, dominant eye, better or worse eye, or the first eye with the condition.

2. Studies in which only 1 eye from each subject is eligible for inclusion, for example, the eye that was operated on or the single eye with the condition of interest (rare disease).

3. Studies in which some or all individuals contribute measurements on both eyes in the statistical analysis. This may be because a paired design is used at the ocular level, for example, eyes are randomized so that one receives local treatment and the other does not. However, it may be because information from both eyes is used within each treatment group, either with or without adjustment for the possible correlation.

4. Studies in which ocular outcomes are summarized per individual before analysis, resulting in statistical analysis at the subject level. For example, the average of the separate measurements in each eye is calculated or the results are pooled. This category includes investigations in which information from each eye separately is not of interest. For example, certain conditions are diagnosed at the subject level, but use measurements from both eyes.

It was expected that some studies would have both eye-specific and per-individual outcomes of interest (eg, best-corrected visual acuity). Any study in which at least 1 main outcome was eye specific was classed as belonging to group 1, 2, or 3 as appropriate.

Univariate tests were used to compare the characteristics of studies classed as being methodologically suboptimal with those of the other studies. Methodologically suboptimal studies were those that either (1) included measurements from both eyes without mention, or assessment, of possible interocular correlation, (2) did not provide the number of participants, or (3) did not describe the method used to select the eye chosen for inclusion in the study. Qualitative variables were compared using the chi-square test of independence or the Fisher exact test as appropriate. Sample sizes were compared using the nonparametric Mann–Whitney *U* test. A 5% significance level was chosen.
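The three criteria for a methodologically suboptimal study can be written as a simple rule. The sketch below is only an illustration: the `StudyRecord` class and its field names are hypothetical and do not correspond to the actual questionnaire items.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class StudyRecord:
    # Hypothetical fields, loosely mirroring what a questionnaire might record.
    both_eyes_included: bool          # measurements from both eyes enter the analysis
    correlation_mentioned: bool       # interocular correlation mentioned or assessed
    n_participants: Optional[int]     # None when the article does not report it
    one_eye_selected: bool            # one eye per subject chosen for analysis
    selection_method_described: bool  # eye-selection criteria are stated

def is_suboptimal(study: StudyRecord) -> bool:
    """Apply criteria (1) to (3); a study is suboptimal if any one holds."""
    ignores_correlation = study.both_eyes_included and not study.correlation_mentioned
    missing_n = study.n_participants is None
    unexplained_selection = study.one_eye_selected and not study.selection_method_described
    return ignores_correlation or missing_n or unexplained_selection
```

For example, a record with both eyes included and no mention of correlation is flagged, whereas the same record with the correlation addressed and the number of participants reported is not.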

Subsequently, a review of 47 original articles published in the same 5 journals in February 2011 was undertaken (50% of the total number of articles published in each journal) using exactly the same procedure as for the 2008 publications. Differences in proportions (in 2011 and 2008) were compared using Z tests and Newcombe-Wilson hybrid score confidence intervals.
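A comparison of two independent proportions of this kind can be sketched with the Python standard library alone. The code assumes a pooled-variance Z test (the text does not state which variant was used) and implements the Newcombe hybrid score interval by combining the Wilson score limits of the two proportions. Function names are illustrative, and any counts passed in are the user's own, not figures from this review.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def wilson_interval(x, n, z=Z_95):
    """Wilson score interval for a single proportion x/n."""
    p = x / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

def newcombe_diff_ci(x1, n1, x2, n2, z=Z_95):
    """Newcombe hybrid score interval for the difference p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    l1, u1 = wilson_interval(x1, n1, z)
    l2, u2 = wilson_interval(x2, n2, z)
    diff = p1 - p2
    lower = diff - math.sqrt((p1 - l1) ** 2 + (u2 - p2) ** 2)
    upper = diff + math.sqrt((u1 - p1) ** 2 + (p2 - l2) ** 2)
    return lower, upper

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-sample Z test; returns (z, two-sided P value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, math.erfc(abs(z) / math.sqrt(2))
```

The Z test is undefined when the pooled proportion is exactly 0 or 1; the sketch does not guard against that case.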

## A Hypothesis Test Appropriate for the Comparison of Proportions with Correlated Outcomes

As described in Fleiss and associates, the proportions of eyes with a characteristic can be compared between 2 samples accounting for the possible correlation, using an asymptotic approach with variance inflation factors applied to adjust the variance of the difference in proportions and to calculate an asymptotically normally distributed Z statistic.

The test statistic $Z$ is the difference between the two sample proportions divided by its estimated standard error, in which the variance of the difference is adjusted using a variance inflation factor for each group. In this notation, $g_1$ is the number of eyes in group 1, $g_2$ is the number of eyes in group 2, and $f_1$ is the variance inflation factor for group 1. The factor $f_1$ depends on $r_1$, the intraclass correlation coefficient for group 1, and on the cluster sizes in that group, and similarly for group 2. The quantities $s_1^2$ and $s_2^2$ represent the variance of the cluster sizes in groups 1 and 2, respectively, and $\bar{n}_1$ and $\bar{n}_2$ represent the arithmetic mean cluster size. When all clusters are the same size (ie, all individuals contribute 2 eyes), $s_1^2 = s_2^2 = 0$ and $\bar{n}_1 = \bar{n}_2 = 2$, and the variance inflation factor simplifies to $f_1 = 1 + r_1$.

The test statistic $Z$ has a standard normal distribution in large samples under the null hypothesis $H_0: p_1 = p_2$. If the samples are small, the type I error rates are likely to be inflated.
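A minimal sketch of such an adjusted test follows. Two forms are assumed here and are not transcribed from Fleiss and associates: a Z statistic using the pooled proportion under the null hypothesis, and an inflation factor f = 1 + (n̄ + s²/n̄ − 1)r, with s² taken as the population variance of the cluster sizes. The latter was chosen because it reduces to 1 + r when every individual contributes 2 eyes. Function and argument names are hypothetical.

```python
import math
from statistics import mean, pvariance

def inflation_factor(cluster_sizes, icc):
    """Assumed form: f = 1 + (n_bar + s^2 / n_bar - 1) * r."""
    n_bar = mean(cluster_sizes)
    s2 = pvariance(cluster_sizes)  # population variance of the cluster sizes
    return 1 + (n_bar + s2 / n_bar - 1) * icc

def adjusted_z(x1, sizes1, r1, x2, sizes2, r2):
    """Z test for two proportions of eyes with inflated variance.

    x1, x2: number of eyes with the characteristic in each group
    sizes1, sizes2: number of eyes contributed by each individual
    r1, r2: intraclass correlation coefficients
    Returns (z, two-sided P value).
    """
    g1, g2 = sum(sizes1), sum(sizes2)  # total number of eyes per group
    p1, p2 = x1 / g1, x2 / g2
    pooled = (x1 + x2) / (g1 + g2)
    f1 = inflation_factor(sizes1, r1)
    f2 = inflation_factor(sizes2, r2)
    se = math.sqrt(pooled * (1 - pooled) * (f1 / g1 + f2 / g2))
    z = (p1 - p2) / se
    return z, math.erfc(abs(z) / math.sqrt(2))
```

Setting both correlations to 0 recovers the ordinary pooled two-sample Z test, and a positive correlation shrinks |Z| and therefore enlarges the *P* value.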

In the case of binary outcomes, the intraclass correlation coefficient can be estimated using the formula:

$$
r = \frac{\sum_{i=1}^{k} \left\{ Y_{i+}(Y_{i+} - 1) - 2p(n_i - 1)Y_{i+} + n_i(n_i - 1)p^2 \right\}}{\sum_{i=1}^{k} n_i(n_i - 1)\,p(1 - p)}
$$
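The estimator can be computed directly from per-individual counts. The sketch below assumes the following reading of the symbols, which is an interpretation rather than something stated alongside the formula: $k$ is the number of individuals, $n_i$ the number of eyes contributed by individual $i$, $Y_{i+}$ the number of those eyes with the characteristic, and $p$ the overall proportion of eyes with the characteristic. The function name is hypothetical.

```python
def icc_binary(clusters):
    """Estimate the intraclass correlation for a binary outcome.

    clusters: iterable of (y_plus, n_i) pairs, one per individual, where
    y_plus is the number of affected eyes and n_i the number of eyes.
    """
    clusters = list(clusters)
    total_affected = sum(y for y, _ in clusters)
    total_eyes = sum(n for _, n in clusters)
    p = total_affected / total_eyes  # assumed: overall proportion of affected eyes

    numerator = sum(
        y * (y - 1) - 2 * p * (n - 1) * y + n * (n - 1) * p**2
        for y, n in clusters
    )
    denominator = sum(n * (n - 1) for _, n in clusters) * p * (1 - p)
    return numerator / denominator
```

Individuals contributing a single eye add nothing to either sum, and the estimate is undefined when $p$ is 0 or 1 or when no individual contributes 2 eyes. Perfectly concordant pairs give an estimate of 1, and pairs that are always discordant give −1.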