Test–Retest Reliability of Health-Related Quality-of-Life Questionnaires in Adults with Strabismus




Purpose


To report the test–retest variability of two health-related quality-of-life instruments, the new Adult Strabismus 20 (AS-20) and the National Eye Institute 25-item Visual Function Questionnaire (NEI VFQ-25), in adults with strabismus.


Design


Prospective case series.


Methods


Fifty-five adult patients in a clinical practice with stable strabismus completed the AS-20 and the NEI VFQ-25 at 2 visits, without intervening treatment. Questionnaires were completed the second time either at a subsequent office visit, immediately before surgery, or by mail. Intraclass correlation coefficients were calculated. Ninety-five percent limits of agreement and 95% confidence intervals around the 95% limits of agreement also were calculated.


Results


There was excellent agreement of overall questionnaire scores for the AS-20 (intraclass correlation coefficient, 0.92) and NEI VFQ-25 (intraclass correlation coefficient, 0.94). The half-widths of the 95% limits of agreement for overall scores were 14.3 points (95% confidence interval, 10.9 to 17.7) for the AS-20 and 11.1 points (95% confidence interval, 8.5 to 13.8) for the NEI VFQ-25. The lower test–retest variability of the VFQ-25 seemed to be partly the result of ceiling effects, with many scores at the normal end of the range.


Conclusions


The new AS-20 and the NEI VFQ-25 show excellent test–retest reliability in adults with strabismus. A change exceeding the 95% limits of agreement (14 points on the AS-20 and 11 points on the VFQ-25) indicates real change in an individual patient. The AS-20 may be more useful than the VFQ-25 because it is less prone to ceiling effects in adults with strabismus.


Formal assessment of health-related quality of life (HRQOL) has been recommended in the management of adult strabismus. Several vision-specific HRQOL instruments have been used in the evaluation of adults with strabismus, but reports on the test–retest reliability of these instruments are sparse. Reliability data provide an estimate of variance caused by random error or measurement error. Test–retest reliability data describe the extent to which repetition of the test yields the same results when no underlying change in health has occurred. Limits of agreement calculated from test–retest data are particularly helpful for interpreting changes in scores over time for an individual patient and have been used in other fields, such as prism and cover test measurements in strabismus. In previous reports, we described the development and initial validation of the Adult Strabismus 20 (AS-20), a strabismus-specific HRQOL questionnaire for adults. In the present study, we report the test–retest reliability of the AS-20 questionnaire in a cohort of adults with strabismus. For comparison, we also assessed the test–retest reliability of the National Eye Institute 25-item Visual Function Questionnaire (NEI VFQ-25) in the same cohort of adult strabismus patients.


Methods


Fifty-five adult strabismus patients (median age, 44 years; range, 18 to 80 years) were recruited prospectively from outpatient clinics and completed both the AS-20 and the NEI VFQ-25 at 2 time points within 1 year. Questionnaires were completed in the office for the first administration. The second administration was either (1) in the office at a subsequent examination (n = 29; 25 to 144 days later; median, 66 days), (2) immediately before surgery (n = 18; 1 day later in 17 patients, 6 days later in 1 patient), or (3) by mail within 2 days (n = 8). Patients completing the questionnaires for the second time immediately before surgery or by mail were instructed to complete the questionnaires as if they had not completed them before.


Patients with inherently variable strabismic conditions (e.g., ocular myasthenia gravis) were excluded to limit the study cohort to patients who were stable between questionnaire administrations. We also excluded patients who had undergone strabismus surgery within 1 year before the first examination because patients’ symptoms and perceptions might have changed during the postoperative period. For the office retest administration, patients were required to have stable strabismus (no change in angle of deviation of more than 10 PD in primary position) and no intervening treatment or change in treatment.


Thirty-eight (69%) patients were female, and 49 (89%) self-reported their race as white. Strabismus diagnoses were idiopathic in 32 (58%) patients, neurologic in 19 (35%) patients, and mechanical in 4 (7%) patients. Of our patients, 35 (64%) had diplopia, 9 (16%) had rare diplopia, and 11 (20%) did not have diplopia. Visual acuity ranged from 20/15 to 20/30 in the better eye (median, 20/20) and 20/15 to 20/4000 in the worse eye (median, 20/20).


Responses for each item on the AS-20 were recorded using a 5-point Likert-type scale (never, rarely, sometimes, often, and always), and converted, for each patient, to a mean score ranging from 0 (worst HRQOL) to 100 (best HRQOL). The NEI VFQ-25 contains Likert-type scales and also yields a mean individual patient score from 0 to 100. An administrable version of the AS-20 is available online at http://public.pedig.jaeb.org (accessed July 31, 2009) and of the NEI VFQ-25 at http://www.nei.nih.gov/resources/visionfunction/vfq_ia.pdf (accessed July 31, 2009).
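The conversion from item responses to a 0 to 100 score can be sketched as follows. This is only an illustration: the item values (never = 100 through always = 0, in steps of 25) and the handling of unanswered items are assumptions and are not stated in the text above. The names `ITEM_SCORES` and `questionnaire_score` are hypothetical.

```python
from statistics import mean

# Assumed item values: items describe problems, so "never" is taken as the
# best response (100) and "always" as the worst (0).
ITEM_SCORES = {"never": 100, "rarely": 75, "sometimes": 50, "often": 25, "always": 0}


def questionnaire_score(responses):
    """Return the mean item score (0 = worst HRQOL, 100 = best HRQOL).

    Unanswered items are passed as None and skipped (an assumption).
    """
    scored = [ITEM_SCORES[r.lower()] for r in responses if r is not None]
    if not scored:
        raise ValueError("no answered items")
    return mean(scored)


# Hypothetical patient answering four items.
example = questionnaire_score(["never", "rarely", "sometimes", None, "often"])
print(example)
```

Averaging over answered items keeps the score on the same 0 to 100 range regardless of how many items were completed.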


Statistical Analysis


For both the AS-20 and the NEI VFQ-25, scores at the first and second administrations were compared using signed-rank tests. Bland-Altman plots were used to analyze the variability of the differences. Half-widths of the 95% limits of agreement were calculated as 1.96 times the standard deviation of the test–retest differences, defining the limits within which 95% of the differences should lie. The 95% confidence intervals (CIs) around the 95% limits of agreement also were calculated. Intraclass correlation coefficients were calculated between first and second administrations. Analyses were repeated to compare variability in patients with and without diplopia.
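A minimal sketch of the limits-of-agreement calculation is given below. The half-width (1.96 times the standard deviation of the differences) follows the text. The confidence interval formula is an assumption: the text does not say how the interval was obtained, so the sketch uses the common Bland and Altman approximation, in which the standard error of a limit is about sqrt(3 s² / n), with a t critical value supplied by the caller (about 2.005 for 54 degrees of freedom). The function names are hypothetical.

```python
import math
from statistics import stdev


def loa_half_width(differences, z=1.96):
    """Half-width of the 95% limits of agreement from raw retest-minus-test differences."""
    return z * stdev(differences)


def loa_half_width_ci(sd_diff, n, t_crit, z=1.96):
    """Half-width and an approximate 95% CI around it.

    Assumed method: standard error of a limit ~ sqrt(3 * s^2 / n).
    t_crit is the two-sided 95% t critical value for n - 1 degrees of freedom.
    """
    half_width = z * sd_diff
    margin = t_crit * math.sqrt(3 * sd_diff ** 2 / n)
    return half_width, half_width - margin, half_width + margin


# Summary values for the AS-20 overall score: SD of differences 7.3, n = 55.
half, lower, upper = loa_half_width_ci(sd_diff=7.3, n=55, t_crit=2.005)
print(round(half, 1), round(lower, 1), round(upper, 1))  # 14.3 10.9 17.7
```

With the published summary values, this approximation gives a half-width and interval that round to the reported AS-20 overall figures, although that agreement does not confirm which formula the authors used.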




Results


Differences Between First and Second Administrations


As expected in a test–retest study of a reliable instrument, there were no significant differences between first and second administrations for overall or subscale scores on the AS-20 (P ≥ .5 for all comparisons; Table). For the NEI VFQ-25, however, scores were very slightly higher on the second administration (better HRQOL) for the overall score (75.8 versus 77.5; P = .02) and for the difficulties with near activities subscale (71.5 versus 75.6; P = .01). There were no other significant differences found in NEI VFQ-25 subscale scores between first and second questionnaire administrations.



TABLE

Adult Strabismus 20 Questionnaire and National Eye Institute 25-Item Visual Function Questionnaire Test–Retest Scores (Mean ± Standard Deviation)

Questionnaires and Subscales | No. | Test | Retest | Difference | P Value^a | 95% LOA (95% CI) | ICC (95% CI)
AS-20
Overall | 55 | 58.9 ± 18.5 | 59.5 ± 17.8 | 0.6 ± 7.3 | .5 | 14.3 (10.9 to 17.7) | 0.92 (0.87 to 0.95)
Functional scale | 55 | 52.2 ± 22.5 | 52.5 ± 22.2 | 0.3 ± 9.9 | .9 | 19.5 (14.9 to 24.1) | 0.90 (0.84 to 0.94)
Psychosocial scale | 55 | 65.6 ± 24.9 | 66.4 ± 25.5 | 0.8 ± 9.0 | .7 | 17.7 (13.5 to 21.9) | 0.94 (0.89 to 0.96)
VFQ-25
Overall | 55 | 75.8 ± 16.8 | 77.5 ± 16.0 | 1.7 ± 5.7 | .02 | 11.1 (8.5 to 13.8) | 0.94 (0.89 to 0.96)
General health | 55 | 68.2 ± 26.1 | 69.5 ± 23.4 | 1.4 ± 11.2 | .5 | 22.0 (16.8 to 27.1) | 0.89 (0.83 to 0.94)
General vision | 55 | 70.2 ± 15.8 | 69.1 ± 17.6 | −1.1 ± 16.1 | .7 | 31.5 (24.1 to 38.9) | 0.54 (0.33 to 0.70)
Ocular pain | 55 | 74.5 ± 23.2 | 78.2 ± 21.5 | 3.6 ± 18.4 | .2 | 36.1 (27.6 to 44.6) | 0.66 (0.48 to 0.78)
Near activities | 54^b | 71.5 ± 23.6 | 75.6 ± 21.4 | 4.2 ± 11.5 | .01 | 22.6 (17.2 to 28.0) | 0.85 (0.76 to 0.91)
Distance activities | 55 | 76.1 ± 22.3 | 78.3 ± 21.3 | 2.3 ± 10.2 | .1 | 20.0 (15.3 to 24.7) | 0.89 (0.81 to 0.93)
Vision-specific social functioning | 55 | 86.4 ± 16.9 | 89.3 ± 15.9 | 3.0 ± 11.0 | .05 | 21.6 (16.5 to 26.7) | 0.76 (0.63 to 0.85)
Vision-specific mental health | 55 | 59.8 ± 29.8 | 64.0 ± 28.4 | 4.2 ± 13.6 | .08 | 26.7 (20.4 to 33.0) | 0.88 (0.81 to 0.93)
Vision-specific role difficulties | 54^b | 65.7 ± 31.1 | 66.9 ± 28.1 | 1.2 ± 14.6 | .4 | 28.7 (21.8 to 35.5) | 0.88 (0.80 to 0.93)
Vision-specific dependency | 52^c | 84.1 ± 23.2 | 84.5 ± 24.6 | 0.3 ± 9.0 | .6 | 17.7 (13.4 to 22.0) | 0.93 (0.88 to 0.96)
Driving | 52^c | 77.1 ± 22.7 | 76.4 ± 20.5 | −0.6 ± 10.7 | .9 | 20.9 (15.8 to 26.0) | 0.89 (0.82 to 0.94)
Color vision | 55 | 98.2 ± 6.6 | 98.6 ± 5.7 | 0.5 ± 5.9 | 1.0 | 11.5 (8.8 to 14.2) | 0.55 (0.34 to 0.71)
Peripheral vision | 55 | 71.4 ± 26.1 | 73.6 ± 25.2 | 2.3 ± 16.9 | .3 | 33.0 (25.2 to 40.8) | 0.78 (0.66 to 0.87)

AS-20 = Adult Strabismus 20 questionnaire; 95% CI = 95% confidence interval around the 95% LOA half width; 95% LOA = half-width of the 95% limits of agreement; ICC = intraclass correlation coefficient; VFQ-25 = National Eye Institute 25-item Visual Function Questionnaire.

a P value based on nonparametric paired comparison (signed rank).


b Data missing in 1 case.


c Data missing in 3 cases.



Differences Between Methods of Administration


Analyzed separately by method of administration, the intraclass correlation coefficient for the AS-20 was slightly lower (indicating more variability between measures) for the office and presurgery administrations than for the mail administration (0.91; 95% CI, 0.82 to 0.96; vs 0.90; 95% CI, 0.76 to 0.96; vs 0.93; 95% CI, 0.71 to 0.98). For the NEI VFQ-25, the intraclass correlation coefficient was numerically lower, but not significantly lower, for the office than for the presurgery or mail administrations (0.92; 95% CI, 0.84 to 0.96; vs 0.95; 95% CI, 0.87 to 0.98; vs 0.94; 95% CI, 0.75 to 0.99).


For our estimates of the 95% limits of agreement, we found a similar pattern for the AS-20, where the estimates from retests obtained in the office and before surgery were slightly higher (indicating more variability between measures) than those obtained by mail (15.2; 95% CI, 10.2 to 20.3; vs 14.5; 95% CI, 8.2 to 20.8; vs 10.4; 95% CI, 2.8 to 18.0). For the NEI VFQ-25, the 95% limits of agreement also were slightly higher for the office and presurgery administrations than by mail (12.9; 95% CI, 8.6 to 17.1; vs 10.3; 95% CI, 5.8 to 14.8; vs 5.5; 95% CI, 1.5 to 9.5). Because the estimates of different methods of administration were similar and the 95% CIs of our estimates included the point estimates of the other methods of administration, we combined the data for subsequent analyses.


Overall Intraclass Correlations


Using a published scale, agreement between examinations, as measured by the intraclass correlation coefficient, was almost perfect (> 0.80) for both the AS-20 (0.92; 95% CI, 0.87 to 0.95; Table) and the NEI VFQ-25 (0.94; 95% CI, 0.89 to 0.96). Agreement also was almost perfect between questionnaire administrations for both AS-20 subscales. For the NEI VFQ-25 subscales, agreement was almost perfect on 7 of the 12 subscales, substantial (> 0.6 to 0.80) in 3 of 12 subscales, and moderate (> 0.4 to 0.6) in 2 of 12 subscales (Table).
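The sketch below illustrates how an intraclass correlation coefficient can be computed from paired scores and then mapped to the agreement bands quoted above. The one-way random-effects form used in `icc_oneway` is an assumption, since the text does not specify which ICC model was used. Only the three bands named in the text are encoded; values at or below 0.4 are simply labeled as falling below them. The subscale values are copied from the Table, and all function names are hypothetical.

```python
from collections import Counter
from statistics import mean


def icc_oneway(test, retest):
    """One-way random-effects ICC for two administrations (assumed model)."""
    pairs = list(zip(test, retest))
    n, k = len(pairs), 2
    subject_means = [mean(p) for p in pairs]
    grand_mean = mean(subject_means)
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for p, m in zip(pairs, subject_means) for x in p
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


def agreement_band(icc):
    """Map an ICC to the bands quoted in the text."""
    if icc > 0.80:
        return "almost perfect"
    if icc > 0.60:
        return "substantial"
    if icc > 0.40:
        return "moderate"
    return "below moderate"  # bands at or below 0.4 are not given in the text


# NEI VFQ-25 subscale ICCs as listed in the Table.
VFQ_SUBSCALE_ICC = {
    "General health": 0.89, "General vision": 0.54, "Ocular pain": 0.66,
    "Near activities": 0.85, "Distance activities": 0.89,
    "Social functioning": 0.76, "Mental health": 0.88,
    "Role difficulties": 0.88, "Dependency": 0.93, "Driving": 0.89,
    "Color vision": 0.55, "Peripheral vision": 0.78,
}

band_counts = Counter(agreement_band(v) for v in VFQ_SUBSCALE_ICC.values())
print(band_counts)
```

Applied to the tabulated subscale values, the band counts match the 7, 3, and 2 subscales reported in the text.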


Overall Distribution of Test–Retest Differences


Test–retest differences are plotted against mean scores, as described by Bland and Altman, in the Figure. Within each instrument, variability did not seem to depend on severity (Figure). Nevertheless, the NEI VFQ-25 scores were clustered toward the normal end of the range in these adults with strabismus, suggesting a possible ceiling effect. Comparing the first with the second administration, neither the AS-20 nor the NEI VFQ-25 demonstrated any significant regression to the mean (data not shown).
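The coordinates of such a plot can be sketched as follows: each patient contributes one point, with the mean of the two scores on the horizontal axis and the retest-minus-test difference on the vertical axis. The scores used here are hypothetical and the function names are illustrative; the half-width of 14.3 points is the AS-20 overall value from the Table.

```python
def bland_altman_points(test, retest):
    """Return (mean score, retest - test difference) for each patient."""
    return [((a + b) / 2, b - a) for a, b in zip(test, retest)]


def fraction_within(points, bias, half_width):
    """Fraction of differences lying within bias +/- half_width."""
    inside = sum(1 for _, diff in points if abs(diff - bias) <= half_width)
    return inside / len(points)


# Hypothetical scores for four patients at the two administrations.
test_scores = [50.0, 60.0, 70.0, 40.0]
retest_scores = [52.0, 58.0, 71.0, 60.0]

points = bland_altman_points(test_scores, retest_scores)
print(points)
print(fraction_within(points, bias=0.6, half_width=14.3))
```

In this made-up example the last patient changes by 20 points, which lies outside the AS-20 limits and would therefore be read as a real change rather than measurement variability.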


Posted Jan 17, 2017, in Ophthalmology.
