Purpose
To validate a third-generation performance-based measure of visual function titled “Assessment of Disability Related to Vision” (ADREV) in a study population of patients with diabetic retinopathy.
Design
Prospective, cross-sectional study.
Methods
Patients with nonproliferative or proliferative diabetic retinopathy, free from ocular comorbidity, were recruited from a single institute and completed the ADREV, the 25-Item National Eye Institute Visual Functioning Questionnaire (VFQ-25), and a clinical ophthalmic examination. Correlation, regression, and bootstrap analysis were conducted to determine the relationship between ADREV scoring and each of the study’s clinical and self-report measures of visual ability, while controlling for potential confounders.
Results
Ninety-one patients with diabetic retinopathy completed the study. Analysis showed that the ADREV total and subscale scores shared a stronger relationship with the clinical measures of visual function than did the VFQ total and subscale scores. Regression analysis revealed that binocular visual acuity, contrast sensitivity, and better eye visual field were the best predictors of ADREV performance.
Conclusions
The ADREV performance measure is a valid instrument for the assessment of disability related to vision in patients with diabetic retinopathy. Furthermore, the assessments provided by ADREV were more related to traditional clinical indicators of visual impairment than were the results of the self-report measure, specifically the VFQ-25.
Assessing visual health is a fundamental goal of ophthalmic practice. How one defines visual health has important implications for how it is measured and appraised. While the definition of health is subject to extensive debate, most individuals can agree that health involves the ability of an individual to perform common activities essential to modern life. Indeed, few would contest the idea that one lacks some measure of health if he or she is unable to perform essential activities of daily living such as walking, recognizing friendly faces, and finding dropped items. While these ideas seem self-evident, the assessment of one’s ability to perform activities of daily living is rarely used to assess visual health. Traditional assessments of vision typically involve 4 types of information: 1) symptoms acquired through a medical history; 2) signs obtained through physical examination; 3) laboratory data; and 4) anatomic data acquired through various imaging modalities. Importantly, the advancement of modern biotechnology has provided novel methods of technology-intensive evaluations in the form of more sensitive imaging and laboratory assessments of visual health, such as confocal scanning laser ophthalmoscopy, optical coherence tomography, and genetic analysis. However, it is evident that the results of these high-technology assessments are often unrelated to one’s ability to carry out daily activities. The disjunction between the current trend of high-technology assessment and the ability to function at a practical level is representative of the often encountered discrepancy between what is important to clinicians and what is important to patients.
Over the last decade, the discipline of visual health sciences has seen growing use of standardized medical histories in the form of quality-of-life (QoL) questionnaires to evaluate the health of the visual system. This trend can be viewed, at least in part, as a means of addressing the discordance between high-technology disease assessment and its associated lack of focus on what is practically important to patients. While the growing use of vision-specific QoL surveys has provided important information about the impact of visual disease from a patient’s perspective, this modality of evaluation comes with its own range of limitations. Health-related QoL surveys, including vision-specific questionnaires, are influenced by a very broad range of factors including patient personality, individual preferences, personal biases, mental health, desire to mislead, and desire to please. In an effort to develop new methods of assessing the visual health of patients, several investigators have developed and tested standardized protocols that evaluate the ability of individuals to perform visually intensive daily activities. Although these investigative techniques vary in the specific activities tested, they share a common focus on detecting changes in the very basic and practical visual abilities required to perform daily activities. The results of these studies have demonstrated that performance-based measures (PBMs) of visual function are valid and reliable measures of visual health. In the same tradition of research, the investigation presented here is intended to validate a third-generation PBM of visual function titled “Assessment of Disability Related to Vision” (ADREV) in a population of individuals with diabetic retinopathy.
Methods
Study participants were selected chronologically from individuals receiving care within the general ophthalmology clinics and private practices of retinal-vitreous specialists at Wills Eye Institute. Eligibility criteria included a diagnosis of either nonproliferative or proliferative diabetic retinopathy (DR), written and verbal English proficiency, no visually significant ocular comorbidity, and no active visual rehabilitation treatments at the time of study participation. Cataract assessment was made using the Lens Opacities Classification System II (LOCS II), with exclusion of patients with a 2+ cataract in any category. Study participants were not to have had retinal photocoagulation treatments within 1 month of study participation. Individuals with medical comorbidities that resulted in significant neurologic or other systemic manifestations that would have prevented them from completing the study’s protocol were also excluded. Patients were selected to reflect a broad range of visual impairment based on better eye visual acuity and were identified using a process that included a medical chart review and brief interview. All prospective patients were fully informed regarding the details of the study, and those who agreed to participate completed an informed consent process that was approved by the Wills Eye Institute Institutional Review Board (IRB) and was in compliance with the Declaration of Helsinki. Study participants completed a new third-generation PBM titled “Assessment of Disability Related to Vision” (ADREV). Patients were also asked to complete the 25-Item National Eye Institute Visual Functioning Questionnaire (VFQ-25).
All study participants received a standard ophthalmic clinical examination that included monocular and binocular visual acuity assessment using the Early Treatment Diabetic Retinopathy Study Lighthouse Chart 2nd edition, Pelli-Robson binocular contrast sensitivity testing, monocular 24-2 Humphrey visual fields (HVF) in each eye, and a slit-lamp examination including a clinical retinal examination in both eyes. Visual acuity measurements were converted to logMAR equivalents using the methods employed by The Ischemic Optic Neuropathy Decompression Trial Research Group. Finally, demographic data were collected using a standard form and included age, gender, ethnicity, and medical comorbidities.
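The logMAR conversion mentioned above can be illustrated with a minimal sketch. It assumes the standard definition, logMAR = log10(Snellen denominator / Snellen numerator), and does not reproduce the cited trial's conventions for off-chart acuities such as counting fingers or light perception. The function name `snellen_to_logmar` is hypothetical.

```python
import math


def snellen_to_logmar(numerator: float, denominator: float) -> float:
    """Convert a Snellen fraction (e.g. 20/40) to its logMAR equivalent.

    Assumes the standard definition logMAR = log10(denominator / numerator).
    Off-chart acuities (counting fingers, hand motion, light perception)
    are NOT handled here; the study's rules for them are not given in the text.
    """
    if numerator <= 0 or denominator <= 0:
        raise ValueError("Snellen terms must be positive")
    return math.log10(denominator / numerator)


if __name__ == "__main__":
    for num, den in [(20, 20), (20, 40), (20, 200)]:
        print(f"{num}/{den} -> logMAR {snellen_to_logmar(num, den):.2f}")
```

Under this definition, higher logMAR values indicate worse acuity, which is why acuity correlates negatively with performance scores in the results.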
The ADREV instrument is based on a prior investigation involving an alternative PBM titled “Assessment of Function Related to Vision” (AFREV). The AFREV instrument validated 5 tests of performance of visually intensive tasks based on item response theory in the form of Rasch analysis and significant relationships with traditional clinical measures of ophthalmic status, as well as self-reported vision-specific QoL measured by the National Eye Institute’s Visual Function Questionnaire (VFQ). The instrument employed within the context of this investigation, titled the ADREV, was developed based on the findings of the AFREV experiment and the details of its design have been documented elsewhere. The ADREV is composed of 9 tests: 1) reading in reduced illumination; 2) facial expression recognition; 3) computerized motion detection; 4) recognizing street signs; 5) locating objects; 6) ambulation; 7) placing a peg into different-sized holes; 8) telephone simulation; and 9) matching socks. A description of each test item is presented in Table 1. Each test of performance is graded from 0 to 7 on an interval scale determined through Rasch analysis, where 0 represents the inability to perform the test and 7 indicates a perfect score. In addition to the subscale evaluations, the 9 tests are summed to produce an ADREV total score ranging from 0 to 63. The subscales can be employed and interpreted independently from the ADREV total score. Average test administration time, including patient instruction, is approximately 30 minutes. The ADREV has been previously validated in a study population involving patients with age-related macular degeneration through comparison with standard clinical measures of visual function and self-reported QoL.
Test | Description |
---|---|
1. Reading in reduced illumination | Near vision is checked by obtaining the smallest Jaeger line; then 1 at a time, 7 sentences, text size corresponding to 2 Jaeger lines above the smallest Jaeger read, are presented. Light illumination is reduced after each sentence is read. The corresponding score is as follows: 1 point, able to read at 200 foot candles (FC), 2 at 150 FC, 3 at 100 FC, 4 at 50 FC, 5 at 25 FC, 6 at 10 FC, and 7 at 5 FC. The highest score is 7 and lowest score is 0. |
2. Facial expression recognition | Seven full-face professional, colored photos of varying sizes and facial expressions (angry, sad, happy, or surprised) are presented on a computer screen at a distance of 1/2 meter. The patient receives 1 point for recognizing the right facial expression. Score ranges from 7 to 0. |
3. Computerized motion detection | A large black cross against a white background on a computer screen provides a point of fixation. While fixating on the cross, 1 at a time, 14 balls of different sizes and colors move diagonally across the screen from either the right or the left side at a constant speed. Yellow, red, or blue balls are used. The patient is asked to count the number of moving balls. Each ball seen counts as 1/2 point. Highest score is 7 and lowest score is 0. |
4. Recognizing street signs | Seven written word signs ranging from large to small are read at a distance of 4 meters. One character in each sign was changed from familiar phrases, making the word difficult to guess. For example, the top sign reads SUGAR DANE, which is similar to the more familiar sugar cane. The patient is instructed not to guess. One point is given for each sign read correctly. Highest score is 7 and lowest score is 0. |
5. Locating objects | Fourteen red and beige boxes of different sizes are scattered around the testing room (4 × 2 meters). Sample boxes are shown before test is started. The patient attempts to locate the boxes while seated. Each box found is worth 1/2 point. Highest score is 7 and lowest score is 0. |
6. Ambulation test | A 4.5-meter predefined mobility course was designed, with taped horizontal, vertical, and diagonal lines and objects made of styrofoam in the path. Several objects were also suspended from the ceiling along the path. Patients were permitted to use a mobility aid (eg, cane). The score is based on number of obstacles hit. Each obstacle successfully avoided is awarded 1/3 point. The highest score is 7 and lowest is 0. |
7. Placing a peg into different-sized holes | Seven (9 × 3 × 3/8-in) wooden boards were created with 1 hole of varying size and location. A wooden stand was created with slots to hold the boards 1 at a time at different angles. The patient is asked to place the peg directly in the hole without touching the board. One point is awarded for successful completion. |
8. Telephone simulation | Seven calculators of different sizes are used to simulate dialing a telephone. The numbers are randomly rearranged to eliminate memory being used to locate the telephone numbers. The numbers are printed in different font sizes and presented to patients from largest to smallest. The patient is asked to press 7 different numbers on each of the various-sized calculators. The patient must find all 7 numbers to receive a point for that calculator. For each number correctly “dialed,” the patient receives 1 point. The highest score is 7 and lowest is 0. |
9. Matching socks | Seven differently patterned, dark-colored socks are hung on a board with a gray background. The patients are not permitted to touch the socks hanging on the wall. The patient sits in front of a table 1 meter wide so as to be 1 meter from the socks. On the table is a group of 10 socks, 7 of which are the pairs for the hanging socks. The patient is asked to match the socks on the table with those on the board. One point is awarded for each correctly matched sock. The highest score is 7 and lowest is 0. |
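The scoring rule described for the ADREV (9 subscales, each graded 0 to 7, summed to a total of 0 to 63) can be sketched as follows. This is an illustrative sketch only: it assumes each subscale score has already been placed on the 0 to 7 scale, and it does not reproduce the Rasch calibration. The names `ADREV_TESTS` and `adrev_total` are hypothetical.

```python
from typing import Mapping

# The 9 ADREV tests as listed in the text (identifiers are illustrative).
ADREV_TESTS = (
    "reading_in_reduced_illumination",
    "facial_expression_recognition",
    "computerized_motion_detection",
    "recognizing_street_signs",
    "locating_objects",
    "ambulation",
    "placing_peg_into_holes",
    "telephone_simulation",
    "matching_socks",
)


def adrev_total(subscores: Mapping[str, float]) -> float:
    """Sum the 9 subscale scores (each 0-7) into a total score (0-63)."""
    missing = [name for name in ADREV_TESTS if name not in subscores]
    if missing:
        raise ValueError(f"missing subscale scores: {missing}")
    for name in ADREV_TESTS:
        if not 0 <= subscores[name] <= 7:
            raise ValueError(f"{name} must be between 0 and 7")
    return sum(subscores[name] for name in ADREV_TESTS)


if __name__ == "__main__":
    perfect = {name: 7 for name in ADREV_TESTS}
    print(adrev_total(perfect))
```

Because each subscale can be interpreted on its own, the sketch keeps the subscale scores in a mapping rather than collapsing them before validation.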
The VFQ was selected as the study’s primary QoL measurement as it is accepted as a reliable and valid means of studying the self-perceived impact of visual impairment upon vision-specific QoL. The VFQ is a generic vision-specific instrument that was designed to study a wide variety of ocular diseases and a number of studies have used the VFQ to study the self-reported QoL of patients with DR. The 25-item VFQ is composed of 11 vision-specific subscales that address the following domains: 1) general vision, 2) near vision, 3) distance vision, 4) ocular pain, 5) social functioning, 6) mental health, 7) role difficulties, 8) dependency, 9) driving, 10) color vision, and 11) peripheral vision. Each subscale is scored from 0 to 100, where 100 represents self-perceived perfect functioning and 0 represents the greatest level of difficulty in a given domain. The 11 subscales are also averaged to produce a VFQ total score ranging from 0 to 100. The average test administration time for the VFQ is approximately 10 minutes.
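The VFQ total score described above, an average of the subscale scores on a 0 to 100 scale, can be sketched as below. How unanswered subscales are treated is an assumption here: the sketch averages only the subscales that were answered (for example, a participant who does not drive may have no driving score). The function name `vfq_total` is hypothetical.

```python
from typing import Mapping, Optional


def vfq_total(subscales: Mapping[str, Optional[float]]) -> float:
    """Average the answered VFQ subscale scores (each 0-100).

    Assumption: a subscale recorded as None was not answered and is
    left out of the average rather than counted as 0.
    """
    answered = [score for score in subscales.values() if score is not None]
    if not answered:
        raise ValueError("no subscale scores available")
    for score in answered:
        if not 0 <= score <= 100:
            raise ValueError("subscale scores must be between 0 and 100")
    return sum(answered) / len(answered)


if __name__ == "__main__":
    example = {"general_vision": 60.0, "near_vision": 75.0, "driving": None}
    print(vfq_total(example))
```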
Data analysis was conducted in several steps. First, all variables were plotted and reviewed for outliers that might represent data entry errors. Descriptive statistics were computed for demographic, past medical, clinical ophthalmic, QoL, and PBM measures. Independent t tests were used to determine if any differences existed in average VFQ and ADREV total and subscale scoring based on dichotomous demographics. Analysis of variance (ANOVA) was conducted to identify statistically significant differences in mean VFQ and ADREV total and subscale scores with respect to categorical variables that lacked an inherent conceptual hierarchy. Spearman’s rho was used to determine correlative relationships between the study’s clinical variables and both the VFQ and ADREV total and subscale scores, as well as between the VFQ and ADREV total and subscale scores. Spearman’s nonparametric statistic was chosen to standardize comparisons as selected portions of the data provided by the VFQ total and subscale scores do not meet the requirements of interval variables. Correlation coefficients (r) were considered small if less than 0.3, medium if between 0.3 and 0.5, and large if greater than 0.5. In addition, a robust regression analysis using Huber’s method was used to determine the clinical measures that are most associated with each of the ADREV total and subscale scores, while controlling for age, gender, ethnicity, and total number of medical comorbidities. This approach was chosen for several reasons. Robust regression is a valid statistical method for data that meet the criteria for analysis with ordinary least squares regression (OLS); however, it is also resistant to the effects of data outliers and can analyze non-normally distributed, categorical, and binary data. All variables were entered into the regression equations simultaneously and no automated variable selection process was utilized. 
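As an illustration of the correlation step, the sketch below computes Spearman’s rho as the Pearson correlation of tie-averaged ranks and labels the magnitude using the thresholds stated above (small below 0.3, medium from 0.3 to 0.5, large above 0.5). It is a standard-library sketch with hypothetical function names, not the software used in the study.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation computed on tie-averaged ranks."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two sequences of equal length (at least 3)")
    return pearson(average_ranks(x), average_ranks(y))


def effect_size_label(r: float) -> str:
    """Label |r| using the thresholds stated in the text."""
    magnitude = abs(r)
    if magnitude < 0.3:
        return "small"
    if magnitude <= 0.5:
        return "medium"
    return "large"


if __name__ == "__main__":
    # Made-up illustrative data: worse acuity (higher logMAR), lower score.
    logmar = [0.0, 0.1, 0.3, 0.5, 1.0]
    score = [60, 55, 50, 41, 20]
    rho = spearman_rho(logmar, score)
    print(rho, effect_size_label(rho))
```

Because rho depends only on ranks, it does not require interval-level data, which matches the stated reason for choosing it over a parametric coefficient.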
Prior to regression equation construction, a correlation matrix using Spearman’s rho was completed using all of the study’s independent and control variables to identify and exclude collinear relationships from the regression analysis based on correlations of r ≥ 0.9. Scatter plots were constructed between the measures of clinical ophthalmic status and the ADREV total and subscale scores in order to detect any nonlinear relationships. Independent and control variables were entered into the regression equation irrespective of the presence or absence of significant bivariate relationships with ADREV total or subscale scoring. In keeping with accepted statistical practice, the number of independent and control variables included in each regression equation was equal to or less than 1/10 the total number of cases in our sample population. Residual statistics for each regression equation were plotted to identify highly significant outliers that might have compromised the explanatory power of each regression equation. A supplementary bootstrap analysis of 1000 random resampled data sets was conducted to test the external validity of the relationships identified during initial regression modeling. Where appropriate, all statistical tests were run in a 2-tailed fashion and corrections for multiple comparisons were made using the Bonferroni method. A power analysis based on the effect sizes noted in the AFREV experiment indicated that a sample population of 90 individuals would be adequate to detect an r = 0.3 with 80% power and α = 0.05.
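The sample-size claim and the Bonferroni adjustment can be checked with a short sketch. The text does not state which method the power analysis used; the version below assumes the common Fisher z approximation for a 2-tailed test of a correlation coefficient, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_correlation(r: float, alpha: float = 0.05,
                                power: float = 0.80) -> int:
    """Approximate n needed to detect correlation r (2-tailed test).

    Assumes the Fisher z approximation:
        n = ((z_{1 - alpha/2} + z_{power}) / atanh(r))^2 + 3
    """
    if not 0 < abs(r) < 1:
        raise ValueError("r must be nonzero and strictly between -1 and 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(((z_alpha + z_power) / math.atanh(abs(r))) ** 2 + 3)


def bonferroni_threshold(alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance threshold under the Bonferroni method."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return alpha / n_comparisons


if __name__ == "__main__":
    print(sample_size_for_correlation(0.3))   # required n under the approximation
    print(bonferroni_threshold(0.05, 120))    # threshold for 120 comparisons
```

Under this approximation the required sample size for r = 0.3 falls below 90, which is consistent with the stated adequacy of a 90-person sample.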
Results
Our final sample population consisted of 91 DR patients. The mean age of participants was 61 years (standard deviation [SD] ± 11). The study population had unequal representation based on gender, with 35% (n = 32) being male. There were nearly equal numbers of individuals of European (n = 47) and African (n = 40) extraction, with the remainder of the group being composed of 3 Hispanic individuals and 1 person of Asian descent. The average number of medical comorbidities was 2.7 (SD ± 1.6). The 5 most common comorbidities were hypertension (59.3%), hypothyroidism (20.9%), hypercholesteremia (6.6%), osteoarthritis (6.6%), and hyperthyroidism (6.6%). The population consisted of 178 eyes diagnosed with DR, with 73 eyes (41%) having a diagnosis of nonproliferative DR, while the remaining 105 eyes (59%) had a diagnosis of proliferative DR. Descriptive statistics of the study population’s clinical ophthalmic assessments, ADREV, and VFQ total and subscale scores are presented in Table 2. The study group included a wide range of better eye visual acuity measurements, from 20/20 through to no light perception. Worse eye visual field mean deviation values were equally varied and included individuals with full visual fields through to end-stage peripheral visual loss. ADREV total and subscale scores represented the full range of possible scoring. VFQ total and subscale scores were similar to the results published by Tranos and associates, but somewhat lower than those published by Klein and associates. A comparison of mean ADREV total and subscale scores with respect to gender revealed only 1 significant difference, with females scoring slightly higher (5.85) than males (5.19) on the ambulation test. There were no significant relationships between patient performance on the ADREV total and subscale scores and age, number of comorbid medical conditions, or self-reported ethnicity.
Measure | Mean | Median | SD | Skew | Range | n |
---|---|---|---|---|---|---|
Clinical variables | ||||||
Binocular visual acuity | .358 | .300 | 1.66 | .279 | 0–1.9 | 91 |
Better eye visual acuity | .406 | .340 | .350 | 2.65 | 0–1.9 | 91 |
Worse eye visual acuity | .829 | .620 | .682 | 1.46 | 0–2.8 | 91 |
Contrast sensitivity | 1.25 | 1.25 | .272 | −1.28 | .15–1.65 | 91 |
Mean deviation better eye | −9.81 | −7.78 | 7.16 | −1.03 | −28.96–−0.06 | 91 |
Mean deviation worse eye | −13.92 | −10.38 | −10.38 | −.893 | −40–−1.0 | 91 |
Assessment of disability related to vision total score | 44.1 | 45.8 | 9.96 | −.89 | 15–61 | 91 |
Reading in reduced illumination | 5.0 | 5.0 | 1.63 | −.62 | 0–7 | 91 |
Facial expression recognition | 4.7 | 5.0 | 1.81 | −.81 | 0–7 | 91 |
Motion detection | 5.3 | 5.0 | 1.14 | −.01 | 2–7 | 91 |
Recognizing street signs | 3.8 | 4.0 | 1.58 | −.70 | 0–6 | 91 |
Locating objects | 5.0 | 5.5 | 1.28 | −.73 | 2–7 | 91 |
Ambulation | 5.6 | 5.7 | .97 | −1.45 | 2–7 | 91 |
Placing peg into holes | 4.6 | 5.0 | 1.61 | −.47 | 0–7 | 91 |
Telephone simulation | 5.1 | 5.0 | 1.83 | −.93 | 0–7 | 91 |
Matching socks | 5.0 | 6.0 | 2.25 | −.80 | 0–7 | 91 |
Visual function questionnaire total score | 70.0 | 72.5 | 18.78 | −.66 | 20–98 | 91 |
General vision | 61.7 | 60.0 | 18.77 | −.43 | 20–100 | 91 |
Ocular pain | 78.3 | 87.5 | 20.82 | −.71 | 13–100 | 91 |
Near activities | 61.5 | 58.3 | 23.23 | −.03 | 8–100 | 91 |
Distance activities | 67.5 | 66.7 | 22.8 | −.43 | 17–100 | 91 |
Social functioning | 81.2 | 87.5 | 22.3 | −1.25 | 13–100 | 91 |
Mental health | 64.1 | 68.8 | 27.0 | −.76 | 0–100 | 91 |
Role difficulties | 59.2 | 62.5 | 32.0 | −.24 | 0–100 | 90 |
Dependency | 78.0 | 75.0 | 27.1 | −1.24 | 0–100 | 91 |
Driving | 68.9 | 75.0 | 27.1 | −1.24 | 0–100 | 54 |
Color vision | 82.7 | 100 | 22.6 | −1.12 | 25–100 | 91 |
Peripheral vision | 69.2 | 75.0 | 26.9 | −.24 | 25–100 | 91 |
Bivariate analysis between the study’s clinical variables and the ADREV total and subscale measures demonstrated that 53 of the 60 comparisons (88%) were significant at P < .05 (Table 3). After adjustment for multiple comparisons, 49 of the 60 comparisons (82%) were statistically significant at P < .0007 (Table 3). After correction for multiple comparisons, all but 1 of the ADREV’s scales were correlated with 1 or more of the clinical measures of visual function, with only the ambulation test having no significant bivariate correlation to the clinical measures. The 4 strongest relationships were: 1) binocular visual acuity and ADREV total score (r = −.780); 2) binocular visual acuity and recognizing street signs (r = −.772); 3) better eye visual acuity and recognizing street signs (r = −.731); and 4) better eye visual acuity and ADREV total score (r = −.725). The mean absolute correlation magnitude for all significant relationships between the ADREV, its subscales, and all clinical measures was r = .515 (SD ± .108).
Scale | Binocular Visual Acuity (n) | Better Eye Visual Acuity (n) | Worse Eye Visual Acuity (n) | Contrast Sensitivity (n) | Better Eye Mean Deviation (n) | Worse Eye Mean Deviation (n) |
---|---|---|---|---|---|---|
Assessment of disability related to vision | −.780 a (91) | −.725 a (91) | −.566 a (91) | .610 a (91) | .571 a (91) | .564 a (91) |
Reading in reduced illumination | −.418 a (91) | −.337 b (91) | −.280 b (91) | .143 b (91) | .190 b (91) | .169 b (91) |
Facial expression recognition | −.667 a (91) | −.637 a (91) | −.491 a (91) | .549 a (91) | .417 a (91) | .487 a (91) |
Motion detection | −.538 a (91) | −.519 a (91) | −.458 a (91) | .475 a (91) | .624 a (91) | .508 a (91) |
Recognizing street signs | −.772 a (91) | −.731 a (91) | −.404 a (91) | .603 a (91) | .559 a (91) | .436 a (91) |
Locating objects | −.636 a (91) | −.605 a (91) | −.472 a (91) | .519 a (91) | .469 a (91) | .537 a (91) |
Ambulation | −.226 b (91) | −.218 b (91) | −.031 b (91) | .153 b (91) | .105 b (91) | .137 b (91) |
Placing peg in holes | −.397 a (91) | −.411 a (91) | −.408 a (91) | .441 a (91) | .363 a (91) | .418 a (91) |
Telephone simulation | −.623 a (91) | −.560 a (91) | −.446 a (91) | .473 a (91) | .424 a (91) | .394 a (91) |
Matching socks | −.491 a (91) | −.464 a (91) | −.373 a (91) | .395 a (91) | .452 a (91) | .371 a (91) |
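The “mean absolute correlation magnitude” summary can be illustrated on a single row of the table above. The sketch uses only the six coefficients reported for the ADREV total score, so its output is a row-level illustration and not the study-wide figure. Whether the reported SD is a sample or population value is not stated; the sample SD is assumed here, and the function name is hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple

# Coefficients reported for the ADREV total score against the six
# clinical measures (signs as printed in the table).
ADREV_TOTAL_ROW = [-0.780, -0.725, -0.566, 0.610, 0.571, 0.564]


def mean_absolute_correlation(coefficients: Sequence[float]) -> Tuple[float, float]:
    """Return the mean and sample SD of the absolute correlation values."""
    magnitudes = [abs(r) for r in coefficients]
    return mean(magnitudes), stdev(magnitudes)


if __name__ == "__main__":
    avg, sd = mean_absolute_correlation(ADREV_TOTAL_ROW)
    print(f"mean |r| = {avg:.3f}, SD = {sd:.3f}")
```

Taking absolute values first matters because acuity (logMAR) correlates negatively with performance while contrast sensitivity and mean deviation correlate positively; averaging signed values would cancel them.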
In contrast, the correlative analysis between the VFQ total and subscale scores and the study’s clinical measures of visual function revealed that 63 of the 72 comparisons (88%) were significant at P < .05 (Table 4). After correction for multiple comparisons, 52 of the 72 comparisons (72%) were statistically significant at P < .0006. The strongest 4 relationships were: 1) better eye visual acuity and mental health (r = −.550); 2) worse eye visual acuity and VFQ total score (r = −.542); 3) worse eye MD and VFQ total score (r = .538); and 4) better eye visual acuity and VFQ total score (r = −.533). The mean absolute correlation magnitude between the VFQ, its subscales, and all of the clinical measures was r = .438 (SD ± .054). A 1-sample t test between the mean absolute correlation magnitude for all significant correlative relationships between the ADREV–clinical measures analysis and the VFQ–clinical measures analysis revealed that the higher mean correlation between the ADREV and the clinical measures was statistically significant (P < .001).
Scale | Binocular Visual Acuity (n) | Better Eye Visual Acuity (n) | Worse Eye Visual Acuity (n) | Contrast Sensitivity (n) | Better Eye Mean Deviation (n) | Worse Eye Mean Deviation (n) |
---|---|---|---|---|---|---|
Visual function questionnaire | −.502 a (91) | −.533 a (91) | −.542 a (91) | .497 a (91) | .470 a (91) | .538 a (91) |
General vision | −.456 a (91) | −.512 a (91) | −.396 a (91) | .378 a (91) | .317 b (91) | .437 a (91) |
Ocular pain | −.177 b (91) | −.102 b (91) | −.126 b (91) | .060 b (91) | −.005 b (91) | .053 b (91) |
Near activities | −.401 a (91) | −.476 a (91) | −.475 a (91) | .400 a (91) | .458 a (91) | .409 a (91) |
Distance activities | −.390 a (91) | −.419 a (91) | −.390 a (91) | .388 a (91) | .335 b (91) | .331 b (91) |
Social functioning | −.421 a (91) | −.425 a (91) | −.399 a (91) | .377 (91) | .377 a (91) | .414 a (91) |
Mental health | −.519 a (91) | −.550 a (91) | −.462 a (91) | .399 a (91) | .348 b (91) | .526 a (91) |
Role difficulties | −.418 a (91) | −.461 a (91) | −.476 a (91) | .392 a (91) | .389 a (91) | .390 a (91) |
Dependency | −.410 a (91) | −.405 a (91) | −.390 a (91) | .238 b (91) | .314 b (91) | .403 a (91) |
Driving | −.208 b (91) | −.288 b (91) | −.184 b (91) | .240 b (91) | .362 b (91) | .342 b (91) |
Color vision | −.372 a (91) | −.397 a (91) | −.323 b (91) | .373 a (91) | .351 b (91) | .392 a (91) |
Peripheral vision | −.311 b (91) | −.312 b (91) | −.515 a (91) | .448 a (91) | .430 a (91) | .516 a (91) |
Table 5 summarizes a direct comparison between the ADREV and VFQ total and subscale scores. Initial analysis revealed that 81 of the 120 comparisons (68%) were significant at P < .05. After correction for multiple comparisons, 42 relationships (35%) were significant at P < .0004. The 4 strongest correlations were between: 1) motion detection and VFQ total score (r = .523); 2) motion detection and near activities (r = .520); 3) facial expression recognition and VFQ total score (r = .514); and 4) facial expression recognition and mental health (r = .501). The mean absolute correlation magnitude between the ADREV and VFQ total and subscale scores was r = .431 (SD ± .048).