Highlights
- •
A naïve Bayes classifier for keratoconus detection is presented.
- •
It uses primary Placido-based corneal indices, described in literature and computed directly from the image of the disks reflected on the cornea.
- •
The classifier has been tested and compared with deterministic indices on a population of 60 corneas, as well as under artificially noisy data.
- •
It showed excellent discrimination capability and robustness.
- •
The developed approach is simple, platform-independent and reproducible on any Placido-based topographer.
Abstract
Purpose
To evaluate in a sample of normal and keratoconic eyes a simple Bayesian network classifier for keratoconus identification that uses previously developed topographic indices, calculated directly from the digital analysis of the Placido ring images.
Methods
A comparative study was performed on a total of 60 eyes from 60 patients (age 20–60 years) from the Department of keratoconus of INVISION Ophthalmology clinic (Almería, Spain). Patients were divided into two groups depending on their preliminary diagnosis based on the classical topographic criteria: a control group without topographic alteration (30 eyes) and a keratoconus group (30 eyes). The keratoconus group included all grades except grade IV with excessively distorted corneal topography. All cases were examined using the CSO topography system (CSO, Firenze, Italy), and primary corneal Placido-indices were computed, as described in literature. Finally, a classifier was built by fitting a conditional linear Gaussian Bayesian network to the data, using the 5- and 10-fold cross-validation. For comparison, the original data were perturbed with random white noise of different magnitude.
Results
The naïve Bayes classifier showed perfect discrimination ability among normal and keratoconic corneas, with 100% of sensibility and specificity, even in the presence of a very significant noise.
Conclusions
The Bayesian network classifiers are highly accurate and proved a stable screening method to assist ophthalmologists with the detection of keratoconus, even in the presence of noise or incomplete data. This algorithm is easily implemented for any Placido topographic system.
1
Introduction
The most common technology used to measure corneal topography is the Placido disk system , based on projecting an illuminated pattern of concentric rings or mires onto the surface of the cornea. The image of these rings is captured and digitized along a fixed number of meridians, which provides several thousand of points in a close-to-concentric pattern. The software of the topographer processes this data to yield elevation, curvature and other parameters using either publicly available (see e.g. ) or proprietary algorithms. The result is usually represented as color maps, which permits subjective, qualitative analysis of the data; topographic indices aim at providing some objectivity to this analysis. They typically focus on either the entire corneal surface (whole cornea indices) or on a specific subarea (regional indices), and return one or several numerical values characterizing the topography. Since a single index rarely captures the whole complexity of the cornea, composite indices are calculated combining two or more indices.
Many of these indices were developed specifically for early detection and classification of keratoconus (KC), as well as for discrimination between KC and other abnormalities such as contact lens-induced warpage and other forms of irregular astigmatism. Recall that KC is a pathology that produces a cone-shaped deformity in the cornea as a result of degeneration of corneal stroma tissue and subsequent bio-mechanical alterations . This is the cause of a significant increase of the anterior corneal irregularity and a deterioration of the visual quality, since this first surface of the cornea has a very relevant optical function in the eye.
The above mentioned indices take into account the geometry and optical properties of the anterior corneal surface , although with the widespread use of Scheimpflug cameras more KC characterizations use also pachimetry . There have been attempts to complement them with more sophisticated tools such as neural networks .
In , a set of indices that measure the irregularity of the anterior corneal surface, computed directly from the image of the Placido disks reflected on the cornea, was put forward. This approach was platform-independent and allowed bypassing the surface or curvature reconstruction step that is currently performed by the software of every commercial Placido topographer. Several basic (or primary) indices were built directly from easy-to-measure parameters of the digitized images of the rings reflected on the cornea. Also, compound metrics were proposed (such as the generalized linear model or the classification trees) by combining some of the primary indices to improve their efficiency.
One of the conclusions of was that the primary indices allow to discriminate, with excellent accuracy, between normal eyes and eyes with keratoconic corneas. As expected, combined indices were even more efficient. Nevertheless, both the generalized linear model and the classification trees have a natural intrinsic weakness: they deal poorly with noisy or incomplete data, something that occurs very often in clinical practice. The goal of this work is to replace these metrics with a new combined index that based on the values of the primary indices from calculated from possibly noisy data, could yield no less accurate but more robust results. This was made possible by considerable recent advances experiences in the fields of computational statistics, artificial intelligence and machine learning.
Bayesian networks (BNs) are a flexible probabilistic tool that has been successfully used for solving complex problems, including inference, classification and regression . They consist of a qualitative and a quantitative component. The former is a directed acyclic graph (DAG), which represents the dependencies between the variables; the latter is a set of conditional probability distributions, from which the joint probability distribution can be recovered. The network structure (the DAG) can be learned automatically or fixed manually using expert knowledge about the problem. A typical DAG structure is the naïve Bayes (NB) model, which has been extensively used in classification problems due to its good results in terms of accuracy . In a NB classifier, there is a class variable which is the only parent of the rest of variables ( explanatory variables ) in the DAG. This model works on Bayes theorem of probability to predict the class data set assuming conditional independence among predictors, given the class. Naïve Bayes model is fast, easy to build, requires only a small number of training data to estimate its parameters, and is known to outperform even some highly sophisticated classification methods .
In this paper, the topographic indices from were used to build a BN, or more precisely, a naïve Bayes classifier, and to evaluate its usefulness for the detection of corneal topographic keratoconus, comparing its accuracy and robustness to faulty measurements with the compound indices form , using for that both synthetic and real patients data.
2
Patients and methods
2.1
Patients
This study uses data of 60 eyes from 60 patients aged 20–60 years (mean 33.86 ± 11.46 years) from the Department of Keratoconus of INVISION Ophthalmology clinic (Almería, Spain). They were divided into two groups according to whether they were diagnosed with KC or not: a control group (30 eyes) and a KC group (30 eyes). Inclusion in the KC group was based on the criteria established for the diagnosis of this corneal pathology and the absence of ocular surgical history that could have modified it. The diagnostic criteria were: asymmetric bow tie in the topographic image and at least one sign of keratoconus in the examination with the slit lamp, such as stromal thinning, conical protrusion of the cornea at the apex, Fleischer ring, Vogt striae or anterior stromal scar. For contact lens wearers it was recommended to stop wearing them for two weeks for soft contact lenses and four weeks for gas permeable rigid contact lenses. The exclusion criteria were the existence of any ocular pathology and the presence of advanced KC (grade 4 according to the Alió-Shabayek classification system ). In cases of unilateral keratoconus, the affected eye was always included in the study. For patients with bilateral keratoconus, only one eye was randomly selected for the study.
The control group included eyes with normal corneal topography with no signs of irregularity according to the indices mentioned above, without ocular pathology or previous ocular surgery. In this control group, only one eye from each patient was selected for inclusion in the study. All patients were informed about the objectives of the study, voluntarily agreed to their participation and signed an informed consent document in accordance with the Declaration of Helsinki.
For all patients, corneal topographic analysis was carried out with CSO topography system (CSO, Firenze, Italy). This topographer analyses a total of 6144 corneal points of a corneal area delimited by a circular ring defined by an interior radius and an exterior radius of 10 mm with respect to the corneal vertex. The software of this system, EyeTop2005 (CSO, Florence, Italy), allows access to the raw data and their export to an ASCII file, necessary for the calculation of the primary corneal indices described next.
2.2
Corneal indices
In , a methodology for building metrics based directly on the image captured by the digital camera of the Placido-based topographers was put forward, and a set of corneal indices was introduced as an additional tool for keratoconus (and in general, cornea irregularity) detection and classification. Later their performance was analyzed in . Some of these indices are described next for the reader’s convenience.
At the beginning, the digitized points captured by the camera of a Placido disk corneal topographer are grouped in mires; in defining the indices, only data from complete rings are used, limiting the number of rings to the maximum of N ≤ 15. For each mire, numbered by the integer value 1 ≤ k ≤ N starting from the innermost one, the best-fit circle was calculated, namely its radius, denoted by AR ( k ) (from “average ring radius” of the k th ring), and the position of its center C k , in Cartesian coordinates. In what follows, only the radii AR (1) and AR (4) of the 1 st and 4 th mires are used. Additional four primary indices were calculated, the first three labeled as PI n (from “Placido Irregularity indices”):
- 1
The diameter of the set of centers C k (normalized by the total number of rings N ),
<SPAN role=presentation tabIndex=0 id=MathJax-Element-1-Frame class=MathJax style="POSITION: relative" data-mathml='PI1=1Nmax1≤n,m≤N∥Cn−Cm∥,’>PI1=1?max1≤?,?≤?∥??−??∥,PI1=1Nmax1≤n,m≤N∥Cn−Cm∥,
PI 1 = 1 N max 1 ≤ n , m ≤ N ∥ C n − C m ∥ ,
where ∥·∥ is the standard Euclidean norm in <SPAN role=presentation tabIndex=0 id=MathJax-Element-2-Frame class=MathJax style="POSITION: relative" data-mathml='ℝ2′>ℝ2ℝ2
ℝ 2
. PI 1 represents the maximum distance between the centers of individual mires. Thus, large values of PI 1 correlate with substantial asymmetry and off-centering of the rings.
- 2
The total drift or deviation in the consecutive centers,
<SPAN role=presentation tabIndex=0 id=MathJax-Element-3-Frame class=MathJax style="POSITION: relative" data-mathml='PI2=1N−1∑1≤n≤N−1∥Cn+1−Cn∥.’>PI2=1?−1∑1≤?≤?−1∥??+1−??∥.PI2=1N−1∑1≤n≤N−1∥Cn+1−Cn∥.
PI 2 = 1 N − 1 ∑ 1 ≤ n ≤ N − 1 ∥ C n + 1 − C n ∥ .
PI 2 represents the length of the path connecting the consecutive centers of individual mires. Large values of PI 2 mean an important drift in the centers’ positions, and this index is therefore another measure of asymmetry.
- 3
For the third index the data from mires are fit with an ellipse, and the dispersion of the values of the axis ratios r k (see for details) is measured, defining
<SPAN role=presentation tabIndex=0 id=MathJax-Element-4-Frame class=MathJax style="POSITION: relative" data-mathml='PI3=1N∑1≤k≤N(rk−r¯)2,wherer¯=1N∑1≤k≤Nrk.’>PI3=1?∑1≤?≤?(??−?⎯⎯)2‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾√,where?⎯⎯=1?∑1≤?≤???.PI3=1N∑1≤k≤N(rk−r¯)2,wherer¯=1N∑1≤k≤Nrk.
PI 3 = 1 N ∑ 1 ≤ k ≤ N ( r k − r ¯ ) 2 , where r ¯ = 1 N ∑ 1 ≤ k ≤ N r k .
In other words, PI 3 measures the variability in the eccentricity of the mires approximated by ellipses.
- 4
SL stands for the absolute value of the slope of the standard linear regression of the coordinates of the centers C k , so that high values of SL correspond to a vertical alignment of the centers. SL measures the orientation of the path connecting the centers, so that it has high values when the centers’ drift is almost vertical and low values when it is horizontal.
Additionally, a combined metric called GLPI (from Generalized Linear Placido Irregularity index) was introduced in and improved later in . It uses the indices PI 1 , PI 3 , AR (4) and SL as input, and takes continuous values between 0 and 100 (0% corresponding to a totally normal, and 100%, to a totally altered cornea), see for details.
2.3
Bayesian networks
A graph is a set of vertices or nodes and a set of edges or arcs connecting them. A directed graph is one in which all the arcs have a direction (i.e., they point from one node to another). A directed graph is acyclic (or in short, DAG) when it does not contain loops or closed paths (cycles), as the nodes are traversed following the direction of the edges, see e.g. Fig. 1 . A Bayesian network is a compact representation of the joint probability distribution over a set of variables <SPAN role=presentation tabIndex=0 id=MathJax-Element-5-Frame class=MathJax style="POSITION: relative" data-mathml='X=X1,…,Xn’>?=(?1,…,??)X=X1,…,Xn
X = X 1 , … , X n
whose independence relations are encoded by the structure of an underlying DAG . Formally, a BN is defined as a pair ( G , P ), where G is a DAG and P is a set of conditional probability distributions (CPDs). G is composed of nodes that represent random variables ( X ), and links between pairs of nodes representing statistical dependence. Each node X i has an associated probability distribution <SPAN role=presentation tabIndex=0 id=MathJax-Element-6-Frame class=MathJax style="POSITION: relative" data-mathml='pXi|PaXi’>?(??|Pa(??))pXi|PaXi
p X i | Pa X i
, where Pa( X i ) denotes the parents of X i in the DAG G . The joint probability distribution over all the variables in the network can be recovered as the product of the CPDs attached to each node (chain rule):
pX1,…,Xn=∏i=1npXi|PaXi,forXi∈Ωi,i=1,…,n,