11 Physiological Methods in Audiology
An important aspect of audiology deals with the use of physiological tests in addition to the use of behavioral measurements. Physiological measurements provide powerful diagnostic tools that supplement the information obtained from the patient interview and behavioral tests, and make it possible to test patients who are too young or otherwise incapable of responding behaviorally. This chapter introduces the student to three types of physiological assessment approaches that are used in audiology: auditory evoked potentials, otoacoustic emissions, and vestibular assessment. Acoustic immittance methods are discussed in considerable detail in Chapter 7. Physiological techniques that are no longer used (such as psychogalvanic, respiration, and cardiac responses) are not covered; but the inquisitive reader may find discussions of these methods in Bradford (1975).
In addition, the student should be aware that audiologists perform physiological monitoring during surgical procedures (ASHA 1992, 2004), and they may also use nonauditory physiological measurements such as facial nerve testing and the use of somatosensory evoked potentials. While intraoperative monitoring will not be covered at this introductory level, several lucid discussions of the topic are readily available (Dennis 1988; Beck 1993; Jacobson 1999; Møller 2000; Hall 2007; Martin & Shi 2007, 2009).
The activity of the nervous system produces electrical signals that can be picked up by electrodes placed on the head, and can then be displayed on the screen of a recording device and/or plotted on paper. A change in the activity of the nervous system occurs when it reacts to a stimulus (such as a sound). This change in neural activity also produces a change in the electrical signals picked up by the electrodes. As a result, the nervous system’s reaction to a stimulus can be seen as a change in the electrical signals that are displayed on the recording device.
These electrical responses of the nervous system that are elicited by a stimulus are called evoked potentials. When the stimulus is sound, they are called auditory evoked potentials (AEPs). These auditory evoked potentials can be used to test the integrity of the auditory system and to make inferences about hearing. One of the great advantages of AEPs is that they are usually noninvasive, almost always being measured from outside the body with electrodes on the surface of the skin. The block diagram in Fig. 11.1 illustrates a typical arrangement involved in recording AEPs from a patient. An example of an instrument used for evoked potentials testing is illustrated in Fig. 11.2. The nature and use of auditory evoked potentials have been described in considerable detail (e.g., Moore 1983; Jacobson 1985; Glattke 1993; Chiappa 1997; Burkhard & Don 2007; Burkhard, Don, & Eggermont 2007; Hall 2007; Abbas & Brown 2009; Burkhard & McNerney 2009; Don & Kwong 2009; Cacace & McFarland 2009; Cone-Wesson & Dimitrinjevic 2009; Kraus 2011; Kraus & Hornickel 2013).
Fig. 11.3 shows the three major AEPs in the form of a single composite picture. The time scale in the figure is labeled latency, which is simply the amount of time that has elapsed (or the delay) since the stimulus was presented. Each of the AEPs shown is made up of a characteristic grouping of peaks and troughs that occur within a certain range of latencies, and are for this reason identified as the short, middle, and long latency responses. Notice that a logarithmic time scale is used in this figure so that all three ranges of latencies can be shown. Each of these time frames is called a time window or epoch, and the ability to observe a given evoked potential is optimized by using the time window best suited for that response.
Fig. 11.1 Block diagram for measuring auditory evoked potentials, using the auditory brainstem response as an example.
Fig. 11.2 Example of a clinical instrument used for auditory brainstem responses and other evoked potentials tests. (Courtesy of Grason-Stadler Inc.)
We will first briefly review some aspects of electrocochleography, and then concentrate on the auditory brainstem response (ABR), which is by far the most widely used of the various auditory evoked potentials. We will then go over some aspects of the later evoked potentials.
The electrodes are usually located at some distance from the structures that produce the signals. In addition, the electrodes will pick up all electrical signals that reach them, regardless of why they are there or where they are from. This means that the recording device receives all kinds of signals from the nervous system, muscles, and other physiological sources, as well as signals from electrical sources in the environment. All of these other signals are noise. As a result, we need to extract tiny evoked potential responses from an extraordinarily noisy background; for example, the auditory brainstem response is less than 1 microvolt (µV) but the noise is often around 10 µV. This goal is accomplished with filtering, differential amplification, and averaging.
Filtering can be used to remove some low-frequency noises such as direct current (DC) signals from electronic equipment, 60 Hz hum from alternating current (AC) power sources, and background electroencephalographic (EEG) activity. Differential amplification is used to boost the level of the evoked potential response while at the same time removing noise. It involves using the signals picked up by two separate electrodes at different locations, such as the earlobe of the stimulated (ipsilateral) ear and the vertex of the skull. The differential amplifier then cancels (“rejects”) noises that are similar (“in common”) at the two electrodes. This process is called common mode rejection and is widely used in physiological measurements.
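The idea behind common mode rejection can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the signal values, sample rate, and the function name `differential_amplifier` are hypothetical, and the response is assumed to appear at only one electrode, which is not how real recordings behave.

```python
import math
import random


def differential_amplifier(active_uv, reference_uv, gain=1.0):
    """Return gain * (active - reference) for each pair of samples."""
    return [gain * (a - r) for a, r in zip(active_uv, reference_uv)]


random.seed(0)
SAMPLE_RATE_HZ = 10000  # hypothetical sampling rate
N_SAMPLES = 200

# Hypothetical noise (in microvolts) that reaches both electrodes equally:
# 60 Hz hum plus random background activity.
common_noise = [
    10.0 * math.sin(2 * math.pi * 60 * i / SAMPLE_RATE_HZ) + random.gauss(0.0, 5.0)
    for i in range(N_SAMPLES)
]

# Hypothetical tiny response, assumed here to appear only at the vertex.
response = [
    0.5 * math.sin(2 * math.pi * 500 * i / SAMPLE_RATE_HZ) for i in range(N_SAMPLES)
]

vertex = [s + n for s, n in zip(response, common_noise)]
earlobe = list(common_noise)

# Noise that is "in common" at the two electrodes cancels in the difference.
output = differential_amplifier(vertex, earlobe)
residual_noise = max(abs(o - s) for o, s in zip(output, response))
print(f"largest leftover noise: {residual_noise:.2e} microvolts")
```

In practice the noise at the two electrodes is similar rather than identical, so the cancellation is only partial.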
A ground or common electrode is also necessary, and is usually located on the mastoid of the opposite ear or at midline on the lower forehead.
Fig. 11.3 An idealized composite representation of the major auditory evoked potentials. The insert provides an expanded view of the short latency response because it occurs within a time period that is too small to be seen clearly in the main picture.
Recall that the responses we are looking for are tiny signals embedded in all kinds of noise. A great deal of noise still remains after filtering and differential amplification. Averaging is a technique that allows the responses (evoked potentials) to be extracted from the noise, and is a central principle of many physiological methods.
Averaging relies on a few fundamental notions. The first principle is that the evoked potential responses are time-locked to (or synchronized with) the stimulus. This means that the response will consistently appear in a certain way at the same latency, or point in time after the stimulus. On the other hand, noise is random. Suppose a certain evoked potential is positive in direction 3 ms after a click is presented to the ear, and is negative at a latency of 4 ms. In other words, the electrical response due to that click will be positive when measured 3 ms later and negative when measured 4 ms later. Now, suppose we present 1000 clicks, and measure the electrical signals picked up by the electrodes for a period lasting 10 ms following each one. A computer could be used to keep a “tally” of the electrical signals at 1 ms intervals after each click. More specifically, the computer will algebraically add the voltages in 1-ms intervals for all 1000 clicks. Thus, there would be an algebraic sum at latencies of 1 ms, 2 ms, 3 ms, 4 ms, etc. Even though any one of the responses might be very small, it will almost always be positive at a latency of 3 ms and negative at a latency of 4 ms.
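The tally just described can be simulated with a short Python sketch. The response size (0.5 µV), the noise level (a standard deviation of 2 µV, assumed to be what remains after filtering and differential amplification), and the helper names are hypothetical values chosen only for illustration.

```python
import random


def average_sweeps(sweeps):
    """Average the voltages across all sweeps at each latency."""
    n_sweeps = len(sweeps)
    return [sum(values) / n_sweeps for values in zip(*sweeps)]


random.seed(1)
LATENCIES_MS = list(range(1, 11))  # 1 ms steps over a 10 ms window
# Hypothetical time-locked response: positive at 3 ms, negative at 4 ms.
TRUE_RESPONSE_UV = [0.5 if t == 3 else -0.5 if t == 4 else 0.0 for t in LATENCIES_MS]
NOISE_SD_UV = 2.0  # hypothetical random noise remaining in each sweep
N_CLICKS = 1000


def record_sweep():
    """One 10 ms recording after a click: fixed response plus random noise."""
    return [r + random.gauss(0.0, NOISE_SD_UV) for r in TRUE_RESPONSE_UV]


sweeps = [record_sweep() for _ in range(N_CLICKS)]
averaged = average_sweeps(sweeps)

for latency, value in zip(LATENCIES_MS, averaged):
    print(f"{latency:2d} ms: {value:+.2f} microvolts")
```

With uncorrelated noise, the noise remaining in the average shrinks roughly in proportion to the square root of the number of sweeps, which is why so many clicks are needed.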
When these values are added algebraically (averaged) over a large number of trials (samples), a relatively large positive value will build up at 3 ms and a relatively large negative value will build up at 4 ms, as illustrated in Fig. 11.4. On the other hand, the electrical signals due to noise are random. Random events are as likely to be positive as negative, so that they will “average out” (“add up to zero” algebraically) over the long run, which is what happens when we add up a large number of samples. As a result of these principles, the averaging process causes the real response to build up because it is consistent over time, and causes the noise to “cancel out” (actually add up to zero) because it is random over time. Electrocochleography (ECochG) is the measurement of electrical potentials that are derived from the cochlear hair cells and the auditory nerve (ASHA 1987; Ruth, Lambert, & Ferraro 1988; Ferraro 2000; Hall 2007; Schoonhoven 2007; Abbas & Brown 2009). The basic methodology of ECochG involves presenting clicks to the ear and monitoring the resulting electrical responses within a time frame of ~ 5 ms after each click. Averaging the responses to a large number of stimuli results in a waveform like the one illustrated in Fig. 11.5, which will be described momentarily. Electrocochleography typically uses click stimuli, although tone bursts are also used for various applications. The reason for using transient stimuli like clicks is that many neurons must be made to fire, or discharge, at essentially the same time (synchronously) to elicit a measurable action potential. This goal is accomplished by using stimuli that have abrupt onsets, very short durations, and broad spectra, such as clicks. These characteristics enable clicks to almost simultaneously activate a large number of hair cells along the basal part of the cochlea, where the speed of the traveling wave is very fast. 
This, in turn, causes essentially simultaneous firing of the auditory nerve fibers associated with these basal turn hair cells.
Electrode location is a very important factor in ECochG because the magnitude and quality of the measured response deteriorate significantly with distance from the cochlea and auditory nerve. The highest-quality clinical responses are obtained with an electrode on the cochlear promontory. This method is called the transtympanic approach because it involves using a needle electrode that must penetrate the tympanic membrane to get to the promontory, requiring medical participation. The transtympanic approach is thus an invasive procedure, which is its principal limitation. Less pristine but perfectly usable ECochG results are obtained with the alternative, noninvasive extratympanic approach. It uses various kinds of electrodes that are placed as close as possible to the tympanic membrane in the ear canal or on the eardrum itself. The extratympanic method avoids the limitations and potential complications of piercing the eardrum, and is the most common ECochG approach used in the United States.
Fig. 11.5 An idealized electrocochleogram. Negative voltage values are plotted downward. Abbreviations: SP, summating potential; AP (N1), auditory nerve action potential; N2, second peak of auditory nerve action potential.
The electrocochleogram (also abbreviated ECochG) is shown in Fig. 11.5. It includes two major components, the summating potential (SP) derived from the cochlear hair cells, and the compound action potential (AP) of the auditory nerve. It is also possible to show a third component, the cochlear microphonic, which is an alternating current (AC) electrical response from the hair cells, although this is not always done clinically. The summating potential is a shift of the electrical baseline (a direct current, or DC, shift) that occurs when the hair cells are activated.
This is followed by activation of auditory neurons, which produces the action potential. Notice in Fig. 11.5 that the ongoing activity before the ECochG response is used as a baseline. The figure follows the convention of recording negative peaks downward, although some clinicians show negative values upward. The summating potential is usually seen as a displacement from the baseline just prior to the action potential, or as a hump on its leading edge, as shown in the figure. The AP is seen as a negative peak at a latency of roughly 1.5 ms after the click. It can include up to three peaks (N1, N2, and N3), but the term AP usually refers to just the first peak (N1) for clinical purposes. The ECochG response increases in amplitude (gets larger) and decreases in latency (occurs sooner) as the level of the stimulus is raised. In spite of this relationship, ECochG has not been found to be a reliable physiological method for estimating hearing thresholds, especially with extratympanic electrodes. On the other hand, ECochG has been shown to have at least three valuable clinical applications: (1) Electrocochleography is often used to help identify the first peak of the auditory brainstem response, which will be described in the next section. (2) It also can be used to monitor the status of the inner ear and auditory nerve during surgical procedures that place these structures at risk, such as acoustic tumor removal and endolymphatic sac surgery. (3) Electrocochleography is very useful in the diagnosis of Meniere’s disease (e.g., Ferraro & Krishnan 1997; Sass 1998; Ferraro & Tibbils 1999; Chung, Cho, Choi, & Hong 2004). In particular, an abnormally large SP/AP amplitude ratio (which is simply the SP amplitude relative to the AP amplitude) is a good indicator of Meniere’s disease, in contrast to cases of hair cell loss (which have low SP/AP ratios). 
The SP/AP ratio has roughly 60 to 70% sensitivity for the identification of Meniere’s disease, and ~ 95% specificity (e.g., Sass 1998; Chung et al 2004). A potentially more sensitive measurement for detecting endolymphatic hydrops is the SP/AP area ratio, which compares the areas of the SP and the AP instead of their amplitudes (Devaiah, Dawson, Ferraro, & Ator 2003). The area of the SP is basically its amplitude (vertical size) times its duration (horizontal size), and the area of the AP is also its amplitude times its duration. The group of waves identified as the short latency response in Fig. 11.3 was originally described in detail by Jewett, Romano, and Williston (1970), as well as by Sohmer and Feinmesser (1967). They include up to seven peaks that normally occur within ~ 8 ms following the onset of a click stimulus. It is tempting to attribute these peaks to successive neural sites along the auditory pathway. However, while it does appear that the first two peaks are produced by the auditory nerve, the subsequent peaks actually have multiple generators, meaning that they are due to the combined electrical activity of several nuclei in the auditory brainstem (Møller & Jannetta 1985; Scherg & von Cramon 1985; Moore 1987; Rudell 1987; Hall 2007; Møller 2007). These short-latency evoked potentials are generally known as the auditory brainstem response (ABR) or the brainstem auditory evoked response (BAER), and are sometimes referred to as the brainstem evoked response (BSER) or the brainstem auditory evoked potential (BAEP). The auditory brainstem response is most commonly obtained using click stimuli for the same reason described above for electrocochleography. As a result, the ABR depends to a considerable extent on the status of the basal turn of the cochlea and principally involves the high frequencies. 
While their abruptness and broad spectra make clicks optimal stimuli for eliciting synchronous neural firings, these features also cause the ABR to be lacking in the ability to test on a frequency-by-frequency basis. The ability to distinguish among frequencies is often called frequency specificity, and this lack of frequency specificity in click-evoked ABR testing must always be kept in mind.
Frequency specificity is usually achieved in ABR testing by using tone bursts instead of clicks, either alone or in combination with masking techniques.¹ A tone burst is basically a very brief tone that rapidly rises to 100% of its intended amplitude in a few periods of the fundamental frequency and then rapidly dies out after a few periods. For example, each tone burst might rise in amplitude for two periods, have one period at 100% amplitude, and then fall to zero in two periods (Fedtke & Richter 2007; ISO 389-6 2007). Tone burst ABRs are used when estimating a patient’s hearing thresholds at different frequencies, and are thus often used in the audiological assessment of babies and others who are difficult to test using behavioral methods (see Chapter 12).
¹ Various masking techniques have also been used. For example, one method masks the high frequencies so the response is more likely to come from lower frequencies (e.g., Kileny 1981). The notched-noise method masks all frequencies except a certain narrow range where there is a “hole” or “notch” in the noise; thus, the response is from the frequency range where the masking noise is missing (e.g., Stapells et al 1990; Stapells & Kurtzberg 1991). The derived band method involves combining ABRs obtained from various combinations of noises and signals; and the responses are, in effect, subtracted from one another to derive a frequency-specific response (e.g., Parker & Thornton 1978; Don et al 1979).
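The two-one-two tone burst described above can be sketched as follows. The linear rise and fall, the 48 kHz sample rate, and the function name `tone_burst` are assumptions made for illustration; the text does not specify the shape of the ramps.

```python
import math


def tone_burst(freq_hz, sample_rate_hz=48000, rise_cycles=2, plateau_cycles=1,
               fall_cycles=2):
    """Return samples of a tone burst whose envelope is defined in periods.

    The envelope rises linearly (an assumed ramp shape) for `rise_cycles`
    periods, stays at 100% for `plateau_cycles`, then falls linearly to zero.
    """
    period_s = 1.0 / freq_hz
    rise_s = rise_cycles * period_s
    plateau_s = plateau_cycles * period_s
    fall_s = fall_cycles * period_s
    total_s = rise_s + plateau_s + fall_s
    n_samples = int(round(total_s * sample_rate_hz))

    samples = []
    for i in range(n_samples):
        t = i / sample_rate_hz
        if t < rise_s:
            envelope = t / rise_s
        elif t < rise_s + plateau_s:
            envelope = 1.0
        else:
            envelope = max(0.0, (total_s - t) / fall_s)
        samples.append(envelope * math.sin(2 * math.pi * freq_hz * t))
    return samples


burst = tone_burst(1000)
duration_ms = 1000 * len(burst) / 48000
print(f"1000 Hz burst: {len(burst)} samples, {duration_ms:.1f} ms long")
```

Because the envelope is counted in periods, a lower-frequency burst lasts longer: five periods take 5 ms at 1000 Hz but 10 ms at 500 Hz.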
An international standard (ISO 389-6 2007) contains reference values that have been developed for clicks (Richter & Fedtke 2005) and tone bursts (Fedtke & Richter 2007) that are analogous to ones described for audiometers in Chapter 4. The reference value for a click and each tone burst is expressed as its peak-to-peak reference equivalent threshold sound pressure level (peRETSPL) for use with earphones, and its peak-to-peak reference equivalent threshold vibratory force level (peRETVFL) for use with bone-conduction vibrators. Examples of peRETSPLs for clicks and several commonly used tone bursts are illustrated in Table 11.1.
Another physical approach involves measuring the click’s peak sound pressure level (peakSPL). This is done by directing the click from the earphone through the appropriate calibration coupler into a precision sound level meter that is capable of measuring the true peak level of a transient signal. However, sound level meters of this type are not available in most clinical settings.
Table 11.1 Examples of peak-to-peak reference equivalent threshold sound pressure levels (peRETSPLs) for clicks and tone bursts (a)
Signal | peRETSPL in dB (re: 20 µPa), Telephonics TDH-39 supra-aural earphone | peRETSPL in dB (re: 20 µPa), Etymotic Research ER-3A insert receiver |
Clicks | 31.0 | 35.5 |
250 Hz tone bursts | 32.0 | 28.0 |
500 Hz tone bursts | 23.0 | 23.5 |
1000 Hz tone bursts | 18.5 | 21.5 |
2000 Hz tone bursts | 25.0 | 28.5 |
4000 Hz tone bursts | 27.5 | 32.5 |
a Based on ISO 389-6 (2007), Richter & Fedtke (2005), and Fedtke & Richter (2007) for 1-second trains of clicks and tone bursts repeating at a rate of 20 Hz. Correction values are required for faster and slower repetition rates.
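As a rough illustration of how reference values like those in Table 11.1 are used, the sketch below adds a level expressed in dB relative to the reference threshold to the peRETSPL to obtain a peak-to-peak equivalent SPL. The dictionary and function names are hypothetical, and the repetition-rate correction values mentioned in the table note are not modeled.

```python
# peRETSPL values in dB (re: 20 µPa), copied from Table 11.1.
PE_RETSPL_DB = {
    "click": {"TDH-39": 31.0, "ER-3A": 35.5},
    "250 Hz tone burst": {"TDH-39": 32.0, "ER-3A": 28.0},
    "500 Hz tone burst": {"TDH-39": 23.0, "ER-3A": 23.5},
    "1000 Hz tone burst": {"TDH-39": 18.5, "ER-3A": 21.5},
    "2000 Hz tone burst": {"TDH-39": 25.0, "ER-3A": 28.5},
    "4000 Hz tone burst": {"TDH-39": 27.5, "ER-3A": 32.5},
}


def level_to_pe_spl(signal, earphone, level_re_reference_db):
    """Convert a level in dB above the reference threshold to dB peSPL.

    Assumes a 20 Hz repetition rate, so no rate correction is applied.
    """
    return PE_RETSPL_DB[signal][earphone] + level_re_reference_db


# A click presented 60 dB above the reference threshold through an ER-3A.
print(level_to_pe_spl("click", "ER-3A", 60.0), "dB peSPL")
```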
It is also common to find the stimulus levels expressed in behavioral terms. A widely used approach expresses click intensity in decibels of normal Hearing Level (nHL), which is based on a local norm corresponding to 0 dB HL for each click and tone burst. The procedure involves obtaining behavioral thresholds for each signal for a group of young, normal-hearing individuals. Each person is tested to find the lowest hearing level dial setting on the ABR instrument where the clicks or tone bursts are just audible. This dial setting constitutes that person’s threshold for that click or tone burst, and the average for the group becomes 0 dB nHL.
Another behavioral approach is done on an individual-patient basis by determining the patient’s own behavioral threshold for clicks, and then expressing the click intensity in sensation level (dB SL). However, this approach is limited because it is difficult, at best, to assess hearing without knowing the physical level of the stimulus. Moreover, ABR testing is often done on patients who cannot be tested behaviorally, in which case the sensation level of the stimulus cannot be determined.
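The arithmetic behind the two behavioral scales can be sketched as follows. The dial readings and function names are hypothetical and serve only to show the calculations.

```python
from statistics import mean


def nhl_reference(dial_thresholds_db):
    """Average dial threshold of the normal-hearing group (this is 0 dB nHL)."""
    return mean(dial_thresholds_db)


def dial_to_nhl(dial_db, reference_db):
    """Express a dial setting in dB nHL relative to the local norm."""
    return dial_db - reference_db


def sensation_level(stimulus_db, patient_threshold_db):
    """Express a stimulus level in dB SL relative to the patient's own threshold."""
    return stimulus_db - patient_threshold_db


# Hypothetical click thresholds (dial settings in dB) for five normal listeners.
group_thresholds = [28, 32, 30, 34, 26]
reference = nhl_reference(group_thresholds)

print("0 dB nHL corresponds to a dial setting of", reference)
print("dial setting 90 is", dial_to_nhl(90, reference), "dB nHL")
print("a 70 dB click is", sensation_level(70, 25), "dB SL for a 25 dB threshold")
```

Both scales are differences between levels. They differ only in whose threshold serves as the zero point: the local normal group for nHL, or the individual patient for SL.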
The ABR Waveform
An idealized ABR waveform is shown in Fig. 11.6. This figure shows positive peaks that are recorded in the upward direction and numbered from I to VII, which is the most common convention in the United States. However, other recording conventions also exist. As already mentioned, waves I and II are generated by the auditory nerve and correspond to the N1 (AP) and N2 peaks of the electrocochleogram. Even though these first two peaks were plotted downward in the ECochG, they are now flipped upward so they appear in the same direction as the rest of the ABR peaks. This inversion of waves I and II occurs during the process of differential amplification, and is very convenient because it causes all of the ABR peaks to be plotted in the same direction. Before proceeding, notice that only one curve is shown. In actual practice, two sets of tracings would be done because evoked potentials must be replicated to confirm that the results obtained are real.
Fig. 11.6 An idealized auditory brainstem response (ABR). Arrows indicate the wave I, III, and V absolute latencies and amplitudes, and the interwave latencies between waves I and V, I and III, and III and V.
Clinical ABR measurements are concerned with the first five peaks (I to V), and concentrate on peaks I, III, and V. The ABR waveform is usually described and interpreted in terms of the latencies and amplitudes of these peaks, as well as by its morphology, or the overall configuration and appearance of the waveform. A given wave’s absolute latency is simply the time delay from 0 ms (when the click is presented) until its peak occurs.
Electrocochleography is sometimes used to locate wave I when it is not identifiable on the ABR. The time interval between two peaks is called an interwave latency or relative latency. Interwave latencies are usually measured between waves I and V, I and III, and III and V. These latency measurements are illustrated in the figure by the horizontal arrows. The vertical arrows show how to measure the amplitudes for waves I, III, and V. Wave V is the most prominent and robust of these peaks, and is so closely associated with wave IV that one must speak of a IV/V complex. There is, however, considerable variability in the morphology of normal ABR waveforms, particularly with respect to the configuration of the IV/V complex.
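The relationship between absolute and interwave latencies is simple subtraction, as the following sketch shows. The peak latencies are hypothetical illustrative values rather than clinical norms, and the function name is an assumption.

```python
def interwave_latencies(absolute_ms):
    """Compute interwave (relative) latencies from absolute latencies in ms.

    `absolute_ms` maps the wave labels "I", "III", and "V" to the time from
    stimulus onset (0 ms) to each peak.
    """
    wave_i, wave_iii, wave_v = (absolute_ms[name] for name in ("I", "III", "V"))
    return {
        "I-III": round(wave_iii - wave_i, 2),
        "III-V": round(wave_v - wave_iii, 2),
        "I-V": round(wave_v - wave_i, 2),
    }


# Hypothetical absolute latencies for one ear (illustration only).
peaks_ms = {"I": 1.6, "III": 3.7, "V": 5.6}
intervals = interwave_latencies(peaks_ms)
for name, value in intervals.items():
    print(f"{name} interwave latency: {value} ms")
```

The I-V interval is always the sum of the I-III and III-V intervals, so any two of the three determine the third.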
The ABR waveforms shown in the two previous figures were all obtained with clicks presented at fairly high intensity levels. They would have looked different if the clicks had been presented at progressively lower intensities, and would eventually disappear when the clicks went below threshold. In other words, the characteristics of the ABR depend on the level of the stimulus.
Fig. 11.7 shows a series of ABR results obtained from a normal individual with clicks presented at 80, 60, 40, and 20 dB nHL. Notice that the characteristics of the ABR waveform change considerably as the intensity of the clicks is decreased. Another series of ABR waveforms is shown across the upper part of Fig. 11.8, where intensity increases from left to right along the x-axis. As the stimulus intensity gets lower, the peak latencies become longer and their amplitudes become smaller. The latency shift is seen most vividly by the rightward shift of wave V as the intensity drops progressively from 80 dB nHL down to 20 dB nHL in Fig. 11.7. Also, the earlier peaks become less distinctive and eventually disappear with progressively lower stimulus levels. Even though wave V becomes progressively smaller and later with decreasing intensity, it is generally still discernible at levels as low as the behavioral threshold for the click, which is typically down to 0 dB SL or 0 dB nHL, or roughly 35 dB peakSPL, for a normal person. The ABR is finally undetectable at levels below the behavioral threshold. Fig. 11.8 also shows a graph that plots the manner in which wave V latency changes as a function of stimulus (click) level. Such a graph is called a latency-intensity function, and it reveals that latency decreases as stimulus intensity increases.
Fig. 11.7 A series of click-evoked auditory brainstem responses from a normal adult obtained at various stimulus levels. Wave V is indicated on each tracing. (From Arnold [2000], with permission.)
Clinical Use of the Auditory Brainstem Response
The ABR is a very valuable clinical tool for several reasons. (1) Auditory brainstem responses are measurable in everyone who is normal, including newborns. (2) The ABR is not affected by the patient’s state of arousal, or by the use of sedation or anesthesia. As a result, ABR testing can be done with or without the cooperation of the patient, and even when the patient is unconscious or under general anesthesia. The ability to perform the ABR on a patient under sedation makes it possible to assess young and/or difficult-to-test children who could not otherwise be evaluated. It should be stressed, of course, that sedation is a medical responsibility. In addition, the ABR is also used in intraoperative monitoring during surgical procedures that jeopardize the eighth nerve, such as acoustic tumor removal. (3) The ABR can be used to assess hearing because it is affected by hearing loss. (4) Different abnormalities affect the ABR in different ways, so that it can be used for differential diagnosis.
Just because the ABR is ubiquitous does not mean that its characteristics are the same for everyone. On the contrary, maturation, gender, and aging need to be considered when developing norms and interpreting the results.
The ABR is present but not adult-like in newborns, and its character changes with the infant’s maturation (Hecox & Galambos 1974; Fria 1980; Chiappa 1997; Hurley, Hurley, & Berlin 2005; Hall 2007; Sininger 2007). For example, waves I, III, and V are observable in newborns, but the absolute latencies of waves III and V are prolonged relative to adult values, as are the interwave latencies. As the infant matures, the other peaks emerge, the latencies of the waves shorten, and their amplitudes change, eventually achieving adult characteristics by roughly 18 months of age.
Among adults, the ABR is affected by gender and aging (Stockard, Stockard, Westmoreland, & Corfits 1979; Jerger & Hall 1980; Jerger & Johnson 1988). Compared with males, females tend to have shorter absolute latencies and larger amplitudes for waves III, IV, and V, as well as shorter interwave latencies. Also, it appears that the degree of cochlear impairment has a greater effect on wave V latencies for men than for women (Jerger & Johnson 1988). The effect of aging is not as clear-cut, but absolute latencies appear to become slightly longer with advancing age.
Fig. 11.8 A latency-intensity function corresponding to the series of click-evoked ABRs shown in Fig. 11.7. The waveforms are superimposed on the graph to highlight the manner in which waveform morphology, latency, and intensity change with the level of the stimulus. (Adapted from Arnold SA [2007]. The auditory brain stem response. In: Roeser RJ, Valente M, Hosford-Dunn H, eds. Audiology Diagnosis, 2nd ed. New York, NY: Thieme; 426–442.)
The earlier discussion about how the ABR is affected by stimulus intensity also reveals the procedure for estimating a patient’s thresholds with the ABR. The basic procedure is to obtain a series of ABRs at progressively lower intensities until the level is reached where a replicable response is no longer discernible. This usually involves finding the lowest level where wave V can be identified. Normal hearing is implied when a response can be identified at stimulus levels as low as ~ 0 dB nHL. The word implied is used because physiological measures are not direct hearing tests in the sense that they do not reveal whether the patient is able to respond to sounds behaviorally. Rather, they are tests of the integrity of the structures and processes involved in hearing, and in the case of the ABR, these responses are coming from just the lower portions of the auditory pathways. In spite of these caveats, it is clear that ABR thresholds provide valuable information about the hearing of patients who cannot respond behaviorally, such as infants and the difficult-to-test. In fact, the ABR is widely used for infant screening purposes as well as for the diagnostic assessment of this population (Chapters 12 and 13). Click ABR thresholds are related to high-frequency behavioral thresholds. However, the tone burst ABR is used to estimate patients’ hearing sensitivity because it provides frequency-specific thresholds. Tone burst ABR thresholds are typically within ~ 10 to 15 dB of behavioral thresholds at various audiometric frequencies, and correction factors are often applied to the ABR thresholds to achieve better estimates (Stapells, Gravel, & Martin 1995; Stapells 2000; Gorga et al 2006; Rance, Tomlin, & Rickards 2006; Sininger, Abdala, & Cone-Wesson 1997; Vander Werff, Prieve, & Georgantas 2009).
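The descending threshold search described above can be sketched as follows. The test levels, the replication results, and the 10 dB correction are hypothetical, and subtracting the correction is an assumed sign convention: the text says only that correction factors are often applied.

```python
def abr_threshold(replicable_by_level):
    """Return the lowest level (dB nHL) with a replicable response.

    `replicable_by_level` maps each tested level to True when wave V was
    identified in replicated tracings. The search starts at the highest level
    and stops at the first level where no replicable response is seen.
    """
    threshold = None
    for level in sorted(replicable_by_level, reverse=True):
        if not replicable_by_level[level]:
            break
        threshold = level
    return threshold


def estimated_behavioral_threshold(abr_threshold_db, correction_db):
    """Apply a caller-supplied correction (assumed to be subtracted)."""
    return abr_threshold_db - correction_db


# Hypothetical results for one ear at one tone burst frequency.
results = {80: True, 60: True, 40: True, 20: False}
threshold = abr_threshold(results)
print("ABR threshold:", threshold, "dB nHL")
print("estimate with a hypothetical 10 dB correction:",
      estimated_behavioral_threshold(threshold, 10), "dB")
```

A real protocol would use smaller steps near threshold and replicate each tracing before counting a response as present.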
An individual patient’s latency-intensity functions are compared with normative values to make inferences about the nature and degree of her hearing loss. Several examples are shown in Fig. 11.9. The normal range of wave V latencies is shown by a pair of curved lines in each panel. Each facility should develop a set of normal ranges that applies to its own equipment and procedures. The leftmost symbol of each individual latency-intensity function is obtained from the lowest click level that produces a replicable ABR, and therefore also represents the threshold for clicks. These points occur at 5 dB nHL for the normal case and 45 dB nHL for conductive loss in the upper panel of the figure, and at 30, 35, and 45 dB nHL for the examples of cochlear impairments in the lower panel.
The typical wave V latency-intensity functions associated with normal hearing, conductive losses, and sensorineural losses of cochlear origin are quite different. As we would expect, the normal latency-intensity function falls within the normal limit lines (orange circles in Fig. 11.9a). Conductive losses reduce the amount of signal intensity reaching the cochlea. For this reason, they tend to have latency-intensity functions that are essentially displaced horizontally to the right (higher click levels) by roughly the amount of the conductive loss (green circles in Fig. 11.9a). On the other hand, cochlear impairments typically have wave V latencies that are elevated at and slightly above threshold, and then converge to the normal latency range as the click intensity is raised (circles in Fig. 11.9b). However, whether this pattern occurs depends on the configuration of the hearing loss because the ABR relies heavily on the basal (high-frequency) portion of the cochlea (Yamada, Kodera, & Yagi 1979). Notice in Fig. 11.9b that the wave V latency-intensity function is abnormal when the sensorineural hearing loss involves the high frequencies (circles), but can actually be within the normal range in cases of low frequency and relatively mild flat sensorineural loss, where the high frequencies are preserved (triangles).
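The horizontal displacement associated with conductive loss can be illustrated with a short sketch. The normal latency-intensity values below are hypothetical, not clinical norms, and the function names and linear interpolation are assumptions made for illustration.

```python
def shift_for_conductive_loss(normal_function, loss_db):
    """Displace a latency-intensity function toward higher click levels."""
    return {level + loss_db: latency for level, latency in normal_function.items()}


def level_for_latency(function, latency_ms):
    """Interpolate the click level at which the function reaches a latency."""
    points = sorted(function.items())  # ascending level, decreasing latency
    for (level0, lat0), (level1, lat1) in zip(points, points[1:]):
        if lat1 <= latency_ms <= lat0:
            if lat0 == lat1:
                return level0
            return level0 + (lat0 - latency_ms) * (level1 - level0) / (lat0 - lat1)
    return None  # latency lies outside the range covered by the function


def horizontal_shift_db(normal_function, patient_level_db, patient_latency_ms):
    """Estimate how far a patient's point lies to the right of the normal curve."""
    normal_level = level_for_latency(normal_function, patient_latency_ms)
    if normal_level is None:
        return None
    return patient_level_db - normal_level


# Hypothetical normal wave V latencies (ms) at several click levels (dB nHL).
NORMAL = {20: 7.6, 40: 6.6, 60: 6.0, 80: 5.6}

shifted = shift_for_conductive_loss(NORMAL, 40)
print("function displaced by 40 dB:", shifted)
print("estimated shift:", horizontal_shift_db(NORMAL, 80, 6.6), "dB")
```

This works only for the conductive pattern. Cochlear functions converge toward the normal range at high levels, so a single horizontal shift does not describe them.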
Fig. 11.9 Representative clinical wave V latency-intensity functions. The paired curved lines in each panel are normal confidence limits. (a) Results for a normal ear and a case of conductive loss. Notice the conductive function is shifted to the right, represented by the arrow. (b) Functions for cochlear sensorineural loss generally converge toward the normal latency range as click intensity is raised, but this is affected by the shape and degree of loss.
The different latency-intensity functions associated with conductive and cochlear impairments allow the ABR to help us discriminate between these two kinds of hearing losses. However, conductive losses may have to exceed 35 dB to be reliably distinguished from sensorineural impairments with the ABR (van der Drift, Brocaar, & van Zanten 1988a,b).
Another way to use the ABR to identify the type of loss is to compare the results obtained when the clicks are presented by air-conduction versus bone-conduction. When bone-conduction ABRs are done, one must be mindful that (1) the highest usable click levels for bone-conduction ABR testing are limited to ~ 50 dB nHL, (2) bone-conduction wave V latencies are roughly 0.5 ms longer than they are by air-conduction, and (3) appropriate masking of the contralateral ear should be employed (Mauldin & Jerger 1979; Weber 1983; Schwartz, Larson, & De Chicchis 1985; Gorga & Thornton 1989).
Because the ABR reflects the activity of the auditory nerve and brainstem pathways, it is not surprising that it can be used to identify retrocochlear pathologies like acoustic tumors. Identifying retrocochlear disorders from the ABR, which has an overall sensitivity of ~ 95% (Turner, Shepard, & Frazer 1984), involves interpreting peak measurements and waveform morphology. The following ABR findings are associated with retrocochlear abnormalities, several of which may be found in Fig. 11.10:
• Prolonged latency for wave V.
• Prolonged interwave latency for I to V (as well as for I to III and/or III to V).
• Interaural latency differences. Significant differences between the patient’s two ears are considered for both the wave V latency and the interwave latencies.
• Absence of the later waves.
• Absence of an ABR even though hearing is normal or only mildly impaired.
• An ABR waveform that is not replicable.
• Abnormally low V:I amplitude ratio. The V:I amplitude ratio is simply the amplitude of wave V over the amplitude of wave I, and is expected to be ≥ 1.0 because wave V is normally larger. One becomes suspicious of a retrocochlear disorder when the V:I ratio is less than 1.0, but this criterion is not as sensitive as the latency measurements.
• Significant shifting of wave V latency when the clicks are presented at a faster rate, although the usefulness of this criterion is controversial.
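Several of the measurements in the list above are simple arithmetic on the peak latencies and amplitudes read from the waveform. The sketch below illustrates them with hypothetical function names. Only the V:I criterion of 1.0 comes from the text; what counts as a significant latency prolongation or interaural difference depends on clinical norms that are not assumed here.

```python
def interwave_latency(earlier_peak_ms, later_peak_ms):
    """Interwave latency, e.g., I to V is wave V latency minus wave I latency."""
    return later_peak_ms - earlier_peak_ms


def interaural_difference(left_ear_ms, right_ear_ms):
    """Absolute difference between the two ears for the same latency measure."""
    return abs(left_ear_ms - right_ear_ms)


def v_to_i_ratio(wave_v_amplitude, wave_i_amplitude):
    """Amplitude of wave V over the amplitude of wave I (same units for both)."""
    if wave_i_amplitude <= 0:
        raise ValueError("wave I amplitude must be positive")
    return wave_v_amplitude / wave_i_amplitude


def ratio_raises_suspicion(ratio):
    """True when the V:I ratio is below the expected value of 1.0."""
    return ratio < 1.0
```

For instance, a wave V that is half the size of wave I gives a ratio of 0.5, which would raise suspicion, although the text notes that this criterion is less sensitive than the latency measurements.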
An absent or grossly abnormal ABR with normal outer hair cell functioning demonstrated by otoacoustic emissions and/or cochlear microphonics (discussed elsewhere in this chapter) is also associated with auditory neuropathy spectrum disorder (e.g., Starr, Picton, Sininger, Hood, & Berlin 1996; Hood 2007; see Chapter 6).
Although standard ABR testing has excellent sensitivity for acoustic tumors overall, it has been disappointing when trying to identify small tumors (those ≤ 1 cm in size; e.g., Chandrasekhar, Brackmann, & Devgan 1995; Schmidt, Sataloff, Newman, Spiegel, & Myers 2001). However, an approach called the stacked auditory brainstem response (stacked ABR) has been quite successful at identifying these small tumors even when they are missed by the standard ABR (e.g., Don, Masuda, Nelson, & Brackmann 1997; Don & Kwong 2002; Don, Kwong, Tanaka, Brackmann, & Nelson 2005). For example, Don et al (2005) found that the stacked ABR had 95% sensitivity for a group of 54 patients with acoustic tumors that were missed by the standard ABR or were ≤ 1 cm in size, as well as 88% specificity for 78 tumor-free control subjects with normal hearing. The stacked ABR method uses the derived-band approach described earlier to obtain several ABRs derived from different frequency locations along the cochlea. These derived ABR waveforms are then shifted in time so that their wave V peaks are aligned, and summed to arrive at the stacked ABR. The wave V amplitude of this stacked response is then measured and compared with normative values.
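The align-and-sum step of the stacked ABR can be sketched in a few lines. This is an illustration only, not the published procedure: it assumes each derived-band waveform is a list of samples at a common sampling rate, and that the sample index of each wave V peak has already been identified. The function name and arguments are hypothetical.

```python
def stack_abr(derived_waveforms, wave_v_indices):
    """Align derived-band ABRs at their wave V peaks and sum them.

    derived_waveforms: list of sample lists, one per derived band.
    wave_v_indices: sample index of the wave V peak in each waveform
        (assumed to have been picked beforehand).
    Returns the stacked waveform and the index of the aligned wave V peak.
    """
    if not derived_waveforms or len(derived_waveforms) != len(wave_v_indices):
        raise ValueError("need one wave V index per derived-band waveform")
    for waveform, peak in zip(derived_waveforms, wave_v_indices):
        if not 0 <= peak < len(waveform):
            raise ValueError("wave V index falls outside its waveform")

    # Align everything to the earliest wave V peak by dropping leading samples.
    reference = min(wave_v_indices)
    shifts = [peak - reference for peak in wave_v_indices]

    # Keep only the span that every shifted waveform covers.
    common_length = min(len(w) - s for w, s in zip(derived_waveforms, shifts))

    # Sum the aligned waveforms sample by sample.
    stacked = [
        sum(w[s + k] for w, s in zip(derived_waveforms, shifts))
        for k in range(common_length)
    ]
    return stacked, reference
```

Because the wave V peaks coincide after shifting, they add constructively in the sum. How the wave V amplitude of the stacked response is then measured, and which normative values it is compared with, are left to the clinical protocol.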
Later Auditory Evoked Potentials
The ABR is by far the most widely used class of auditory evoked potentials among audiologists, sometimes to the exclusion of all others. Yet other kinds of AEPs are not only available but also provide information not obtainable with the ABR. Fig. 11.3 identifies these as the middle latency response (MLR) and the long latency response (LLR) (McPherson & Ballachanda 2000; Hall 2007; Pratt 2007; Martin, Tremblay, & Stapells 2007; Cacace & McFarland 2009). The major advantage of these responses is that they can provide frequency-specific information about hearing sensitivity. Their major disadvantage is that they are significantly affected by the state of the patient and are altered or obliterated by drugs (including sedatives and anesthetics), which curtails their usefulness with young children and other difficult-to-test patients.
The middle latency response is a series of negative (N) and positive (P) waves occurring at latencies between 15 and 50 ms, identified as Na, Pa, Nb, and Pb (Fig. 11.3). It appears to reflect neural activity originating from several cortical and subcortical locations involving the midbrain, reticular formation, and the thalamocortical pathways. The principal clinical contribution of the MLR is that it can be elicited by relatively low-frequency tone bursts, such as 500 or 1000 Hz, which would not be successful stimuli with the ABR. As a result, the MLR can be used successfully to assess low-frequency hearing sensitivity. It is also useful in the diagnosis of central auditory nervous system abnormalities.
Fig. 11.10 Two examples of abnormal ABR results in cases of retrocochlear pathology. (Adapted from ASHA [1987], with permission of American Speech-Language-Hearing Association.)