Possibilities for a Closer Mimicking of Normal Auditory Functions with Cochlear Implants

5




Blake S. Wilson, Reinhold Schatzer, and Enrique A. Lopez-Poveda


Recent advances in electrode and stimulus design have increased the level of control that implants can exert over spatial and temporal patterns of responses in the auditory nerve. The advances include perimodiolar placements of electrodes, use of high-rate carriers or high-rate conditioner pulses, and current steering to produce “virtual channels” or intermediate sites of stimulation between adjacent electrodes. All but the last of these are reviewed in Wilson et al (2003a). Virtual channels and their construction are described in Wilson et al (1994b), and later in this chapter.


The higher levels of neural control might be exploited to provide a closer mimicking with implants of the signal processing that occurs in the normal cochlea. In particular, the subtleties of the normal processing might be represented at the auditory nerve using high-rate carriers, spatially selective electrodes, virtual channels, or combinations of these.


Present processing strategies for implants, such as the continuous interleaved sampling (CIS) strategy shown in the top panel of Fig. 5–1, provide only a very crude approximation to the normal processing. For example, a bank of linear band-pass filters is used instead of the highly nonlinear and coupled filters that would model the behavior of the basilar membrane (BM) and associated structures (e.g., the outer hair cells) in the intact cochlea. In addition, a single nonlinear mapping function is used in the CIS and other strategies to produce the overall compression (from the dynamic range of sound pressure variations to the dynamic range of stimuli for single neurons) that the normal system achieves in multiple steps. The compression in CIS and other processors is instantaneous, whereas compression at the synapses between inner hair cells (IHCs) and single fibers of the auditory nerve in the normal cochlea is noninstantaneous, with large adaptation effects.
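As a rough illustration of the envelope detection and instantaneous compressive mapping in a single CIS channel, the following Python sketch may help. It assumes that band-pass filtering has already been applied. The full-wave rectifier and low-pass smoother follow the blocks named in Fig. 5–1, but the one-pole form of the smoother, the logarithmic form of the mapping function, and all names and parameter values are illustrative assumptions, not details taken from this chapter.

```python
import math

def envelope(band_signal, smoothing=0.1):
    """Rectify a band-passed signal and smooth it with a one-pole low-pass filter."""
    env, state = [], 0.0
    for x in band_signal:
        rectified = abs(x)                        # full-wave rectification
        state += smoothing * (rectified - state)  # one-pole low-pass (assumed form)
        env.append(state)
    return env

def compress(env_value, in_min, in_max, out_min, out_max):
    """Instantaneous logarithmic map from envelope level to pulse amplitude (assumed form)."""
    clipped = min(max(env_value, in_min), in_max)
    fraction = math.log(clipped / in_min) / math.log(in_max / in_min)
    return out_min + fraction * (out_max - out_min)

# Hypothetical band signal: a 1 kHz tone sampled at 16 kHz
band = [math.sin(2 * math.pi * 1000 * n / 16000) for n in range(320)]
env = envelope(band)
# Hypothetical input range (0.001 to 1.0) and output range (100 to 800, arbitrary units)
pulse_amplitudes = [compress(e, 0.001, 1.0, 100.0, 800.0) for e in env]
```

Because the mapping in this sketch depends only on the current envelope value, it has no memory, which is the sense in which the compression in CIS is instantaneous.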


Such differences between normal processing and what current implants provide may limit the perceptual abilities of implant patients. For example, Deng and Geisler (1987), among others, have shown that nonlinearities in filtering at the BM and associated structures greatly enhance the neural representation of speech sounds presented in competition with noise. Similarly, findings of Tchorz and Kollmeier (1999) have indicated the importance of adaptation at the IHC/neuron synapse in representing temporal events or markers in speech, especially for speech presented in noise. Reception of sounds more complex than speech, for example, symphonic music, may require the full interplay and function of the many processing steps in the normal auditory periphery.


A thorough discussion of the intricacies of signal processing in the normal cochlea is presented in Wilson et al (2003a). This discussion also includes a detailed description of how current processing strategies for implants fail to reproduce or replicate many aspects of the normal processing.


This chapter suggests a general approach for moving implants toward normal processing and describes the first steps in developing this approach. In addition, preliminary data are presented that show promise for the approach and some of the tested variations.




Figure 5–1 Two approaches to speech processor design. Top panel: A block diagram of a standard continuous interleaved sampling (CIS) design. Bottom panel: A block diagram of a new approach aimed at providing a closer mimicking of processing in the normal cochlea. Possible models that could be utilized in a “closer-mimicking” processor are listed beneath the corresponding blocks. BPF, band-pass filter; EL, electrode; IHC, inner hair cell; LPF, low-pass filter; Pre-emp., preemphasis filter; Rect., rectifier. [Top panel: adapted from Wilson BS, Finley CC, Lawson DT, Wolford RD, Eddington DK, Rabinowitz WM. (1991). Better speech recognition with cochlear implants. Nature 352:236–238, with permission of the Nature Publishing Group.]


A General Approach for Closer Mimicking


A block diagram of the overall approach just mentioned is presented in the bottom panel of Fig. 5–1. The idea is to use better models of the normal processing, whose outputs may be fully or largely conveyed through the higher levels of neural control now available with implants.


Comparison of the top and bottom panels in Fig. 5–1 shows that in the new structure a model of nonlinear filtering is used instead of the bank of linear filters, and a model of the IHC membrane and synapse is used instead of an envelope detector and nonlinear mapping table. Note that the mapping table is not needed in the new structure, because the multiple stages of compression implemented in the models should provide the overall compression required for mapping the wide dynamic range of processor inputs onto stimulus levels appropriate for neural activation. (Some scaling may be needed, but the compression functions should be at least approximately correct.) The compression achieved in this way would be much more analogous to the way it is achieved in normal hearing.


Conditioner pulses or high carrier rates may be used if desired, to impart spontaneous-like activity in auditory neurons and stochastic independence among neurons (Rubinstein et al, 1999; Wilson et al, 1997). This can increase the dynamic range of auditory neuron responses to electrical stimuli, bringing it closer to that observed for normal hearing using acoustic stimuli. Stochastic independence among neurons also may be helpful in representing rapid temporal variations in the stimuli at each electrode, in the collected (ensemble) responses of all neurons in the excitation field (e.g., Parnas, 1996; Wilson et al, 1997).


Spontaneous activity and stochastic independence among neurons are among the attributes of normal hearing that are not reproduced using standard strategies and parameter choices for cochlear implants. Reinstating these attributes to the extent possible may be helpful.


The approach illustrated in the bottom panel of Fig. 5–1 is intended as a move in the direction of closer mimicking. It does not include feedback control from the central nervous system, and it does not include a way to stimulate fibers close to an electrode differentially, the latter of which would be required to mimic the distributions of thresholds and dynamic ranges of the multiple neurons innervating each IHC in the normal cochlea. However, it does have the potential to reproduce or approximate other important aspects of the normal processing, including: (1) details of filtering at the BM and associated structures, and (2) noninstantaneous compression and adaptation at the IHCs and their synapses.


Implementations of “Closer-Mimicking” Processors


Studies are underway in our laboratories to evaluate various implementations of processors based on the general approach outlined above. We are proceeding in steps, including: (1) substitution of a bank of dual-resonance, nonlinear (DRNL) filters (Lopez-Poveda and Meddis, 2001; Meddis et al, 2001) for the bank of linear filters used in a standard CIS processor; (2) substitution of the Meddis IHC model (Meddis, 1986, 1988) for the envelope detector and for some of the compression ordinarily provided by the nonlinear mapping table in a standard CIS processor; and (3) combinations of (1) and (2) and fine-tuning of the interstage gains and amounts of compression at various stages. Work thus far has focused on implementation and evaluation of processors using DRNL filters (step 1). For those processors, the envelope detectors and nonlinear mapping tables are retained, but the amount of compression provided by the tables is greatly reduced as substantial compression is provided by the DRNL filters. The DRNL filters have many parameters whose adjustment may affect performance. We have started with a set of parameter values designed to provide approximately uniform compressions at the most responsive frequencies (nominal “center frequencies”) of the different filters. This choice departs from the highly nonuniform compression across frequencies described in Lopez-Poveda and Meddis (2001), but corresponds to more recent findings (Lopez-Poveda et al, 2003; Williams and Bacon, 2005).
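To make the idea of compression inside the filter stage more concrete, the sketch below shows a memoryless simplification of a dual-path arrangement: a linear path summed with a compressive nonlinear path. It assumes a "broken-stick" compression of the form sign(x)·min(a|x|, b|x|^c), which is how the compressive stage of DRNL-type models is commonly written. The band-pass and low-pass filter cascades in each path are omitted entirely, and the function names and parameter values are hypothetical, so this is not the filter bank used in the studies described here.

```python
import math

def broken_stick(x, a, b, c):
    """Memoryless compression: linear at low levels, power-law at high levels."""
    magnitude = min(a * abs(x), b * abs(x) ** c)
    return math.copysign(magnitude, x)

def drnl_static(x, linear_gain, a, b, c):
    """Sum of a linear path and a compressive nonlinear path (filters omitted)."""
    return linear_gain * x + broken_stick(x, a, b, c)

# Hypothetical parameter values, chosen only to make the two regimes visible
A, B, C, G = 100.0, 1.0, 0.25, 1.0
for level in (1e-5, 1e-3, 1e-1):
    print(level, drnl_static(level, G, A, B, C))
```

With these illustrative values, the nonlinear path grows linearly for very small inputs and switches to power-law (compressive) growth once a|x| exceeds b|x|^c.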


We also have begun to explore effects produced by manipulations in the parameters from the above starting point. For example, we have adjusted parameters to produce a broader tuning for each of the filters, so that their responses overlap at least to some extent across channels.


In general, the frequency responses of the DRNL filters are much sharper than those of the Butterworth filters used in standard CIS processors, at least for six to 12 channels of processing and stimulation and at least for low-to-moderate input levels. Thus, if one simply substitutes DRNL filters for the Butterworth filters without alteration, then substantial gaps will be introduced in the represented spectra of lower-level inputs to the filter bank. Such a “picket fence” effect might degrade performance, even though other aspects of DRNL processing may be beneficial.


Initial Studies with Dual-Resonance, Nonlinear Filters and “n-to-m” Approaches


Studies to date have included evaluation of DRNL-based processors with broadened filters, as noted above. In addition, we have tested n-to-m constructs, in which more than one channel of DRNL processing is assigned to each stimulus site. In one variation, the average of outputs from the multiple channels is calculated and then that average is used to determine the amplitude of a stimulus pulse for a particular electrode. Each DRNL channel includes a DRNL filter, an envelope detector, and a lookup table for compressive mapping of envelope levels onto pulse amplitudes. Thus, the average is the average of mapped amplitudes for the number of DRNL channels assigned to the electrode. We call this the “average n-to-m approach,” in which m is the maximum number of electrodes available in the implant and in which n is the total number of DRNL channels, an integer multiple of m. In another variation, the maximum among outputs from the channels for each electrode is identified and then that maximum is used to determine the amplitude of the stimulus pulse. We call this the “maximum n-to-m approach.” Both approaches are designed to retain the sharp tuning of DRNL filters using the standard (starting) parameters, while minimizing or eliminating the “picket fence” effect.
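The two combination rules can be sketched in a few lines of Python. The sketch assumes that the n mapped channel amplitudes are ordered by frequency and that each electrode receives a contiguous group of n/m channels; the function name, the grouping rule, and the example values are hypothetical.

```python
def n_to_m(mapped_amplitudes, m, mode="average"):
    """Combine n mapped DRNL channel amplitudes into m electrode amplitudes."""
    n = len(mapped_amplitudes)
    if n % m != 0:
        raise ValueError("n must be an integer multiple of m")
    per_site = n // m
    combined = []
    for site in range(m):
        # Contiguous group of channels assigned to this electrode (assumed grouping)
        group = mapped_amplitudes[site * per_site:(site + 1) * per_site]
        if mode == "average":
            combined.append(sum(group) / per_site)
        elif mode == "maximum":
            combined.append(max(group))
        else:
            raise ValueError("mode must be 'average' or 'maximum'")
    return combined

# Hypothetical mapped amplitudes for n = 6 channels and m = 3 electrodes
channels = [10.0, 30.0, 50.0, 20.0, 5.0, 45.0]
print(n_to_m(channels, 3, "average"))  # [20.0, 35.0, 25.0]
print(n_to_m(channels, 3, "maximum"))  # [30.0, 50.0, 45.0]
```

In either mode, every electrode receives a contribution from several sharply tuned channels, which is how the spectral gaps between single filters are covered.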


These n-to-m approaches are illustrated in Fig. 5–2. As shown, the spectral gaps or “picket fence” effect produced by assigning only one DRNL filter (or channel) to each stimulus site (top panel) is reduced or largely eliminated with the average n-to-m or maximum n-to-m approaches (middle and bottom panels).


Results from tests with seven subjects indicate that the n-to-m approaches can be helpful (Schatzer et al, 2003). In particular, use of these approaches produced significant increases in speech reception scores in some cases, compared with processors that simply assigned the output of each DRNL channel to a single corresponding stimulus site. [These subjects all used bilateral Med-El (Innsbruck, Austria) implants, with a maximum of eight or 12 stimulus sites on each side, depending on the particular implant device, either the Combi 40 with eight sites or the Combi 40+ with 12 sites.] Improvements for speech reception in noise were generally larger than improvements for speech reception in quiet conditions. In a few cases where comparisons were made, the maximum n-to-m approach was better than the average n-to-m approach. The best of the DRNL processors using an n-to-m approach produced speech reception scores that were as good as, but not better than, control CIS processors using m channels (with standard Butterworth filters) and m sites of stimulation.


We regarded this as an encouraging result: the new processing strategy immediately matched the performance of the control processors, even though the subjects had little or no experience with it. In many prior studies, we and others (e.g., Tyler et al, 1986) have found that such an initial equivalence can be followed by much better performance with the new strategy once subjects gain some experience with it.




Figure 5–2 Illustration of n-to-m approaches for combining dual-resonance, nonlinear (DRNL) channel outputs. Only the DRNL filter outputs are shown here for simplicity. In actual processor implementations, effects of envelope detection and compressive mapping would be included, as described in the text. Top panel: A 1-to-1 assignment of filter outputs to 11 intracochlear electrodes. Middle panel: Average (medium lines) and maximum (thick lines) 22-to-11 approaches for combining the outputs of 22 DRNL filters (thin lines) and directing the combinations to 11 electrodes. Bottom panel: An expanded display that includes only four of the filters and the average and maximum combinations of their outputs. BM, basilar membrane; SPL, sound pressure level. (Note that the y-axis scale also is expanded in the bottom panel.)


At the same time, we recognized several possibilities for improvement in the design of processors using DRNL filters that might produce even higher levels of initial performance. Those possibilities included: (1) further adjustment and testing of the many parameter values in DRNL filters; (2) combination or selection of DRNL filter outputs, rather than the DRNL channel outputs, in designs using n-to-m approaches; and (3) use of the same number of filters as stimulus sites, but with a high number of stimulus sites.


The first of these possibilities recognizes that the parametric space within and across DRNL filters is quite large. We have just begun to explore this space.


The second possibility recognizes that considerable distortions and complexities may be produced in combining or selecting signals that have been altered by a highly nonlinear mapping function, in addition to the nonlinearities of the DRNL filters. Combination or selection of the filter outputs, prior to envelope detection and (further) nonlinear processing, might be better than combination or selection following all of these operations.


The third possibility might retain the likely advantages of DRNL filters (that may result from compression and nonlinear tuning, as in normal hearing) while not discarding or distorting information as is inherent in the n-to-m approaches. The spectral gap problem would be handled through the use of a high number of stimulus sites, rather than with one of the n-to-m approaches.


Combined Use of Dual-Resonance, Nonlinear Filters and Virtual Channels


In more recent studies (Wilson et al, 2003b), we compared three basic processor designs in tests with a user of the Ineraid device, which includes a percutaneous connector and six intracochlear electrodes. (The Ineraid was previously manufactured by Symbion, Inc., of Salt Lake City, UT, and then by Smith & Nephew Richards, Inc., of Bartlett, TN; it is no longer manufactured.) One design was a processor using 24 DRNL channels mapped to the six electrodes using a maximum 24-to-6 approach, as described above and in greater detail in Schatzer et al (2003). The parameter choices used for the DRNL filters included a set to provide a flat frequency response across the spectrum spanned by all the filters, which was from 350 to 7000 Hz, as also described in Schatzer et al (2003). This processor is referenced in the remainder of this chapter as the “cp CIS” processor (“cp” refers to a DRNL filter bank that is designed to provide a close replication of responses to sound at the cochlear partition). Other aspects of the processor, such as the interlacing of stimuli across electrodes, are the same as in the standard CIS strategy (this strategy is described in greater detail in Wilson et al, 1991).


The two other processor designs employed virtual channels as a way to increase the number of discriminable stimulus sites beyond the number of actual electrodes. This concept was introduced by our team in the early 1990s (Wilson et al, 1992, 1993, 1994a,b), and has since been investigated by others (Donaldson et al, 2004; Litvak et al, 2003; Poroy and Loizou, 2001). In the reports by Donaldson et al and Litvak et al, the term current steering is used instead of the term virtual channels to reference the same concept.


A series of diagrams illustrating the construction of virtual channels is presented in Fig. 5–3. With virtual channels (or current steering), adjacent electrodes may be stimulated simultaneously to shift the perceived pitch in any direction with respect to the percepts elicited with stimulation of one of the electrodes only. Results from studies with implant subjects indicate that pitch can be manipulated through various choices of simultaneous and single-electrode conditions (e.g., Wilson et al, 1993). If, for instance, the apical-most electrode of the Ineraid array (electrode 1) is stimulated alone (Fig. 5–3A), subjects have reported a low pitch. If the next electrode in the array (electrode 2) is stimulated alone (Fig. 5–3B), a higher pitch is reported. An intermediate pitch can be produced for all Ineraid subjects studied to date by stimulating the two electrodes together with identical, in-phase pulses (Fig. 5–3C). Finally, by reversing the phase of one of the simultaneous pulses, pitch percepts higher or lower than those produced by stimulation of either electrode alone can be produced. For example, a pitch lower than that elicited by stimulation of electrode 1 only can be produced by simultaneous presentation of a (generally smaller) pulse of opposite polarity at electrode 2 (Fig. 5–3D). The availability of pitches other than those elicited with stimulation of single electrodes only may provide additional discriminable sites along (and beyond) the length of the electrode array. Such additional sites may support additional, perceptually separable, channels of stimulation and reception. We call these additional channels “virtual channels,” and processors that use them virtual channel interleaved sampling (VCIS) processors.


The two additional processor designs included in the comparisons of the present studies used a VCIS approach to provide 21 discriminable sites of stimulation with the array of six intracochlear electrodes implanted in subject SR3, a user of the Ineraid device (Cochlear Corporation, Englewood, CO). The approach is illustrated in Fig. 5–4, in which stimulus site 1 is produced by stimulation of electrode 1 only, stimulus site 2 by simultaneous stimulation of electrodes 1 and 2 with a pulse amplitude of 75% for electrode 1 and of 25% for electrode 2, and so on. Results from pitch-ranking tests, using a two-alternative, forced-choice (2AFC) procedure, indicated that each of the 21 sites thus formed produced a distinct pitch for SR3, that is, a pitch that is significantly different from those produced by stimulation of the neighboring site(s).
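The layout of stimulus sites can be sketched as follows. Only the first two sites are specified in the text (electrode 1 alone, then 75% on electrode 1 with 25% on electrode 2). The uniform 25% steps used for the remaining sites are an assumption, chosen because they yield exactly 21 sites from six electrodes (five adjacent pairs with four steps each, plus the last electrode alone). The function name and data layout are hypothetical.

```python
def virtual_channel_sites(num_electrodes, steps_per_pair):
    """List (electrode, weight) pairs for each stimulus site, in electrode order."""
    sites = []
    for electrode in range(1, num_electrodes):
        for step in range(steps_per_pair):
            weight_next = step / steps_per_pair
            if step == 0:
                # Single-electrode site
                sites.append([(electrode, 1.0)])
            else:
                # Simultaneous in-phase stimulation of two adjacent electrodes
                sites.append([(electrode, 1.0 - weight_next),
                              (electrode + 1, weight_next)])
    sites.append([(num_electrodes, 1.0)])  # last electrode alone
    return sites

sites = virtual_channel_sites(num_electrodes=6, steps_per_pair=4)
print(len(sites))  # 21
print(sites[1])    # [(1, 0.75), (2, 0.25)]
```

Each weight here is the fraction of the pulse amplitude delivered to that electrode; scaling to the amplitudes appropriate for a given subject is left out of the sketch.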


Posted Aug 27, 2016, in Otolaryngology.
