14
Speech Production by People with Cochlear Implants
Steven B. Chin and Mario A. Svirsky
The study of speech production by cochlear implant (CI) users is important on both clinical and theoretical grounds. It is important clinically because, although CIs are by design primarily aids to the perception of speech, they are also an important aid in the development of speech production and oral language in children with congenital or prelingual deafness. Theoretically, cochlear implantation also sheds light on the intricate relationship between speech perception and speech production in mature language systems.1 This chapter discusses studies of pediatric and adult CI users separately. For children, we discuss research on intelligibility, on phonology, and on acoustic and physiologic characteristics of speech. Postlingually deafened adults typically have high intelligibility and intact phonological systems, so we discuss mostly acoustic and physiologic studies in adults.
♦ Speech Production in Children with Cochlear Implants
Clinical trials in children of the Nucleus 22-channel implant (Cochlear Corporation, Sydney, Australia) were initiated in 1986, and this device was approved for use in children by the U.S. Food and Drug Administration in 1990. Most research on speech production in children began during the clinical trial period. To address issues of device efficacy, most studies were comparisons of speech production either at different intervals (e.g., before implantation and after implantation) or in different clinical populations (e.g., CI users and hearing aid users). As the efficacy of cochlear implantation became established, researchers undertook comparisons of speech production in subpopulations of CI users (e.g., oral communication users and total communication users). Most recently, increased expectations of cochlear implantation benefits have given rise to comparisons of speech production in children with CIs and children with normal hearing.
Intelligibility
Intelligibility refers to the recoverability of a speaker’s linguistic message, differing from articulatory or phonological measures in that some aspect of meaning is involved. In cochlear implantation research, units of intelligibility range in size from morphemes to whole sentences. Intelligibility is most often measured with rating scales or write-down (transcription) procedures. Most studies of young CI users have employed write-down procedures, which are considered to have more face validity than rating tasks2 and less sensitivity to vocal qualities, which may contaminate rating scale responses.3 Materials for transcription procedures include those by McGarr,4 Monsen,5 and Osberger et al6; rating scales include the one in Allen et al.7
Several studies have compared the speech intelligibility of profoundly deaf children before and after implantation. Tobey et al8 examined the speech intelligibility of children with CIs, using sentences developed by McGarr.4 Recordings were transcribed by graduate students; scoring reflected the number of keywords correctly transcribed. Results indicated that speech intelligibility was significantly higher after implantation than before. Similar results were reported by Tobey and Hasenstab9 and Dawson et al.10 Similar to the before/after comparisons, Mondain et al11 examined the effect of increased device use on intelligibility in 16 French children. Mean percent correct scores did increase as length of device use increased.
Osberger et al12 compared speech intelligibility in pediatric users of single-channel CIs, multichannel CIs, and tactile aids, with users of hearing aids serving as controls. Materials were sentences from Monsen5 or similar sentences. Children’s productions were transcribed by naive listeners. Children with early-onset deafness (before age 4) who received a CI before age 10 had the highest intelligibility scores, whereas children who did not receive a CI until after age 10 had the lowest scores. Osberger et al13 reported that the intelligibility of CI users began to exceed that of hearing aid users with thresholds at 100 to 110 dB hearing level (HL) after 2.5 years of device use. Studies such as Osberger et al12 included participants using older strategies such as Multipeak (MPEAK). A later study by Svirsky et al,14 in which all participants used either the Spectral Peak (SPEAK) or continuous interleaved sampling (CIS) strategy, showed that after 1.5 to 2.5 years of implant use, the speech intelligibility of CI users was similar to that of hearing aid users with pure tone averages (PTAs) of 90 to 100 dB HL. Chin et al15 examined speech intelligibility in children with CIs and children with normal hearing. Children with normal hearing achieved ceiling levels around the age of 4 years, but a similar peak was not observed for the children with CIs, who were significantly less intelligible than children with normal hearing when controlling both for chronological age and length of auditory experience.
Studies examining relationships between overall intelligibility and other communicative skills include O’Donoghue et al,16 who assessed speech perception and intelligibility using rating scales. Results indicated strong correlations between intelligibility at 5 years after implantation and earlier speech perception, indicating that speech intelligibility might be predictable by measures of earlier speech perception. Svirsky17 also found significant and positive correlations between intelligibility and speech perception. Chin et al18 examined relationships among intelligibility, contrast production, and contrast perception in 20 children with CIs. There were significant overall correlations, but individual feature scores for contrast perception were not correlated with the corresponding production features, and only some perception and production feature scores were correlated with overall intelligibility.
Phonology
Although overall speech intelligibility has high face validity as a measure of communicative ability, much of the research on speech production in children who use CIs has examined such phonological properties as consonants, vowels, and suprasegmentals. Several studies have examined the effects of both cochlear implantation itself and continued use of a CI on phonological properties. Kirk and Hill-Brown19 appeared just 5 years after commencement of clinical trials of the House/3M (St. Paul, MN) single-channel implant in children. This work examined both segmental (e.g., consonants) and nonsegmental (e.g., vocal duration) properties in both imitative and spontaneous speech production (based on Ling20). Studies reported by Tobey et al8 and Tobey and Hasenstab9 also used evaluation procedures from Ling. Results from all three studies showed general improvement trends from before implantation to after implantation. Tobey et al21 examined production of place, voicing, and manner distinctions as a function of age, with results indicating a significant effect of age on the production of all manner categories.
Studies comparing the effects of different sensory aids on phonological characteristics include that of Tobey et al,22 which compared speech in users of CIs, hearing aids, and tactile aids. Kirk et al23 compared feature production in consonant vowel (CV) syllables in CI users and hearing aid users, Ertmer et al24 examined longitudinal changes in imitative vowel and diphthong production in CI users and tactile aid users, and Sehgal et al25 examined imitative consonant feature production in CI users and tactile aid users. These studies tended to show greater benefits of CIs over other sensory aids, except hearing aids worn by users with the most residual hearing.
Case studies examining phonological characteristics include Chin and Pisoni,26 which examined consonant and vowel inventories, syllable structure and phonotactic constraints, and sound correspondences in one child at approximately 2 years after receiving a CI. Consonant production data from a single child were reported in Ertmer and Mellon,27 and vowel data from the same child were reported in Ertmer.28 In addition to individual children, researchers have also focused on individual aspects of phonology. A common one is the inventory of sound segments. The development of consonant and vowel inventories for children with CIs in English-speaking environments was examined by Blamey et al,29 Serry and Blamey,30 and Serry et al.31 Chin32 examined consonant inventories in children who had used CIs for at least 5 years, comparing inventories of oral communication users with those of total communication users. Dillon et al33 examined consonant productions in a nonword imitation task. Peng et al34 examined the inventories of syllable-initial consonants in Mandarin-speaking children with CIs, finding relatively low mastery levels for these children. Consonant clusters were examined as early as Kirk and Hill-Brown19 in children with single-channel implants; later investigations include those by Chin and Finnegan35 and Chin.36 Preliminary work on intonation was reported in O’Halpin,37 and Carter et al38 reported results concerning stress placement and number of syllables on a nonword imitation task. Two studies on tone production, one with children in the People’s Republic of China39 and the other in Taiwan,40 have both indicated deficits in this aspect of phonology for most of the children studied.
Acoustics and Physiology
Researchers have also studied the acoustic and physiologic characteristics of the speech of children who use CIs. Voice onset time (VOT) was examined in Fourakis et al41 and Tobey et al.42 Vowel formants (particularly F2) were investigated in Murchison and Tobey,43 Svirsky and Tobey,44 Tobey et al,8 Economou et al,45 Tobey,1 and Ertmer.28 Svirsky et al46 examined oral-nasal balance, and Higgins et al47 and Jones et al48 investigated intraoral pressure.
Recent studies have examined multiple acoustic and physiologic parameters. Uchanski and Geers49 compared VOT, F2 frequency, spectral moments, nasality, and durations in children with CIs and children with normal hearing. For most of the implant users, most acoustic characteristics had values within the range for children with normal hearing. Higgins et al50 examined intraoral air pressure, phonatory air flow, electroglottograph cycle width, fundamental frequency, and intensity, and Higgins et al51 examined jaw opening, F1, F2, nasal air flow, voice onset time, voicing duration, and intraoral air pressure. Higgins et al52 examined intraoral air pressure, nasal and phonatory air flow, voice onset time, and fundamental frequency in children with CIs, both longitudinally and in comparison to children with normal hearing.
♦ Speech Production in Adults with Cochlear Implants
In this section we discuss the changes in speech production in postlingually deafened adults after receiving a CI. These changes are more subtle than those observed in prelingually deaf children, who must rely on the auditory information provided by the implant while they learn the sounds and phonological system of their language. Typically, adults who become profoundly to totally deaf after acquiring language do not show major deterioration in their ability to produce speech sounds or to speak intelligibly,53 suggesting that the role of hearing in speech production is more limited in adults than in children. In fact, from the mid-1980s to the early 1990s there was spirited discussion in the literature about whether adventitious deafness caused disordered speech at all. Goehl and Kaufman54 examined the speech of five adventitiously deafened adults and concluded that, in spite of the “popular clinical prediction,” speech does not deteriorate as a consequence of adventitious deafness. They concluded that “routine recommendations for speech conservation [in postlingually deafened adults] are probably unwarranted.” However, this study was forcefully questioned by Zimmermann and Collins,55 who said it suffered from “logical and methodological flaws” and that following its recommendations “may have adverse clinical effects.” The Zimmermann and Collins letter even questioned the editorial policies of the journal that published the article, eliciting a reply from the editor explaining the peer-review procedures of the Journal of Speech and Hearing Disorders. Cowie et al56 also questioned the Goehl and Kaufman study, pointing to evidence that adventitious deafness “does sometimes affect speech, and the effects may be of more than theoretical significance.” Years later, a study by Leder and Spitzer57 found several abnormalities in the speech of adventitiously deaf subjects.
Goehl58 acknowledged that Leder and Spitzer’s study as well as Lane and Webster’s59 showed reliable differences between the speech of the adventitiously deaf and that of the normal hearing, but said that these differences did not rise to the level of “clinically significant disorders.” Leder and Spitzer’s forceful counter-reply ended by saying that Goehl et al’s “inaccurate and unfounded conclusions cannot be left unchallenged or accepted in the literature.” Once again, the polemic included commentary from the journal’s editorial board (in this case, Ear and Hearing), speculating about possible reasons for the strong disagreement.
The study of the influence of hearing on adult speech production is of interest not only for the clinically related reasons discussed above, but also for theoretical reasons. The effect of prolonged postlingual deafness and that of restored hearing on speech production may provide important information to constrain theories of motor control for speech production. This topic cannot be easily investigated with animal models because speech production is a uniquely human capability (although studies of birdsong may provide important insights). From a basic scientific point of view, postlingually deafened adult users of CIs provide an interesting paradigm for exploring the role of hearing in adult speech production. An excellent example of a careful and comprehensive theory of speech motor control, based in part on data from postlingually deaf CI users, can be found in Perkell et al.60 The goal of the following section is more modest: we discuss some of the literature on changes in the acoustics and physiology of adult speech production that are associated with the use of a CI. In addition to their theoretical interest, these studies help determine the effectiveness of CIs in adults, as it relates to their speech production.
Acoustics
The role of hearing as an input for the neural mechanisms that control speech production remains controversial. Results obtained with normal-hearing listeners led some researchers to propose that speech production may be controlled by auditory feedback.61 This hypothesis received indirect support from early studies with the Lombard effect, an increase in vocal effort in the presence of background noise (see Lane and Tranel62 for a review), and from delayed auditory feedback, in which speakers become disfluent when they hear their own speech delayed by approximately 200 milliseconds.61 However, later studies argued against an active role for audition in the moment-to-moment control of speech production, at least at the level of individual phonemes. For example, Borden63 argued that auditory information about many English phonemes is received too late for the central nervous system to be able to correct ongoing phonemic speech gestures. Most investigators have proposed that auditory feedback serves to calibrate other systems that control speech on a moment-to-moment basis.64,65
Interest in this issue motivated several studies of the acoustics of speech production in postlingually deafened adults with CIs. Perkell et al66 studied vowel production in four recipients of the Ineraid CI (Richards Medical Co., Memphis, TN) before implantation and at regular intervals after implantation. The measured parameters included F1, F2, F0, sound pressure level (SPL), duration and “harmonic difference,” a correlate of voice breathiness. Overall trends toward normative values in several parameters were found, but this result was not universal and it was complicated by lack of perceptual benefit in one subject, and by earlier deafening in two other subjects. An important result in this study (which parallels the study of nasalance by Svirsky et al46,67; see above) is that because speakers respond differently to deafening, their responses to processor activation also differ. Subjects with parameter values that exceed the upper normative boundary may show a decrease after processor activation; subjects with parameters below the lower normative boundary may show an increase after processor activation. Another important result relates to the interactions among measured parameters. The authors studied not only the longitudinal changes in each parameter, but also the correlations between different parameters for each subject. A large degree of interdependency was found among different parameters, some of which may have been due to mechanical interactions among different articulatory adjustments. For example, an increase in speaking rate results in shorter phoneme duration, and the shorter duration may be associated with formant frequency changes because the speaker may not have enough time to reach a steady-state target.