and other videos, which, they reported, were of considerable help. Finally, almost half (33) reported they had received no training at all. Several of the respondents expressed concern that they had not received any specific listening training even though they were experiencing difficulties, or were not satisfied with their performance level.
In part, the lack of training is due to the great success that many CI users have with this bare-bones approach. Dorman and Wilson (2004), for example, reported average scores for adult CI users of 80% to 100% for sentences and 45% to 55% for words in isolation. There are, however, some adult CI users who are performing poorly, and the needs of this group cannot be dismissed. For example, Lenarz et al. (2012), in a study of more than 1,000 adult CI users, found that 13% of the group were unable to identify more than 10% of words in sentences. Unfortunately, as Moberly et al. (2016) note, the response provided by many clinicians is “restricted to reassurance and recommending that the patient ‘keep working at it.’ In some settings, a struggling CI patient may be referred to a speech-language pathologist who focuses on aural rehabilitation, although this strategy is limited by a lack of evidence-based methodologies or support from insurance providers” (p. 1523).
Despite overall high performance levels, many experienced CI users express dissatisfaction with the current situation. In discussions with CI users in the U.S., the U.K., and Australia, and in the recent Facebook “survey,” I have found that many believe their performance could have been considerably enhanced if they had received systematic speech perception training in the period immediately following the activation of their CIs. Perhaps surprisingly, those expressing dissatisfaction with the current approach include many CI users with very high performance levels. They express the belief that their progress would have been greatly accelerated if they had been provided with ongoing support and training. They also believe there are many other CI users who continue to struggle because they have not been provided with appropriate intervention.
This chapter is concerned with approaches to speech communication training that I have used with a large number of adults after they have received one or more CIs. The training was provided one on one, and the clients were attending the Hearing Rehabilitation Foundation (HRF) in Somerville (1996–2017) or Woburn (2018), MA. This experience offered an excellent opportunity to provide long-term training for adults with CIs and to evaluate its effectiveness. As part of my work with the HRF, I have also been able to trial these approaches with clients with severe to profound hearing loss who have been fit with hearing aids (HAs). The chapter will provide an overview of the materials and approaches that I use in speech communication training, and present case studies of two adult clients.
Historical Background
The first systematic attempts to alleviate the effects of an acquired hearing loss on speech perception date from pioneering lipreading (speechreading) teachers in the U.S. and Europe around the turn of the 20th century (see Jeffers & Barley, 1971 for an excellent overview). Effective hearing aids were still several decades away, and the approaches developed focused on the use of vision as a substitute for hearing. The visual signal is, however, notoriously ambiguous, and offers limited access to the speech signal. To extract meaning from this limited sensory signal, lipreaders need to take “advantage of linguistic redundancy—the constraints imposed by the rules and function of language—. . . to fill in the gaps left by the visual stimulus” (Boothroyd, 1988, p. 78). Nitchie (1912) recognized the importance of training his students to take advantage of these synthetic processes, but he did not achieve his goal of developing effective materials and approaches before his premature death in 1916.
Two significant events in the 1940s contributed greatly to the development of speech communication training. The first was the introduction of the one-piece hearing aid in 1944 (Dillon, 2001), resulting in the first truly wearable hearing aids. The second was the “birth” of audiology in the military rehabilitation centers set up in the U.S. to provide assistance to service personnel who had developed hearing losses due to war service. These centers provided up to 3 months of direct intervention, designed to help overcome the effects of an acquired hearing loss (Schow & Nerbonne, 1996). Although these rehabilitation centers were closed soon after the end of World War II, this intensive approach to rehabilitation continued at other facilities such as the Walter Reed Hospital in Washington, D.C.
Ross (2001) has provided an account of the 2-month course at Walter Reed Hospital that he attended in 1951, following diagnosis of a hearing loss. The program included hearing aid fitting and testing, lipreading instruction, and auditory training, as well as information sessions on various aspects of hearing loss. Informal exchanges between patients also contributed greatly to an awakening understanding of how best to cope with their hearing losses. Ross obviously believes that this was an extremely worthwhile experience, and 3 years later, prior to discharge from the Air Force, asked for and was granted a second 2-month course. He likens this second course to a “booster shot,” and believes that it helped him to “truly come to terms with the fact that [he] had a permanent hearing loss” (p. 63).
Ross recognizes that this “Camelot” period in aural rehabilitation can never return. It would be far too costly to replicate this service, much of which, he admits, “was overkill” (p. 66), but he feels that we can learn a great deal from it. In summary, he calls for an approach that focuses on the individual person with hearing loss and her/his needs, and for recognition that technology, the CI and/or HA, is “only one of the tools that can be used to reduce the communicative and handicap impact of a hearing loss” (p. 67).
Over the following half-century or so, there has been a decline in the amount of audiologic rehabilitation, particularly auditory training, provided to adults with an acquired hearing loss. In part, this has been due to advances in technology, leading to clinicians having “the unsubstantiated view that modern technology is sufficient to negate the need for additional auditory training” (Sweetow & Henderson Sabes, 2009, p. 273). Sweetow and Henderson Sabes (2009) also point out that clinicians may feel that there is little evidence to support the provision of such services, and they recognize that valuable professional time spent providing such training is unlikely to be reimbursed. Sweetow and Henderson Sabes believe auditory training does provide benefit for people with hearing loss, but there is a need to gather more objective and subjective data that supports this view.
Auditory Training
Sweetow and Henderson Sabes (2009) define auditory training as “a process designed to enhance the ability to interpret auditory experiences by maximally utilizing residual hearing” (p. 267). This definition, although accurate for persons fit with HAs, needs to be modified slightly to include the reality of CI hearing. Although there is now recognition of the need to preserve the residual hearing of CI users, the electrical signal provided by CIs directly stimulates the auditory nerve. A slight change in wording to include “residual (acoustic) and electrical hearing” would more accurately reflect the current situation.
The terms analytic training and synthetic training are used to describe the two basic approaches to auditory training.
Analytic training involves bottom-up processes and focuses on the context-free acoustic-phonetic signal. Examples include contrasting words with differing syllabic structures (cat versus apple versus football), different vowels (hard versus heed versus who’d), initial consonant differences (pot versus tot versus cot), etc. Such tasks are usually presented using a closed-set format. The listener’s task is to listen for the salient acoustic cues and indicate which of the contrasted words was presented as the stimulus.
Synthetic training involves top-down processes, where the listener uses her/his linguistic knowledge to fill in the gaps in the sensory information provided by her/his CI(s) and/or HA(s). Boothroyd (1988), in a discussion of linguistic factors in lipreading, lists six sources of linguistic redundancy—pragmatic, topical, semantic, syntactic, lexical, and phonological—that can be used to assist in deriving meaning and understanding.
Analytic or Synthetic?
Some researchers, such as Rubinstein and Boothroyd (1987), debate the value of analytic training and suggest that the primary focus of training should be on synthetic training exercises, which concentrate on sentence-level materials and attempt to exploit the “uses of linguistic and situational redundancy” (Rubinstein & Boothroyd, 1987, p. 152). Native speakers of a language bring a vast store of knowledge to any communication situation. They know, at least implicitly, the rules of their language: how words are formed (phonological rules), how sentences are put together (syntactic rules), which words tend to occur in close proximity to each other (collocational rules), and so forth. It is this knowledge that enables us to fill in the gaps and understand speech even when we can only partially hear what is being said.
Others, such as Sweetow and Henderson Sabes, argue for an approach that combines both synthetic and analytic training. Support for the inclusion of analytic training can be found in the work of Walden, Erdman, Montgomery, Schwartz, and Prosek (1981), who provided a group of adult subjects with 7 hours of training in the auditory recognition of nonsense syllables. They found that the subjects’ post-training scores for an auditory-visual sentence test presented in noise were significantly better (p < 0.01) than those obtained before training.
I believe training should reflect, as much as possible, the realities of everyday communication, and advocate for an approach that has synthetic training as its primary focus, but does include some analytic training. This has been motivated by the recognition that there are many occasions where listeners are forced to rely exclusively on acoustic/phonetic (analytic) cues to derive understanding. Consider, for example, the possible range of responses to the question, “What is your last name?” The U.S. Census Bureau includes 889,799 last names in its database, ranging from the common and, perhaps, predictable “Smith,” “Jones,” “Williams,” and “Garcia” to the more obscure and far less predictable “Argenti,” “Barnak,” and “Muckenfuss.” In all of these cases, the listener must use acoustic/phonetic information to determine a person’s name.
Speech understanding using analytic (bottom-up) processes presents problems for almost all people with hearing loss. Adult EARS (Plant, Ng, & Amann, 2011) included a sentence test, which used the Speech Perception in Noise (SPIN) Test format developed by Kalikow, Stevens, and Elliott (1977). Fifty sentences were presented, with the listener’s task being to repeat the last word in the sentence. In half of the sentences the identity of the last word can be predicted to a large extent, based on the rest of the sentence. Examples of these high-predictability sentences include, “I asked my boss for a raise” and “I heard someone knocking at the door.” In the remaining half of the sentences, the identity of the final word cannot be predicted using contextual cues. Examples of these low-predictability sentences are “We need to talk about the vase” and “Do you think she said, ‘Stop’?” Two lists of these sentences were presented to 20 experienced adult CI users over a 3-month period. The mean score for the high-predictability (HP) sentences was 90.6% correct, while that obtained for the low-predictability (LP) items was 72.3%. The results of the nonparametric Wilcoxon signed-rank test showed that the difference between the HP and LP scores was significant (p < 0.001).
Warm-Up Exercises
One of the most important parts of any training session occurs at the beginning of the session. Many clients, at least at first, approach training with some degree of trepidation, and need some short exercises to set the scene for the remainder of the session. In some ways, the beginning of each training session is the warm-up time, where the client gets ready for the rest of the session by listening attentively and responding as quickly as possible.
At this time, I like to use simple closed-set materials, which start at a relatively easy level and then, if appropriate, become progressively more difficult. One exercise that I have found particularly useful is presenting numbers for identification, at first in isolation and then in progressively longer strings. For example, I will tell the client that s/he will hear the numbers 0 to 9 presented in a random order. Her/his task is to repeat what s/he hears. I usually present the isolated number in the carrier phrase, “The number is . . . ,” but occasionally vary the procedure and embed the target in a short sentence such as, “I saw . . . cars,” “I have . . . brothers,” etc. If the client is able to identify the numbers with few errors, I then tell her/him that I am going to present two numbers at a time. If the client is able to perform this task with little difficulty, I extend the strings to three, four, five, or six numbers.
Miller (1956), in a study of digit recall, found that the limit of short-term memory was seven plus or minus two. Later studies (for example, Baddeley, Thomson, & Buchanan, 1975) have found that while this result appears to hold for college-age students, it varies for other populations. A study by Souza, Boike, Witherall, and Tremblay (2007) of three groups of listeners with moderate to severe hearing losses aged 50 to 65 years, 67 to 75 years, and 77 to 82 years, however, found no significant differences in the auditory digit recall of the three groups, although there was a slight decline for the older listeners. The mean scores for the three groups were 7.5, 6.8, and 6.7, respectively. The mean score for a group of younger listeners with normal hearing (mean age = 23.5 years) was 7.0. As a result of these studies, I rarely present number strings of more than six items.
If a client is able to perform this task easily, I ask her/him to repeat the string of numbers backwards. For example, if the numbers presented are 5 7 3 4, the client’s response should be 4 3 7 5. I can also ask clients to put the numbers in a string in their numerical order. For example, the correct response for the string 5 7 3 4 would be 3 4 5 7. Such exercises also provide insights into the working memory capacities of individual clients.
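The three response modes described above (forwards, backwards, and numerical order) can be sketched in a few lines of Python. This is only an illustration of how a scoring sheet might be prepared, not part of the published materials; the function names `make_digit_string` and `expected_responses` are hypothetical, and drawing the digits with replacement is an assumption.

```python
import random


def make_digit_string(length, rng=None):
    """Draw `length` digits from 0-9 at random (with replacement; an assumption)."""
    rng = rng or random.Random()
    return [rng.randint(0, 9) for _ in range(length)]


def expected_responses(digits):
    """Return the correct response for each of the three response modes."""
    return {
        "forwards": list(digits),
        "backwards": list(reversed(digits)),
        "numerical_order": sorted(digits),
    }


if __name__ == "__main__":
    # The worked example from the text: 5 7 3 4
    print(expected_responses([5, 7, 3, 4]))
    # A new random string of six items, the longest length used in practice
    print(make_digit_string(6))
```

For the string 5 7 3 4, the backwards entry is 4 3 7 5 and the numerical-order entry is 3 4 5 7, matching the examples given above.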
I can also vary the items presented for identification. For example, I will sometimes present strings of color names, animals, or the names of U.S. presidents. I provide the client with a sheet showing the set of alternatives, and repeat the procedure described above: items are presented in isolation and then in strings of increasing length, with the client asked to repeat what s/he heard forwards and, if appropriate, backwards.
Another approach that I sometimes use as a warm-up activity is based on the test procedure developed by Hagerman (1982). The client is shown a set of words arranged in a matrix. An example is presented in Table 24–1. I tell the client that I will select one word from each column to form a short sentence such as “Joan took five blue taps,” “Jan had two new caps,” etc. At first, I present the sentences using Clear Speech with a short pause between each word. If the client is able to perform this task easily, I start to use a more normal speaking style, and if the client is still able to perform at a high level, I will use a rapid speaking style. This can result in the client’s performance declining rapidly as the word boundaries become blurred, and many clients have commented that this is what happens to them in real-life conversations. Another modification I sometimes use is to present sequences from the matrix in the presence of background noise. This is particularly useful with clients who are not confident in their ability to cope with competing noise. The use of closed-set material provides them with a degree of security that encourages them to attempt the task.
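Drawing one word from each column is easy to sketch in code. The example below is a minimal illustration only: the matrix contains just the ten words that appear in the two sample sentences quoted above, not the full contents of Table 24–1, and the names `PARTIAL_MATRIX` and `make_sentence` are hypothetical.

```python
import random

# Columns rebuilt ONLY from the two sample sentences in the text
# ("Joan took five blue taps", "Jan had two new caps").
# The full training matrix has more rows; they are not reproduced here.
PARTIAL_MATRIX = [
    ["Joan", "Jan"],    # name
    ["took", "had"],    # verb
    ["five", "two"],    # number
    ["blue", "new"],    # adjective
    ["taps", "caps"],   # noun
]


def make_sentence(matrix, rng=None):
    """Pick one word from each column, left to right, and join them."""
    rng = rng or random.Random()
    return " ".join(rng.choice(column) for column in matrix)


if __name__ == "__main__":
    for _ in range(3):
        print(make_sentence(PARTIAL_MATRIX))
```

Even this two-row fragment yields 2 × 2 × 2 × 2 × 2 = 32 different sentences, which suggests why a small closed set can support many presentations without the client simply memorizing the items.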
Analytic Training
In my work with adult CI and/or HA users, I try to include some analytic training exercises at the beginning of each session. I recognize, however, that analytic training can be quite stressful for clients, and try to ensure the training targets are at an appropriate level. Clients rarely complain that any analytic materials are too easy, and I feel it is best to err on the side of caution in determining a starting point for training. This type of training requires considerable effort on the part of the client and, if not presented appropriately, can be boring. As a result, I try to ensure that I don’t use analytic training exercises for more than 10 to 15 minutes per session.
Table 24–1. An Example of a Simple Sentence Matrix Used in Training
Note: One word is chosen from each column to form a short sentence.
The following example presents the sequence I would use to develop and consolidate a client’s ability to reliably identify the consonant [s]. I start with a word list made up of word pairs that differ only in that one word contains an initial [s] and the other does not. Examples of this type of contrast include sit/it, say/A, and sigh/eye. The client is presented with each contrast in turn and asked to identify which word of the pair was presented. I usually present the items using a short carrier phrase such as, “Number 1 is . . . ,” “Number 2 is . . . ,” and so on.
If the client is able to perform this task in both the initial and final (peace/pea, cat/cats) position, I then move up to contrast the target with consonants that differ considerably in their acoustic structure. Examples include word pairs contrasting [s] with nasals (my/sigh, mass/man), semivowels (way/say, right/sight), or voiced stops (bend/send, Rob/Ross). If the client is able to complete those items successfully, potentially more difficult contrasts could then be attempted such as word lists that contrast [s] with voiceless stops (pet/set, miss/mitt), the affricates (jet/set, match/mass), or nonsibilant consonants (fine/sign, myth/miss). Only after passing through a sequence such as that outlined above would I attempt contrasting [s] with the voiceless sibilant [sh] in word pairs such as sigh/shy, see/she, gas/gash, and mass/mash or its voiced cognate [z] in word pairs such as zoo/Sue, zap/sap, and pays/pace.
My primary source for analytic training materials is Analytika (Plant, 2004), which is a collection of almost 600 word lists presenting consonant or vowel contrasts in meaningful words. I have also developed some materials that present the contrasts in closed-set sentences.
Examples of this approach include:
“Are you sure that she SIPPED it?”
“Are you sure that she SHIPPED it?”
“Where did you get these new SEATS?”
“Where did you get these new SHEETS?”
These materials are presented using a closed-set format, with the client asked to identify which of the contrast words was presented.
I show the client each contrast printed on a sheet of paper, or as a PowerPoint slide presented via a computer monitor. I ask her/him to tell me which of the words was presented, and provide immediate feedback as to the correctness of her/his response. If an error is made, I provide feedback such as, “No, it was ___, not ___.” Wherever possible, I present lists of 20 contrast pairs. A score of 15/20 is significantly above chance at the 5% level, and I use this as the “pass/fail” criterion for each list.
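The 15/20 criterion can be checked with a short calculation. The sketch below assumes each item is a two-alternative choice, so that a client who is guessing has a 0.5 probability of being correct on each item, and it uses a one-tailed binomial test; the function name `tail_probability` is illustrative.

```python
from math import comb


def tail_probability(k, n, p=0.5):
    """Probability of scoring k or more out of n by guessing alone."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    for score in (14, 15):
        prob = tail_probability(score, 20)
        print(f"P(score >= {score}/20 by chance) = {prob:.4f}")
```

Under these assumptions, a score of 14/20 falls just short of the 5% level (a chance probability of about 0.058), while 15/20 meets it (about 0.021), so 15 is the lowest score on a 20-item list that can serve as the criterion.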
If a client scores below 15/20, I attempt some limited training that focuses on the contrasts in a variety of word and/or nonsense syllable pairs. If the client still experiences difficulty, I will often re-present the list with both auditory and visual cues, to see if access to lipreading results in improved performance. My aim is to provide the client with an awareness of her/his “problem” areas, and then work on solutions to the problems. These can, of course, include training in the perception of the contrast, but can also involve developing an awareness of which contrasts create difficulty, so that the client can be aware of potential problems in real life and act accordingly.
I’ll give a personal example of such a compensatory strategy. As a result of a moderate hearing loss, I have a degree of difficulty in reliably hearing the difference between the numbers “two” and “three” in situations where I do not have access to lipreading cues. I’m aware of this problem, and try to ensure that I reduce the chance of making an error by asking the speaker if the number was “two,” thus eliciting a “yes/no” response, or, in the case of phone numbers, for example, repeating the number for verification.
Another approach that is often used in auditory training programs is the presentation of consonants in a vowel-consonant-vowel syllable such as [aCa] or [iCi]. I used this procedure for many years, but stopped when I realized I was having difficulty reliably differentiating between the clients’ responses for consonants such as [p] and [t], and [m] and [n]. I felt, however, that the exercise provided valuable insights into the sensory information available to individual CI users, and eventually developed an alternative response that enabled me to easily differentiate between the items. The system uses a modification of the NATO phonetic alphabet used by emergency services personnel when spelling out a word. The alphabet uses distinct words for each letter of the alphabet: bravo for “b,” hotel for “h,” etc. I provide clients with a chart listing the word to be used for each of the 20 consonants (p, t, k, b, d, g, f, v, s, z, sh, h, ch, j, m, n, w, r, l, y) presented in testing. The modifications made to the existing NATO alphabet are Charlie for the affricate [ch], kangaroo for the stop [k], and shadow for the fricative [sh].
The task is presented using the Kungliga Tekniska Högskolan (KTH) Tracking Procedure (Gnosspelius & Spens, 1992), which is described in detail later in this chapter. The syllables are presented for 2 minutes, and immediate feedback is provided if the client’s response is incorrect: “No, it was [repeat syllable].” If the client does not recognize the syllable correctly after three repeats, s/he is shown the correct response on a computer monitor. At the end of the 2 minutes, the program calculates the number of syllables correctly repeated per minute. The total number of syllables correctly perceived is also recorded, but it is the syllable rate that provides insights into how quickly the client is able to recognize the test items.
Useful insights into a client’s performance can also be obtained by asking them to provide subjective ratings related to a particular task or contrast. I often ask clients to rate the amount of effort that a task requires, using a 7-point scale ranging from 1 (extreme amount of effort) to 7 (no effort at all). I also ask clients to give an estimate of how confident they are that their response is correct. Again, I use a 7-point scale with the rating items ranging from no confidence at all (1) to completely confident (7). Such rating scales can be used to track the amount of work required to complete a task, and this can be measured over time.
Synthetic Training
Synthetic training provides clients with practice using materials that more closely replicate everyday communication. Sweetow and Henderson Sabes (2009) believe synthetic training provides opportunities for the listener “to use various communication strategies, including attention, use of contextual cues, repair strategies, knowledge about linguistics and communication, and the redundancies therein to communicate effectively” (p. 269).
In presenting synthetic exercises, I try to tailor the materials to meet the individual client’s specific needs and skills, and I have developed a number of materials to help achieve this goal.
One Question and Answer exercise in Auditrain (Plant, 2001a) has “Denmark” as its theme, and I’ll use it as an example of the approach. Question 1 in this exercise is, “Where is Denmark?” Depending upon the skills of the client, the clinician might respond with a short answer such as, “Denmark is in the north of Europe” or with a more complex response such as, “Denmark is located in the area of northern Europe known as Scandinavia. Its neighbors include Germany to the south, and Sweden and Norway to the north.” Similarly, the clinician’s response to the question “Is Denmark a large country?” might range from, “No, it’s a very small country” to “No, Denmark is not a large country. It’s one of the smallest countries in Europe. It has an area of only 27,280 square miles.” The client is not expected to repeat every word of the answer, but rather to provide a brief summary of its salient points.
Because s/he asked the question, the client has some idea of the form of the response, and this helps reduce the complexity of the listening task. Clinicians can present the information in relatively long sentences, or in several short chunks, depending upon the listening skills of the client. This “expansion” technique is particularly useful, and will be covered in detail later in this chapter.
The Question and Answer topics often provide opportunities for conversational practice. Clients often show particular interest in a specific topic or answer, and this can be used as the basis for a more extended conversation. Over the years, I have built up a large stock of materials such as maps, photographs, and other materials to use in such circumstances. These can be used to provide more information and to heighten the interest level of the training session. Pictures and maps also provide contextual information that can help reduce the level of difficulty of a particular task.
Topic-Centered Sentences consist of lists of sentences based around themes such as “Things people say about the weather” and “Things people say about food” (Plant, 2001a). Each list contains a series of sentences that are related to a particular theme. For example, sentences for “Things people say about clothes” include:
■ “That’s a nice dress.”
■ “He always wears a suit to work.”
■ “What size do you think would be best for their new baby?”
■ “Their new catalogue has a lot of nice clothes, but they are far too expensive for me.”
Note that the sentence length in this selection ranges from four to 17 words. Many people with hearing loss report they have difficulty with longer segments, so it may be necessary to adjust the style of presentation to meet the needs of individual clients. For example, it may be necessary to present a longer sentence in two shorter chunks such as, “Their new catalogue has a lot of nice clothes” and “but they are far too expensive for me.” Another way to help clients become more confident when confronted by longer sentences is to take a short sequence that has been correctly repeated, and gradually increase its length through the addition of words and phrases. For example, the sentence, “That’s a nice dress,” could be extended in the following way:
■ “That’s a nice dress.”
■ “That’s a nice dress. Is it new?”
■ “That’s a nice dress. Is it new? I haven’t seen it before.”
■ “That’s a nice dress. Is it new? I haven’t seen you wear it before.”
This expansion technique is also used in Plant (2000) with short segments being made progressively longer, and clients are asked to listen for additions to recognized sentences. I have found that clients respond very positively to this technique, and are often surprised by their ability to perform the task.
Sentence lists consisting of the most frequently occurring words in spoken American English form another set of materials. Several years ago (Plant, 2001b), I developed a number of resources based on the 500 most frequently occurring words in spoken American English (Dahl, 1979). These words account for more than 80% of the words used in everyday conversation, with the 10 most frequently occurring words (I, and, the, to, that, you, it, of, a, know) representing around 25% of the total number of words used. I believe these 500 words represent the “core items” of spoken English, and I have developed a large number of word and sentence lists based on these items. Although it may sound constraining, it is relatively easy to generate sentences using only these 500 words. Examples include:
■ “It’s too late now!”
■ “Do you know anything about them?”
■ “I’m afraid I don’t know the answer to your question.”
Each list (Plant, 2001b) consists of 25 sentences, with a total of 200 words in each list. Average sentence length is eight words, but there are several short (four to five words) and long (12–14 words) sentences in each list. Unlike the topic-centered sentences discussed above, the sentences in these lists are unrelated, and have no common theme. As a result, they can be a little more difficult for some listeners. I usually present one list per session, and encourage the client to repeat as many words as possible.
I have recently started to present these sentences using the KTH Tracking Procedure (Gnosspelius & Spens, 1992) to determine how quickly clients are able to recognize and repeat these items. I have also introduced a new set of sentences, all of which are 14 or more words in length. Clients often report difficulty maintaining focus with longer sentences, and this approach provides practice with these more difficult materials. The duration of the task is 3 minutes, and the client has to correctly repeat all words in the sentence before moving on to the next. If the client fails to identify a word after three repeats, s/he is shown the word on a monitor. At the end of the time period, a tracking rate (TR) is automatically calculated in words per minute. This approach provides additional information on a client’s speech recognition skills. For example, some clients are able to correctly repeat the words in a sentence, but their response is drawn out and requires a great degree of cognitive effort. Thus, although the client is able to score quite highly for the number of words repeated, the time taken to achieve this indicates that s/he would have difficulty following continuous speech, and would soon start to falter. This is especially obvious when the sentences are longer. Some clients seem to “turn off” at some point in a longer utterance, and this training encourages them to maintain focus on what is being said.
Promoting conversational skills should be the primary aim of speech communication training. I try to provide opportunities for two-way conversations in all training sessions. This can be difficult if the client is uncertain about her/his skills, or if s/he has developed a pattern of communication where s/he dominates the conversation and greatly reduces the contributions of others. I try to set aside at least a little time during a training session to have a chat with the client. The topic can be about what’s happened since our last meeting, either personally or in the wider community, or might be stimulated by the topic of one of the exercises. I travel a great deal as part of my work, and I always try to keep a photographic record of places that I visit. I will often show a selection of these to the client, and encourage her/him to ask questions, comment, etc. I find these activities to be good starting points for conversation. I usually use the acoustic shield during these conversations to eliminate the possibility of the client using lipreading cues. I do have some reservations about how natural this approach is, but find that clients rarely comment on it. Most accept it as a part of the training approach, and understand that the focus is on auditory speech perception.
Speech tracking (de Filippo & Scott, 1978) is an important part of almost all training sessions. In speech tracking, there is one talker (usually the clinician) and one receiver (usually the person with hearing loss). In the original form of this approach, the talker read from a text, segment by segment, and the receiver had to repeat each segment without error. If an error occurred, the talker had to re-present the segment until it was repeated verbatim. To reduce the level of difficulty, the talker could give clues, reword, rephrase, etc., to help the receiver correctly identify the original segment. The procedure continued for a specified time period—often 10 minutes—but I strongly recommend going for only 5 minutes. At the end of the time period, the talker counted the total number of words presented and correctly repeated. This figure was then divided by the time elapsed, to calculate the receiver’s TR in words per minute (wpm). For example, if a receiver was able to repeat 250 words in 5 minutes, her/his TR would be 50 wpm.
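The TR arithmetic described above can be sketched in a few lines of Python. This is only an illustration of the calculation; the function name and its arguments are hypothetical and do not come from any published tracking software.

```python
def tracking_rate(words_repeated: int, minutes: float) -> float:
    """Return the tracking rate in words per minute (wpm).

    words_repeated: total words presented and correctly repeated.
    minutes: elapsed time for the tracking trial.
    """
    if minutes <= 0:
        raise ValueError("elapsed time must be positive")
    return words_repeated / minutes


# The worked example from the text: 250 words in 5 minutes.
print(tracking_rate(250, 5))  # 50.0
```

Because the result is a simple ratio, a trial shortened from 10 minutes to 5 minutes remains directly comparable with a longer one.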
I use a modified form of the KTH Tracking Procedure in my work with adults. This computer-controlled approach was developed by Gnosspelius and Spens (1992) at the Kungliga Tekniska Högskolan (Royal Institute of Technology) in Stockholm, Sweden. The original program ran in MS-DOS, and a subsequent Windows-compatible version was developed by the Rehabilitation Engineering Research Center on Hearing Enhancement at Gallaudet University in Washington, D.C. (Plant, Bakke, Bernstein, Levitt, & Oden, 2007).
The KTH Tracking Procedure uses live-voice presentation, but with predetermined segment length, and only one repair strategy: repetition. The text is entered line by line onto the computer, and is displayed on a computer monitor for the talker. The talker uses the right-hand button of the computer mouse to start the program and move from line to line, and the left-hand mouse button to mark any word that is not repeated correctly. If a word is not correctly identified after two repeats, the receiver is shown the written word on a separate monitor, and this is recorded automatically on the computer. At the end of the specified time period, the receiver’s TR is automatically calculated and displayed. Separate measures, including the receiver’s ceiling rate (based on lines repeated without error) and the proportion of blocked words, which is based on the number of times words have to be repeated as a proportion of the total number presented, are also automatically calculated.
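The three summary measures can be illustrated with a short Python sketch. It is not the KTH program and makes several assumptions that the description above does not settle: that the time spent on each line is logged, that the ceiling rate is the words-per-minute rate over error-free lines only, and that the proportion of blocked words is the count of word repeats divided by the total words presented. All names (`LineResult`, `kth_style_summary`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LineResult:
    words: int      # number of words in the presented line
    repeats: int    # times a word in this line had to be repeated
    seconds: float  # time spent on the line (assumed to be logged)


def kth_style_summary(lines: list[LineResult], session_minutes: float) -> dict:
    """Summarize one tracking trial under the assumptions stated above."""
    if session_minutes <= 0:
        raise ValueError("session length must be positive")
    total_words = sum(line.words for line in lines)

    # Tracking rate: all words completed, divided by the trial duration.
    tracking_rate = total_words / session_minutes

    # Ceiling rate (assumed definition): wpm over error-free lines only.
    clean = [line for line in lines if line.repeats == 0]
    clean_seconds = sum(line.seconds for line in clean)
    ceiling_rate: Optional[float] = None
    if clean_seconds > 0:
        ceiling_rate = sum(line.words for line in clean) * 60 / clean_seconds

    # Blocked proportion (assumed definition): repeats / words presented.
    blocked = sum(line.repeats for line in lines) / total_words if total_words else 0.0

    return {
        "tracking_rate_wpm": tracking_rate,
        "ceiling_rate_wpm": ceiling_rate,
        "blocked_proportion": blocked,
    }


# A made-up 30-second excerpt: two error-free lines and one with two repeats.
trial = [LineResult(10, 0, 6.0), LineResult(12, 2, 18.0), LineResult(8, 0, 6.0)]
summary = kth_style_summary(trial, session_minutes=0.5)
print(summary["tracking_rate_wpm"], summary["ceiling_rate_wpm"])  # 60.0 90.0
```

In this invented excerpt the ceiling rate is well above the TR, which is the pattern one would expect whenever repeats consume a noticeable share of the trial time.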
I have written two long stories, which I use with almost all of my adult clients. The first story, “Kumanjayi,” contains more than 160,000 words and is set in central Australia. It is told in the first person, and is written in a conversational style in an attempt to provide training that replicates, at least in part, everyday communication. The second story, “The Old Ones,” consists of three parts of around 100,000 words each. This story is written in the third person and is designed to be more complex than “Kumanjayi.” It is set in Sweden and Norway and has a fantasy/science fiction theme. Clients report that they enjoy the stories, and this helps them to maintain focus as they want to find out what happens next!
In any training session, I present up to 10 (five Auditory-Visual [AV] and five Auditory-only [A]) 5-minute speech tracking sessions, and I find that clients respond extremely positively to this approach. A client once told me that he regarded his TR in much the same way as his golf handicap, with the only difference being that he wanted his TR to rise, and his golf handicap to fall! Many clients report initially that they derive no benefit from lipreading, but an examination and discussion of their scores in the two conditions and their ratings of effort and confidence quickly reveal to them that visual cues make the task much easier. The materials are usually presented in quiet, but if a client’s scores indicate it will not be too stressful, the materials can also be presented with a competing speaker. I use several different recorded books as the source for the competing speech. Over time, a hierarchy of difficulty has emerged for the various recordings. The easiest is a woman with a Scottish accent, the next most difficult is a male speaker with a general American accent, and the most difficult is a male speaker of southern British English. The competing speech is set at a level that creates difficulty for the listener, but not so much that s/he has to struggle unduly to understand what is said.
Bernstein et al. (2012) looked at the use of speech tracking using both de Filippo and Scott’s original procedure (TRAD) and the KTH Speech Tracking Procedure. They found that training with both approaches led to significant gains in TR and sentence recognition scores. The study also looked at between-trial variability for the TRAD and KTH approaches and found that the KTH approach was significantly less variable than the TRAD approach.
Modified Speech Tracking
Speech Tracking, especially via audition alone, is a difficult task for many people with severe or profound hearing losses. My experience with the procedure over the past 30 years indicates that if a client has a TR of 25 wpm or less, the approach should not be used. At this level, the receiver is struggling to perceive single words, so context and story line are no longer of much assistance, and the task is extremely stressful. Clients with poor receptive communication skills need to be encouraged to use their language skills to take the fragmentary information available to them, and to synthesize meaning. Attempts at forcing such clients to relentlessly strive to identify every word may be self-defeating, and lead the client to have an inappropriately low opinion of her/his receptive communication competence.
Commtrac (Plant, 1989, 1996b) was developed for use with those clients for whom speech tracking was unsuitable. In this procedure, a story is presented line-by-line auditory alone, and the client is asked to repeat as many words as possible. After the client has responded, s/he is shown the written line, and the next line is presented auditory alone. There are no time constraints imposed; clients can take as long as they require before they respond, and they are also encouraged to ask for the line to be repeated if necessary. Each story in Commtrac was divided into 200-word parts, which were, in turn, segmented into separate lines for presentation. At the end of each part, the client’s percent correct score is calculated.
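The percent correct score for a part can be sketched as follows. The scoring rule used here is an assumption made for illustration: a response word counts as correct if it matches a word in the presented line, ignoring case, punctuation, and word order. Commtrac itself may score responses differently, and the function names are hypothetical.

```python
import string
from collections import Counter


def _words(text: str) -> list[str]:
    """Lower-case the text, strip ASCII punctuation, and split into words."""
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).split()


def words_correct(target: str, response: str) -> int:
    """Count target words that also appear in the response (order ignored)."""
    overlap = Counter(_words(target)) & Counter(_words(response))
    return sum(overlap.values())


def percent_correct(pairs: list[tuple[str, str]]) -> float:
    """Percent of presented words repeated correctly across one part.

    pairs: (presented line, client's response) for each line in the part.
    """
    total = sum(len(_words(target)) for target, _ in pairs)
    if total == 0:
        raise ValueError("no words were presented")
    correct = sum(words_correct(target, response) for target, response in pairs)
    return 100.0 * correct / total


# Two invented lines: 5 of 6 words and 3 of 5 words repeated correctly.
part = [
    ("The dog ran down the road", "the dog ran on the road"),
    ("It was a hot day", "it was hot"),
]
print(round(percent_correct(part), 1))  # 72.7
```

Scoring on words rather than whole lines gives credit for partial responses, which is in keeping with the aim of the procedure.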
I have subsequently prepared a number of different stories that use this approach. One of these, TeenTrax (Plant, 2007), is a story that is divided into more than 40 parts, ranging in length from 200 to 250 words. Feedback is provided to the client using a PowerPoint presentation, with each line being presented on a separate slide. Although originally designed for use with teenagers and young adults, I have used TeenTrax with a number of adult clients, and all have responded favorably to it.
I find this modification of speech tracking particularly useful with clients who are reluctant to attempt auditory-only training. Conventional speech tracking, with its insistence on every word being identified, can be intimidating for many clients. It is better, at least initially, to use this modified approach, which allows for incomplete responses and provides immediate feedback to ensure that the client is able to keep up with the story and use contextual information to assist in understanding.
Case Studies
There are two case studies in this section. The two subjects are a middle-aged woman with a congenital hearing loss, and an older man with a long-term acquired hearing loss. Both attended the Hearing Rehabilitation Foundation (HRF) for training and made a donation to the HRF for each session attended. It should be noted that this fee was voluntary and the service would have been provided regardless of the client’s ability to make a donation.
Case Study 24–1
The client was a female in her late 40s with a congenital hearing loss. She has two younger siblings who also have congenital hearing losses and who are both successful CI users. The client had worn hearing aids since early childhood, but reported that, unlike her siblings, she was unable to understand speech through listening alone. Her hearing aids did, however, serve as a very useful adjunct to lipreading, and she has very well-developed speech and language skills. Her decision to receive a CI was influenced by the excellent performance of her siblings, and her desire to be able to communicate better with her two children.
I first saw the client in late October 2011, approximately 1 month after her CI was activated. At that time, she said she was quite happy with the outcome, but reported that she was unable to understand speech via listening alone. She had opted to continue wearing a hearing aid in her other ear, and all subsequent testing and training was conducted in this combined/bimodal (CI + HA) mode. The client attended training sessions at approximately 2-week intervals for just over 3 years.
The first training sessions focused mainly on closed-set tasks that contrasted word syllable patterns, and used word lists that presented word pairs that differed only in that one contained a target consonant such as [s] and the other didn’t. She was able to perform these tasks very well, and after the first five sessions, I decided to introduce some open-set materials. These included expansion sentences and simple sentences that required the client to fill in the missing words. At first, the client found these tasks very challenging, but over time, she became far more confident in her ability to understand speech through listening alone. She particularly enjoyed the warm-up exercises using closed sets of numbers, colors, vegetables, and fruits presented in isolation and in strings of two, three, or four items. As her ability to perform these tasks improved, I asked her to repeat the items backwards, and she was able to perform this task very well.
In late January 2012, I introduced the TeenTrax tracking program. The first segment of the program was presented auditory-visually to familiarize the client with the procedure. Her score was almost 100% correct, which was a realistic measure of her face-to-face communication skills. All subsequent segments were presented via audition alone. The subject’s performance for this task is shown in Figure 24–1. This shows a steady improvement over a period of approximately 6 months.
As the client’s auditory alone performance for the TeenTrax materials improved, I introduced the sentence lists drawn up using only the 500 most-frequently occurring words in spoken American English (Plant, 2001b). These were presented auditory alone, and each sentence was repeated before the subject was asked to repeat what she had heard. One sentence list was presented at each training session, and the results of this testing are shown in Figure 24–2. The results are similar to those obtained with the TeenTrax materials: a steady rise in performance across the course of training. This was surprising given that the lines in the TeenTrax materials are related and provide the opportunity to use contextual cues to assist in understanding.
Figure 24–1. Percent correct scores for TeenTrax materials obtained by the subject in Case Study 24–1.