Voice Treatment: Orientations, Framework, and Interventions

Habilitation is the process of helping an individual with an impairment or limitation realize their maximum potential; its purpose is to enable and enhance. Rehabilitation is the process of helping an individual recover function after an injury or illness; its purpose is to repair and restore. Speech–language pathologists (SLPs) who administer voice treatments are involved in both voice habilitation (e.g., facilitating skills to produce a desired voice in a healthy way for professional or performance voice needs) and voice rehabilitation (e.g., restoring phonation abilities secondary to vocal fold paralysis). The SLP who provides voice therapy is often involved in both habilitation and rehabilitation simultaneously. That is, while we attempt to restore vocal function secondary to disorder or disease, we must recognize that, depending on factors such as the severity of the disorder, we may never be able to restore a typical or “normal” voice. Instead, we focus on the patient achieving their maximum potential to produce the best voice possible. SLPs who practice this subspecialty (voice impairments) are sometimes referred to as vocologists (“vocology” refers to the science and practice of voice habilitation) or voice therapists. A smaller subset of voice therapists also have a background in singing voice pedagogy and are able to work with singers on not only voice production and vocal health but also singing technique; these clinicians are referred to as singing voice specialists.


The profession of SLP has provided services to patients with vocal impairments since it was first officially organized. Over time, our knowledge of vocal anatomy and physiology, the interaction of voice production and the psychological state of the patient, and the effectiveness of specific treatments has advanced our understanding of how the larynx responds to various habilitation and rehabilitation processes. Today voice therapists base treatment selection on (1) the patient’s needs and goals, (2) the underlying impaired physiology, (3) the research evidence that supports a specific treatment, and (4) the competency (knowledge and skill) of the clinician. Selected treatments then rehabilitate or habilitate vocal function by targeting the underlying physiological impairments and/or improving physiological function to advance the patient’s ability to vocalize or communicate via phonation.


In this chapter, we will use rehabilitation terminology with specific meanings that you should commit to memory to best understand the material. The words “rehabilitation,” “treatment,” and “therapy” have been used in different contexts by different authorities. For the purposes of this book, we will define these terms as follows:




  • Voice habilitation/rehabilitation—the process of enhancing or restoring vocal abilities to a desirable (habilitation) or previous (rehabilitation) level of function. This process is conducted via patient and clinician collaboration (including, where necessary, affiliated health professionals) and typically includes pedagogy (education), counseling, and the administration of one or more specific voice treatments through extrinsic or intrinsic delivery methods. We will use the word “rehabilitation” throughout the remainder of this chapter to refer to both habilitation and rehabilitation processes.



  • Voice therapy—synonymous with voice rehabilitation.



  • Voice treatment—a specific method (technique) of care provided to a patient, focused on one or more of the following voice production domains: musculoskeletal, respiratory, vocal function, auditory, or somatosensory. Alternatively, a voice treatment can be administered indirectly in the context of pedagogy, counseling, or vocal hygiene treatments. Voice treatments, when used to achieve clinical goals, will include a specified framework, duration, frequency, and expected level of outcome.



  • Voice tool—synonymous with voice treatment.



  • Voice treatment program—an organized group of treatments, typically structured in sequence with each other, and administered using one or more delivery methods. Traditionally, these have been referred to as “eclectic voice therapy.”



  • Voice disorder—The presence of a deviation in voice quality, pitch, loudness, flexibility, and/or stamina from expectations related to factors such as age, sex, body type, speaking community, communication needs, and/or performance needs.



  • Vocal impairment—synonymous with voice disorder.


When we administer a treatment as part of the rehabilitation process, we do so with the goal of alleviating an impairment to reduce or eliminate a handicap or disability. The selection of specific treatments is dependent on the diagnosis and the associated etiology. This means that treatment decisions should be made only after an accurate diagnosis has been established and a firm hypothesis regarding the underlying physiological impairment has been formed. The rationale for this is clear: certain voice treatments are inappropriate, and can be contraindicated, for certain diagnoses and etiologies. For example, you would not use circumlaryngeal massage as the primary treatment for a case of laryngeal cancer or choose treatments that would encourage increased muscular effort for cases of phonotrauma such as vocal nodules.


8.3 Voice Treatment Orientations


Through the years, a number of philosophies and principles have provided a basis for the goals of voice therapy as a whole and for the rationale underlying voice therapy techniques. The following sections provide several examples. 1


8.3.1 Symptomatic Voice Therapy


This treatment orientation focuses on targeting the functional misuse(s) or abuse(s) of the voice that are responsible for perceived atypical vocal characteristics such as breathiness or atypical low pitch. Once identified, misuses (behaviors that result in inefficient/ineffective technical use of the voice) and/or abuses (inefficient/ineffective forms of vocalization that have a greater tendency to damage the vocal fold mucosa via phonotrauma) are eliminated or reduced through various voice therapy–facilitating techniques. A voice therapy–facilitating technique is a technique which, when used by a patient, enables him/her to easily produce an improved voice. The facilitating technique and the resulting improvement in voice characteristic(s) become the symptomatic focus for voice therapy. 2 Many of these facilitating techniques attempt to elicit improved vocalization by eliciting nonlinguistic or nonpurposeful forms of phonation. As an example, vegetative vocalization (cough, throat clear, and/or humming that is shaped into connected speech) has been used effectively as a treatment for patients with mutational falsetto and has been implemented by these authors to elicit improved voice function in severely hyperfunctional cases of aphonia. 3 In summary, the main focus of symptomatic therapy is the direct modification of the vocal “symptoms” (i.e., atypical voice characteristics) to minimize phonotrauma and inefficient vocal technique through the application of facilitative techniques.


8.3.2 Hygienic Voice Therapy


This treatment orientation focuses on improving the health of the vocal fold cover and surrounding laryngeal tissues by modifying or improving vocal hygiene. Some examples of behaviors representative of poor vocal hygiene include excessively loud voice use (e.g., shouting, talking loudly over noise, and vocal noises during play), excessively loud coughing and throat clearing, and poor hydration. Inappropriate voice use may also be a factor in poor vocal hygiene. When the inappropriate behaviors are identified, treatments and recommendations are provided that will modify or eliminate them and result in improved laryngeal health and environment. Once these behaviors are modified, voice production has the opportunity to improve or return to normal.


8.3.3 Etiologic Voice Therapy


This type of treatment philosophy is based on the view that the primary focus of voice therapy is on the identification and subsequent modification and/or elimination of the underlying physiological cause(s) (i.e., the etiology) of the presenting voice disorder. This orientation is aligned with the biomedical model, which assumes that physiological impairment is related to a finite number of etiologies. When these causes are removed or reduced, the disability resulting from the physiological impairment will be modified or eliminated. 4 Furthermore, when the initial cause is no longer present but changes in function persist, those factors responsible for maintaining the impairment can be identified and subsequently eliminated or modified. Once all etiologic factors are treated, the expectation is that disordered voice characteristics will improve.


8.3.4 Psychogenic Voice Therapy


This type of therapy orientation views emotional and psychosocial disturbances as key factors in the initiation and maintenance of the presenting voice disorder. 5 The relationship(s) between factors such as stress, anxiety, and the presenting voice characteristics are key elements that are brought to the attention of the patient during treatment. The expectation is that once identification of and reduction/elimination of these factors is achieved, voice production will improve.


8.3.5 Physiologic Voice Therapy


Voice treatment programs targeting more than one direct physiological domain have been referred to as physiological voice treatment approaches. 1 Since phonation is produced when respiratory pressures and airflow interact with laryngeal structures capable of vibration, this treatment approach focuses on modifying and improving the balance between respiratory and laryngeal systems. In particular, the primary focus is on (1) directly modifying and improving the balance of laryngeal muscle effort to the supportive airflow and (2) on effectively resonating and focusing the phonatory vibration for improved vocal quality. The physiologic voice therapy approach attempts to directly modify ineffective or inefficient respiratory/laryngeal physiologic activity through direct therapy techniques.


8.3.6 Eclectic Voice Therapy


This treatment orientation incorporates various views and techniques from the aforementioned philosophies into a comprehensive approach to voice treatment. 1, 6 As an example, the therapist may have key goals of (1) improving vocal hygiene (hygienic voice therapy); (2) identifying and relieving causes of phonotrauma and inefficient vocal technique (etiologic voice therapy); (3) emphasizing relaxed, easy onset phonation via a yawn/sigh technique (symptomatic voice therapy); and (4) improving vocal efficiency and effectiveness via the use of vocal function exercises (physiologic voice therapy). This orientation allows for a flexible selection of voice treatment approaches customized to the individual patient.


8.3.7 Dysphonic Physiology Reversal


Our orientation to effective voice therapy and the selection of voice therapy techniques is based on a very simple but, in our experience, highly effective tenet: dysphonic physiology reversal (DPR). Each word in the name of this approach to voice treatment has importance.


Dysphonic


Dysphonic refers to disrupted phonation. You will see that this philosophy or approach to treatment, as with all potentially successful forms of treatment, is inseparable from and dependent on accurate and detailed diagnosis. Diagnosis refers to the detection and description of the cause and features of a disease. 7 To prescribe an effective treatment, it is essential that we can accurately describe the cause and disruption affecting the patient’s ability to produce phonation/voice. This description may be based on individual history, perceptual, acoustic, aerodynamic, and laryngeal imaging observations, or ideally on a multidimensional combination of these forms of voice assessment.


Physiology


Once we have clearly defined the cause and described the dysphonia, we must develop a viable hypothesis regarding the underlying physiology of the impairment. Physiology is the study of how structures such as cells, muscles, organs, or systems interact to produce a particular outcome. In the case of dysphonia, we must develop a hypothesis regarding the underlying physiological impairment associated with the voice disorder (i.e., how structures are or are not working, or are or are not working together to produce the observed dysphonia) and potential behaviors/factors which may be maintaining disability associated with that impairment.


Reversal


With the development of a strong and logical hypothesis regarding the underlying physiological impairment associated with the observed dysphonia, the last step in this treatment philosophy/approach is very simple: reverse the underlying physiological disruption. Several therapy techniques may be applicable to treat and reverse the underlying physiological deficit, and we may choose a particular technique/method based on factors such as the severity of the presenting disorder, the duration of the dysphonia, patient health status, and clinician familiarity and expertise with certain treatment techniques. Regardless of the method selected, if (1) our hypothesis regarding the underlying physiological deficit is correct and (2) our selected treatment technique logically aids in reversal of the physiological deficit, some degree of success in voice therapy will be probable.


Here are a few examples to illustrate the DPR approach to voice therapy:




  1. Perception of dysphonia characterized by breathy voice → underlying physiological deficit is hypothesized and confirmed via observation to be hypoadduction → selection of treatment method(s) that reverse hypoadduction (i.e., increase vocal fold adduction). These options may include behavioral (e.g., exercises facilitating increased muscle activation to achieve glottal adduction) or medical options (e.g., vocal fold augmentation).



  2. Perception of dysphonia characterized by strained voice and fatigue → underlying physiological deficit is hypothesized and confirmed via observation to be hyperfunction and increased muscle tension → selection of treatment method(s) that reverse hyperfunction and decrease muscle tension. These options may include behavioral (e.g., laryngeal massage) or medical options (e.g., use of Botox in adductor spasmodic dysphonia cases).



  3. Perception of dysphonia characterized by low pitched, rough voice → underlying physiological deficit is hypothesized and confirmed via observation to be increased mass (swelling) and accompanying irregular vocal fold vibration → selection of treatment method(s) that reverse (i.e., decrease) vocal fold swelling. These options may include behavioral (e.g., vocal hygiene techniques, techniques to reduce phonotrauma, focus on reduction in vocal loudness) or medical options (e.g., use of prednisone to reduce swelling).
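The three examples above share one structure: a perceived voice characteristic, a hypothesized physiological deficit, a reversal goal, and candidate behavioral or medical options. The short Python sketch below records that structure purely as an illustration. The names `DprExample`, `DPR_EXAMPLES`, and `reversal_plan` are hypothetical and are not part of the DPR approach, and the table holds only the three examples listed here. It is not a clinical decision tool; the hypothesized deficit must still be confirmed via observation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DprExample:
    perception: str                      # what the listener perceives
    hypothesized_deficit: str            # physiology to confirm via observation
    reversal_goal: str                   # direction of change the treatment targets
    behavioral_options: Tuple[str, ...]  # examples only, taken from the text
    medical_options: Tuple[str, ...]     # examples only, taken from the text


# Hypothetical lookup table containing only the three examples from the text.
DPR_EXAMPLES = (
    DprExample(
        perception="breathy voice",
        hypothesized_deficit="hypoadduction",
        reversal_goal="increase vocal fold adduction",
        behavioral_options=("exercises facilitating glottal adduction",),
        medical_options=("vocal fold augmentation",),
    ),
    DprExample(
        perception="strained voice and fatigue",
        hypothesized_deficit="hyperfunction and increased muscle tension",
        reversal_goal="decrease hyperfunction and muscle tension",
        behavioral_options=("laryngeal massage",),
        medical_options=("Botox in adductor spasmodic dysphonia cases",),
    ),
    DprExample(
        perception="low pitched, rough voice",
        hypothesized_deficit="increased mass (swelling) with irregular vibration",
        reversal_goal="decrease vocal fold swelling",
        behavioral_options=(
            "vocal hygiene techniques",
            "techniques to reduce phonotrauma",
            "reduction in vocal loudness",
        ),
        medical_options=("prednisone to reduce swelling",),
    ),
)


def reversal_plan(perception: str) -> Optional[DprExample]:
    """Return the illustrative example for a perceived characteristic, if any."""
    for example in DPR_EXAMPLES:
        if example.perception == perception:
            return example
    return None
```

Calling `reversal_plan("breathy voice")` returns the first entry, while an unlisted perception returns `None`, which reflects that the table is a set of examples rather than an exhaustive mapping.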


Key underlying physiological deficits that we may attempt to “reverse” in DPR are excessive laryngeal motor activation in the form of vocal hyperfunction and/or reduced laryngeal motor activation in the form of vocal hypofunction. Vocal hyperfunction is exemplified by muscle tension dysphonia (MTD), where the impaired physiological balance usually includes excessive activation of intrinsic and extrinsic laryngeal, supralaryngeal, and respiratory muscles. Voice disorders associated with phonotraumatic behaviors would also fall into this category. Vocal hypofunction is exemplified by vocal fold paresis/paralysis, where reduced activation of intrinsic laryngeal muscles causes glottal incompetency. Vocal fold bowing and/or atrophy associated with presbylaryngis and the physiological changes to the respiratory, laryngeal, and articulatory systems subsequent to Parkinson’s disease would also fit into this category. It is important to remember that glottal insufficiency does not always relate to hypofunction, as some forms of MTD result in hyperfunctional activation of both adductor and abductor muscles (which are antagonistic to each other) resulting in an excessive posterior glottal gap.


It is critical to the DPR orientation that the clinician also assesses and identifies any psychological processes tied to the onset, development, and/or maintenance of the voice disorder. As noted in Chapter 2, psychological processes are intimately linked to and constantly influence phonation. A comprehensive understanding of the patient’s personality characteristics, their specific needs, and how they react to their environment will help the clinician understand the multidimensional nature of a voice disorder. This information can then be used to develop a comprehensive DPR treatment approach, which in some cases may include counseling and/or referral to appropriate mental health professionals.


8.4 An Organizational Framework for Voice Treatments


Depending on which orientation of voice therapy we subscribe to, we will next have to determine the specific voice treatment technique(s)/method(s) that we will use to achieve the overarching goal of the selected voice treatment approach. In the following sections of this chapter, we will organize voice treatments into a taxonomy structured around direct interventions and indirect interventions. We will also identify those treatments which are effective for specific underlying physiological disruptions common to many vocal impairments, including vocal hyperfunction and vocal hypofunction. We will emphasize that many treatments cross categorical boundaries and can be applied to both hyperfunctional and hypofunctional impairments. In addition, we will discuss how to mitigate barriers to treatment participation and compliance and how to take advantage of opportunities that facilitate treatment success.


Numerous models for classifying voice treatments have been proposed, although no single model has been universally accepted. Van Stan et al. published a comprehensive and detailed taxonomic model which can be used as a classification scheme for most voice treatments supported with published research evidence. In this model, specific voice treatments are referred to as “tools.” 8 As illustrated in ▶ Fig. 8.1 and ▶ Fig. 8.2, the taxonomy consists of two levels. At the first-order level, voice treatments are organized into two intervention approaches: indirect interventions and direct interventions. Indirect interventions include treatments that modify the cognitive, behavioral, psychological, and physical environment in which voicing occurs. These environmental characteristics are modified through two possible domains, pedagogy and counseling, which are defined as follows:




  • Pedagogy: rehabilitating or enhancing vocal function by providing knowledge and strategies to modify vocal health.



  • Counseling: rehabilitating or enhancing vocal function by identifying and modifying psychosocial factors that negatively impact vocal health.



    Fig. 8.1 Taxonomy of voice treatments. The first-order level organizes treatments into categories of direct and indirect intervention approaches, and identifies the delivery methods with which those treatments can be administered. Within the direct intervention category, voice treatments are organized into five different domains of modification: auditory, somatosensory, musculoskeletal, respiratory, and vocal function. Examples of these treatments are shown within these domains. Within the indirect intervention category, voice treatments are organized into two different domains of modification—pedagogy and counseling. The domains of modification in either intervention category can be delivered using extrinsic or intrinsic methods.


    (Used with permission from Van Stan JH, Roy N, Awan S, Stemple J, Hillman RE. A taxonomy of voice therapy. Am J Speech Lang Pathol 2015; 24(2): 101–125.)



    Fig. 8.2 The second-order level more thoroughly organizes treatments into the domains of modification for the direct intervention category. Each treatment is administered by focusing the patient’s attention to the domain in which it resides. Some treatments overlap domains of modification. This allows the clinician to refocus the patient’s attention to another domain while utilizing the same treatment.


    (Used with permission from Van Stan JH, Roy N, Awan S, Stemple J, Hillman RE. A taxonomy of voice therapy. Am J Speech Lang Pathol 2015; 24(2): 101–125.)


In contrast, direct interventions include treatments that modify vocal behavior through one or more physiological domains including musculoskeletal, respiratory, vocal function, auditory, and somatosensory modification. These five domains can be defined as follows:




  • Musculoskeletal modification: rehabilitating or enhancing motor execution by modifying muscular, skeletal, and/or connective tissue structures and systems.



  • Respiratory modification: rehabilitating or enhancing motor execution by modifying respiratory function.



  • Vocal function modification: rehabilitating or enhancing motor execution by modifying phonation.



  • Auditory modification: rehabilitating or enhancing auditory perception by modifying the auditory input.



  • Somatosensory modification: rehabilitating or enhancing somatosensory perception by modifying somatic and visual input.


The administration of any specific tool within a domain requires the application of a specific method of delivery. Extrinsic treatments are those administered and overseen solely by a clinician, whereas intrinsic treatments are those administered and overseen by the patient (under clinician guidance).
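The first-order structure of the taxonomy described above can be written down as a small data structure. The Python sketch below is only an illustration of the categories named in this section; the names `TAXONOMY`, `Delivery`, and `intervention_for` are hypothetical and do not come from the published taxonomy, and individual tools within each domain are deliberately left out.

```python
from enum import Enum


class Delivery(Enum):
    """The two delivery methods, with the definitions given in the text."""
    EXTRINSIC = "administered and overseen solely by a clinician"
    INTRINSIC = "administered and overseen by the patient under clinician guidance"


# Intervention approach -> domains of modification, as listed in this section.
TAXONOMY = {
    "indirect": ("pedagogy", "counseling"),
    "direct": (
        "musculoskeletal",
        "respiratory",
        "vocal function",
        "auditory",
        "somatosensory",
    ),
}


def intervention_for(domain: str) -> str:
    """Return the intervention approach ('direct' or 'indirect') for a domain."""
    for approach, domains in TAXONOMY.items():
        if domain in domains:
            return approach
    raise ValueError(f"unknown domain: {domain!r}")
```

For example, `intervention_for("counseling")` gives `"indirect"`, and any domain in either category can be paired with either member of `Delivery`, mirroring the statement that both intervention categories can be delivered extrinsically or intrinsically.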


8.5 Indirect Interventions: Pedagogy, Vocal Hygiene, and Counseling


8.5.1 Pedagogy


Vocal pedagogy entails imparting knowledge to the patient by explaining the anatomical and physiological processes of phonation in terms they can understand, and teaching them behaviors which will promote vocal health over time. Pedagogical intervention is a characteristic of most comprehensive voice treatment programs. One important reason for the application of pedagogy and counseling during the initial stages of voice rehabilitation is to shape the patient’s perspective regarding acceptance of and adherence to treatment. Both knowledge and perspective can impact a patient’s acceptance of treatment, which in turn is a major factor influencing his/her adherence to a treatment program. 9 Patients who do not understand why you are asking them to perform a behavior or how that behavior relates to vocal improvement are likely to lose motivation, and those who are less motivated are less likely to comply with treatment recommendations.


Pedagogy includes knowledge enhancement (education) and implementation of vocal hygiene programs. Knowledge enhancement is designed to increase the patient’s understanding of how they create sound through phonation, and how behaviors or conditions can negatively influence that process. When explaining these concepts to a patient, a recommended core set of topics includes the following (described in language familiar to the patient):




  • Basic structure of the larynx.



  • Relationship between respiratory, laryngeal, and articulatory systems.



  • Process of phonation in healthy, nondysphonic production.



  • Effect of phonotrauma on vocal fold tissue and voice quality.



  • Effect of hydration and irritants (including reflux) on vocal fold tissue and voice quality.



  • Effect of lesions or paralysis (if applicable to the patient) on phonation and voice quality.



  • The role of treatment in rehabilitating vocal impairments.


Visual aids are useful when educating the patient and are readily available from images in this book or via download from the Internet. Additionally, the laryngeal examination video (endoscopic and/or stroboscopic) can itself be a powerful tool for learning. 1


8.5.2 Vocal Hygiene


The purpose of a vocal hygiene program is to establish knowledge and behaviors which will expose the larynx to an environment where internal and external irritants are minimized. Although vocal hygiene alone is often not sufficient for rehabilitating most vocal impairments, this form of treatment can help reinforce the clinical effects of other direct and indirect interventions. Interprofessional collaboration with the otolaryngologist is important when initiating vocal hygiene treatment, as certain recommendations (e.g., hydration through increased fluid intake) need to be made in the context of the patient’s larger medical management plan to ensure that recommendations/treatments do not conflict. ▶ Table 8.1 illustrates an example of a vocal hygiene program which includes four domains. This can be used as a guide for treatment, and a version of ▶ Table 8.1 can be provided to the patient as additional learning material. The four domains described in this table include the following:




  • Hydration: The influence of internal and external hydration on phonation physiology is explained during the process of education. In vocal hygiene, specific recommendations are provided to the patient that will establish behaviors to promote sufficient hydration.



  • Respiratory health: The heat and chemicals in cigarette smoke can both irritate and damage vocal fold tissue. Limiting exposure to smoke is therefore an important domain of vocal hygiene. Smokers are encouraged to stop or reduce the frequency of smoking in order to minimize vocal fold exposure. Patients who are frequently exposed to second-hand smoke are encouraged to change lifestyle behaviors to limit that exposure.



  • Vocal habits: Phonotraumatic behaviors impacting the tissue of the vocal folds are explained during education. Here, the clinician provides specific recommendations to reduce or eliminate phonotrauma, individualized to the patient based on their reported vocal habits.



  • Vocal irritants (including reflux and postnasal drainage precautions): Laryngopharyngeal reflux (LPR) and postnasal drainage are associated organic etiologies in many patients with voice disorders. If a patient has a history of LPR or postnasal drainage, it is crucial that steps are taken to minimize laryngeal exposure, as indicated in ▶ Table 8.1. Although not listed in the table, an emphasis on adhering to prescribed medication schedules should be provided, as failure to consistently take reflux medication or use prescribed nasal sprays impacts their clinical effectiveness.

    Table 8.1 Vocal hygiene guidelines

    Domain: Hydration
    Recommendations:
      • 64 oz of water per day
      • Limit caffeine intake
      • Limit alcohol
      • Offset drying medications with extra water
      • Use steam inhalation if beneficial
      • Avoid menthol products

    Domain: Respiratory health
    Recommendations:
      • Avoid smoking
      • Limit exposure to second-hand smoke

    Domain: Vocal habits
    Recommendations:
      • Instead of clearing throat, dry swallow or wet swallow; use hard swallow if those do not work alone
      • Instead of shouting, use noise device to call attention
      • Limit vocal loudness when speaking in the presence of loud background noise
      • Do not sing out of your range
      • Try to limit overall amount of talking time
      • Take frequent voice rests

    Domain: Vocal irritants (including reflux and postnasal drainage)
    Recommendations:
      • Know your reflux and allergy triggers and use medications as prescribed
      • Limit intake of spicy, acidic, and fatty foods
      • Try not to eat or drink within 3–4 h of going to bed
      • Elevate head of bed


8.5.3 Counseling


The process of counseling in the context of voice rehabilitation involves rehabilitating or enhancing vocal function by identifying and modifying psychosocial factors that negatively impact vocal health. Some information related to these factors will be obtained during the diagnostic interview (case history). During the initial phase of treatment, counseling can help the patient further understand how psychosocial factors may contribute to their voice impairment. This process is best accomplished with an open and honest discussion regarding how the patient’s personality, life situations, and reactions to stress manifest in laryngeal signs and symptoms. Many patients do not make a connection between stressful life events or their personality characteristics and the onset or maintenance of voice problems. Counseling brings these relationships to light and can help the person better understand how the physiological problem they are experiencing is connected to their own cognitive processing. Referral for professional psychological counseling that goes beyond explaining the relationship between voice and factors such as stress and anxiety should be a necessary consideration.


8.6 Direct Interventions: Facilitative Approaches for Vocal Hyperfunction


Direct interventions include treatments that modify vocal behavior through one or more physiological domains (including musculoskeletal, respiratory, vocal function, auditory, and somatosensory modification). These behavioral treatments allow the clinician and patient to directly target, reverse, and thus rehabilitate underlying impairments in which the disruption occurs in one or more of those physiological domains. Some of these treatments target a singular component of physiology rather than multiple sources of impaired physiology. Treatments which target singular components of voice have been referred to as facilitative or symptomatic approaches. 2,​ 10 Many of these treatments have a long history of use in SLP, and several have been precisely described in other publications. 2 We will summarize a sample of facilitative approaches which, in our own clinical experiences, have proven effective for selected patients with voice disorders associated with both hyperfunction and hypofunction.


8.6.1 Resonant Focus


As a facilitative technique, resonant focus can be used dynamically in conjunction with other techniques to “unload” laryngeal muscular hyperfunction. This is accomplished by establishing an awareness of oronasal vibratory sensations during vocalization. These sensations arise when voice energy resonated in the facial region is associated with vocal folds that are not excessively compressed together. 11 Resonant or frontal focus is also a fundamental target of a frontal resonance treatment (FRT) approach (see subsequent section).


A useful tool for establishing resonant focus is humming. This gesture facilitates vibrotactile sensory awareness due to the sound energy being concentrated in the nasal cavity and along the palate. This effectively moves the placement of sound energy perception from the larynx to the facial region. Patients with severe hyperfunctional dysphonia may need to produce a hum in different postures to achieve an appropriate degree of sensory awareness. This might include gently tucking the chin toward the chest or leaning over with the head facing the floor and the forearms resting on the knees while producing repeated hums. Humming is typically best elicited by modeling the behavior for the patient, prompting them to also take in a comfortable, supportive breath each time.


To monitor performance accuracy, the clinician can assess (1) his/her own perception of the patient’s voice quality, which should improve when a voice with resonant focus is produced correctly; (2) the patient’s perception of the sound quality; and (3) the patient’s perception of vocal effort. Accurate productions of resonant voice when humming should result in improved voice quality perceived by the clinician and patient, in addition to a sense of minimal effort by the patient. In the authors’ experience, some patients may require ear training to establish a perceptual anchor for what improved voice quality sounds like.


Once improvements in voice quality are established with humming, the goal is to shape the resonant production into speech. This can be accomplished using a traditional hierarchy of nasal + vowel syllables (e.g., hum, followed by “ma-ma-ma”), single-syllable words starting with nasals (e.g., hum, followed by “mom,” “mum,” and “non”), single-syllable words with and without nasals (e.g., numbers: hum followed by “one,” then “two”), multisyllabic words, phrases, sentences, and conversation.


8.6.2 Tongue Advancement


The goal for patients with vocal hyperfunction is to change the maladaptive motor pattern into one that is more efficient and effective for communication. However, some patients have difficulty consciously monitoring their own internal sensations or their own voice quality, or they experience such severe degrees of hyperfunctional muscle contraction that achievement of initial therapeutic goals is difficult. In these cases, tongue advancement (i.e., “protrusion”) may be attempted as a facilitating technique. Because the tongue is attached directly to the hyoid bone, which is itself attached to the larynx, forward movement of the tongue dorsum applies traction to the larynx, altering the degree of muscular tension in the laryngeal muscles. This muscular response often has the effect of changing the patient’s hyperfunctional pattern so that vocal fold vibration becomes more periodic and voice quality improves. Tongue advancement also opens the pharynx, which may lead to less laryngeal compression. In contrast, tongue retraction (as in back focus) compresses the pharynx and larynx and results in increased muscular tension in the laryngeal region.


Tongue advancement is best accomplished by utilizing a high front vowel, such as /i/, 2 as follows:




  1. The clinician prompts the patient to slightly open the mouth and stick out the tongue so that it is past the lips (but not as far as it can be protruded; the posture should be comfortable for the patient).



  2. They are then asked to take an easy breath in and sustain the /i/ vowel for a few seconds, with attention paid to voice quality and level of effort. This is repeated multiple times, and different degrees of protrusion and mouth opening can be explored in attempts to achieve a target voice quality.



  3. When voice quality improves and the patient indicates that voice production is not effortful, they are asked to retract the tongue to the level of the lips and produce /mi-mi-mi/ with the tongue staying between the lips on each repetition.



  4. The goals continue to remain as a perception of improved voice quality and reduced level of perceived effort. With continued accuracy, the patient is asked to move the tongue inside the mouth to the natural /i/ position while repeating the syllable. With continued mastery, the clinician can introduce the traditional hierarchy of stimuli with the goal of progressing to conversational speech while maintaining the improved voice quality and ease of effort.


8.6.3 Vegetative and/or Automatic Voicing


Vegetative voicing includes non-speech vocalizations such as throat clearing, grunting, phonating on inhalation, coughing, and the common form of automatic voicing resulting in the production of “uh-huh” as a conversational response. These types of voicing can be useful starting points in the treatment of patients with severe MTD in the form of aphonia (sometimes referred to as non-adducted hyperfunction and traditionally as a form of psychogenic aphonia). In these cases, an initial goal of treatment is to achieve vocal fold adduction that allows for phonation of any type. This can be accomplished with numerous techniques, including the use of vegetative/automatic voicing tasks that provide the patient with a physical and auditory cue that vocal fold adduction and voicing are possible.


Attempts to elicit voicing often require the clinician to move quickly from one technique to another. The clinician will elicit productions by choosing an initial vegetative or automatic gesture and modeling it for the patient. If the patient fails to produce voice after multiple attempts, the clinician models a different gesture and asks the patient to repeat it. Experimentation is often necessary in the form of prompting the patient to produce the gestures at different pitches and loudness levels. It can also be useful to cycle through the techniques, coming back to the first or second vegetative/automatic voice type after attempting the other forms.


Once achieved, it is crucial that the clinician brings the occurrence of phonation to the conscious awareness of the patient and has them repeat the gesture. With repeated success on one type of vocalization, the clinician should then prompt the patient to vary the production by changing pitch, changing loudness, and changing duration. With continued success, the vegetative/automatic vocalization should be shaped into a vowel (e.g., vocalization + vowel), followed by a traditional stimulus hierarchy. Implementation of vegetative/automatic voicing, especially in patients with long-standing aphonia, requires persistence and perseverance from the clinician. Typically, maladaptive hyperfunctional patterns of phonation do not improve in seconds or a few minutes. However, for the patient without significant psychological comorbidity, these techniques can be very successful in eliciting an improved or even normal voice quality in the first treatment session.


8.6.4 Chewing


The chewing method was first described by Emil Froeschels as a treatment for vocal hyperfunction. 12 The theoretical basis for the application of this approach centers on the shared neuromuscular pathways employed during the act of speaking and mastication. Froeschels reasoned that voiced speech sounds were a natural evolution from noises produced during chewing while eating food. 12 He generated a clinical hypothesis which suggested that if the neuromuscular pathways producing these chewing noises were the same as those for laryngeal function during speech, one behavior could be used to influence the other. This was supported by the observation that chewing movements and chewing noises remain unaffected in many patients with hyperfunctional voice disorders, even though similar muscles are being used for voice production during speech. 13 Based on this rationale, he developed a treatment approach which utilizes exaggerated chewing movements of the tongue, facial, and mandibular muscles in concert with voice production to counteract the hyperfunctional contraction of laryngeal muscles associated with voice impairment and dysphonia. The steps in utilizing the chewing method are as follows:




  1. The clinician introduces the concept of chewing to the patient as a method of removing excessive laryngeal tension during speaking. It is emphasized that vocal noises can be produced while chewing, which is what the clinician and patient will explore together. The clinician should emphasize that once an improved voice quality is achieved, chewing movements will be reduced.



  2. The clinician prompts the patient to produce exaggerated movements of the jaw as if chewing a large bolus. In prompting the patient, the clinician first provides a model and asks the patient to replicate. Key concepts of instruction include the following:




    1. The jaw position for a majority of the movements should be lowered so that the mouth is open. Closing movements of the mouth will correspond to adduction of the lips during chewing motions. Jaw movements are vigorous and should include up-down and side-to-side actions. The patient can be given a cue to chew as if they had a large piece of meat or bread in their mouth.



    2. Lip movements are exaggerated by wide opening, closing, pursing, and retraction (e.g., smiling) gestures.



    3. Tongue movement includes large excursions throughout the oral cavity in all directions.



  3. As the patient replicates the exaggerated chewing, the clinician introduces visual feedback utilizing a mirror. The patient is asked to continue exploring random chewing movements while being reminded to move the lips and tongue extensively. The clinician continues to monitor the patient to ensure that the range of movements is substantial, especially for the tongue.



  4. The patient is asked to add voicing during the act of chewing. The resulting sound is monitored by the clinician for two characteristics:




    1. The resulting speech sounds should perceptually resemble variegated babbling and include a range of phonemes due to the numerous positions of the articulators during chewing actions. If a patient produces a monotonous “yam yam” it is a cue that the tongue is not moving throughout a wide range of motion.



    2. If laryngeal tension is in fact decreased, the quality of the voiced sounds should improve. This is noted by the clinician and brought to the attention of the patient so that they are acutely aware of the voice change.



  5. Once an improved voice quality is established consistently, the patient is then asked to continue the chewing actions while chanting the numbers one through ten, as if counting slowly. It is not important that articulation of these numbers is precise—the focus is on voice quality rather than articulatory precision.



  6. Once improved voice quality during counting is consistent, the patient is prompted to chant phrases while chewing in the same manner. Initial phrase stimuli can be structured with many nasal consonants to promote a frontal resonance but should transition to stimuli with a wide variety of sounds.



  7. Once improved voice quality during phrase production is consistent, the patient is then prompted to decrease the large chewing movements into a more natural oral posture during speech, with an emphasis on flexibility of the jaw, tongue, and lips. The phrases are repeated in this manner while chanting, and perceptual focus remains on the maintenance of an improved voice quality.



  8. Once consistency is established, the patient is asked to transition the phrases from chanting to a more natural speech prosody. Finally, the patient is asked to read passages and then engage in conversation while maintaining the improved voice quality.


8.7 Direct Interventions: Facilitative Approaches for Vocal Hypofunction


8.7.1 Glottal Adduction Exercises


Exercises requiring maximum concentric contraction of intrinsic vocal fold muscles have been utilized in combination with other techniques to rehabilitate adductor muscles when physiological impairments cause vocal hypofunction. The theory behind these exercises is that vocal fold adduction performed with maximum residual motor unit recruitment during muscular contraction will facilitate neuromotor adaptation in the target muscles and neuromotor pathways, particularly when completed in sets of numerous repetitions and with high frequency across a day or week. These exercises will be most beneficial when combined with phonation attempts in a hierarchy of voicing difficulty (moving from sustained vowel productions to syllables and toward actual speech contexts). Two examples of glottal adduction exercises include push/pull and glottal attack exercises.


Push/pull exercises can be performed in diverse ways, such as in the following methodology:




  1. The clinician prompts the patient to press or pull against a surface, as if holding their breath as tightly as possible. 14 This exercise promotes maximal adduction of the vocal folds by simulating the glottal closure reflex.



  2. The patient is instructed to hold the contraction for 3 or 4 seconds, and then release. This is repeated multiple times to complete a set, with multiple sets completed numerous times each day.



  3. The technique can be performed while seated, with the patient pressing both hands against the surface of the chair or a desk in front of them, or while standing, with the patient pressing against a wall or table or pressing both hands against one another. These exercises may also be carried out using an isometric contraction in which the patient links the fingers of each hand together and pulls to initiate glottal closure.



  4. The exercises can be combined with phonation by asking the patient to perform the push/pull maneuver while attempting to phonate a sound. It is essential that the clinician help the patient to time the contraction to occur along with phonation attempts. With this technique, the goal will be to shape away the push/pull gesture while maintaining the improved glottal closure and sound quality.


While push/pull exercises are not typically used as the only technique for glottal insufficiency, their use in combination with other techniques has been supported by a number of published investigations. 15,​ 16,​ 17,​ 18 It should also be remembered that there may be patients who, due to physical health, may not be candidates for this treatment, such as those with high blood pressure. 19


Glottal attack exercises attempt to achieve an effect similar to that of the push/pull exercises, except that the patient holds the contraction for only a few seconds and then releases it into a sustained vocalized vowel. This can be prompted with instructions such as: “Take a deep breath, hold your breath by squeezing tightly with your vocal folds, and then say ‘ahhhh’ with a loud volume.” With mastery, the technique can be modified to initiate phonation immediately after maximum glottal adduction without holding the breath. Glottal attack exercises have been utilized for glottal incompetence that affects both voice production and swallowing. 10, 17, 20 For those who manifest slower progress, another exercise, the “pseudo-supraglottic swallow,” can be added to the sets. This exercise requires the patient to take a breath and hold it as in the initial part of the supraglottic swallow, but instead of swallowing the patient is asked to cough forcefully after a few seconds of breath hold. This is repeated, along with the other exercises, in multiple repetitions over 5 minutes, and performed numerous times each day. A similar exercise to the pseudo-supraglottic swallow is the “half-swallow boom,” which requires the patient to take a breath, swallow, and attempt to produce a sharp “boom!” in the middle of the swallow while the vocal folds are maximally adducted. Evidence for the effectiveness of this exercise when combined with other treatments has been reported in multiple studies. 21, 22


8.7.2 Digital Manipulation


Applying manual pressure to the thyroid cartilage on the affected side in unilateral vocal fold paralysis is another technique that has been used to facilitate and improve glottal closure. 15, 17, 22 One approach to digital manipulation is as follows:




  1. The fingers of a free hand (the other hand can be used to support the patient’s head) are placed on the thyroid lamina of the weak/paretic side, near the middle of the vertical dimension of the thyroid. The thumb can be placed on the opposite thyroid lamina to fixate the unaffected side so that it can serve as a base against which the opposite side approximates.



  2. Using the fingers, the clinician applies inward (medial) pressure to the thyroid, pushing the affected side toward the midline. The patient is asked to vocalize by sustaining a vowel as the clinician applies the pressure.



  3. As the patient produces a vowel, the clinician and patient should listen for a change in voice quality. Varying degrees of pressure can be applied to elicit improved voice, monitoring and querying the patient about his or her level of comfort.



  4. When a positive voice change is elicited, the clinician brings this to the conscious awareness of the patient. When both clinician and patient can perceive the change and it can be consistently elicited on multiple trials (e.g., 8 to 10 consecutive trials), the clinician then begins to eliminate the manipulation by releasing pressure during vocalization on successive attempts.



  5. The patient is asked to maintain the improved voice as pressure is released. This requires greater muscular effort on the part of the patient, which is the motor behavior this exercise is designed to elicit. With mastery, the clinician prompts the patient to produce the vocalization without digital manipulation, and then voice production is moved through a hierarchy of stimulus contexts terminating in conversational speech with the improved voice.


8.8 Direct Interventions: Comprehensive Voice Treatment Programs for Vocal Hyperfunction


Voice treatment programs are organized collections of different treatments representing varied direct and indirect domains. Most voice treatment programs are delivered with some extrinsic structure (clinician led), but may also include intrinsic delivery structures as part of the program. Seven different voice treatment programs were recently described within the context of the voice treatment taxonomy. 8 Voice treatment programs targeting more than one direct physiological domain have been referred to as physiological voice treatment approaches, and when incorporating both direct and indirect approaches have been referred to as eclectic voice treatment. 1,​ 23


We will focus our subsequent discussion of voice rehabilitation on the characteristics of voice treatment programs supported by at least a minimal level of published research evidence (i.e., evidence for the program’s effectiveness exists in the peer-reviewed literature).


8.8.1 Stretch-and-Flow


Stretch-and-flow (SnF) is a voice treatment program designed to rehabilitate vocal function in the presence of a physiological imbalance within the vocal subsystems, usually when the imbalance is related to functional hyperkinetic motor execution. 24 It can be adapted for patients with vocal hypofunction, especially in speakers who have developed compensatory maladaptive hyperfunctional vocalization patterns to overcome glottal insufficiency. SnF has also been referred to as flow phonation, and was first described in the early 1980s as a program for children with functional voice disorders. 25, 26 Evidence for the clinical effectiveness of SnF has been demonstrated in populations with voice impairments including nodules, polyps, and MTD, and SnF has recently been investigated within the context of a randomized controlled trial. 24, 27


The program of SnF targets multiple direct and indirect domains. Direct domains include musculoskeletal (orofacial modification and postural alignment), respiratory (coordination and support), vocal function (glottal contact and pitch modification), auditory (sensorineural), and somatosensory (discrimination). The SnF program is typically administered along with an indirect approach focusing on the domain of pedagogy (knowledge enhancement and vocal hygiene). The delivery method is primarily extrinsic and characterized by a hierarchical structure, which is illustrated in ▶ Fig. 8.3.



Fig. 8.3 The hierarchical structure of stretch-and-flow (SnF) voice treatment program.


A fundamental goal of SnF is to facilitate volitional neuromotor control of the vocal subsystems while maintaining a perception of minimal effort. To accomplish this, the patient moves through a hierarchy of progressively challenging vocal tasks. SnF structure initially focuses on the respiratory domain with voiceless airflow control techniques. As the patient demonstrates mastery, airflow is combined with the vocal function domain of glottal contact by progressively increasing the degree of vocal fold adduction onto the stream of airflow, first with a breathy voice and then with a more engaged glottal contact pattern. The program culminates with mastery of coordinated neuromuscular control characterized by phonation in connected speech produced with equilibrium between respiration, phonation, and resonance.


The SnF hierarchy presents a progressive structure along the continua of muscle activity levels (x-axis of ▶ Fig. 8.3) and motor complexity stages (y-axis of ▶ Fig. 8.3). Each muscle activity level of the hierarchy is referred to as a skill, and the patient is required to progress horizontally through each skill level while working vertically through the hierarchy of motor complexity stages. The motor complexity hierarchy begins with unvoiced airflow and culminates with connected speech. When a patient masters all motor complexity stages within a skill level, they move to the next skill in the horizontal muscle activity hierarchy. The muscle activity skill levels are as follows (as defined in the study of Watts et al 24,​ 27):




  • Level 1: Flow. The goal for this skill level is for the speaker to gain control over airflow. The speaker’s task is to produce a steady flow of unvoiced air without effort through rounded lips, as in a slow, comfortable exhalation. A piece of soft tissue paper held at the front of the mouth is used for visual feedback, with successful productions resulting in forward movement of the tissue along with patient perceptions of relaxed exhalation. The patient’s hand placed in front of the mouth can be used in lieu of the tissue, as a form of tactile biofeedback (airflow emanating from the mouth stimulates sensory receptors on the hand). When mastery is established for voiceless airflow, the clinician may choose to progress to the next skill level or to another stage in level 1 (stage 1B on the y-axis of ▶ Fig. 8.3) by modeling and eliciting a breathy voiced vowel (e.g., /u/) with emphasis on the perception of effortless flow. The goal for stage 1B is a steady flow of air and minimally engaged vocal folds to produce breathy voice quality with minimal effort. For patients who may have difficulty comprehending the sequence of SnF, this preliminary stage can help build understanding of how airflow will be combined with voicing as the treatment progresses.



  • Level 2: Stretch-and-flow. The goal for this skill level is for the speaker to produce minimal-effort, voiceless airflow through the glottis along with slow (stretched out) movements of the articulators. Accurate productions should be perceived as effortless, whispered connected speech at a slow rate. Tissue can also be used for visual feedback. Stages in the vertical hierarchy (y-axis) begin with one-word stimuli and then progress to multiple-word stimuli as the patient masters each stage. Stimuli in stages C to E in this skill level and subsequent skill levels have included numbers (e.g., “one,” “one-two,” “one-two-three”), or words beginning with /h/ or /w/ + /u/ to facilitate airflow in the context of semiocclusion. 24, 25, 27



  • Level 3: Stretch and voiced flow. The goal for this skill level is for the speaker to produce voiced airflow through the glottis along with slow (stretched out) movements of the articulators, while using minimal effort. Accurate productions should be perceived as breathy, effortless voice quality in connected speech at a slow rate of speech. Tissue can also be used for visual feedback.



  • Level 4: Reduced stretch with increased flow. The goal for this skill level is for the speaker to produce voiced airflow through the glottis with a faster speech rate, while maintaining minimal effort. Accurate productions should be perceived as effortless voiced but breathy speech with a normal rate of speech.



  • Level 5: Reduced air flow (target voice). The goal for this skill level is for the speaker to produce a normal voice quality with an appropriate rate of speech, while maintaining minimal effort. Accurate productions should be perceived as appropriate conversational loudness with normal (nonbreathy) air flow, normal speech rate, and modal (natural for that person) pitch.
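The order in which a patient moves through the hierarchy can be sketched in a few lines of Python. This is an illustration only and not part of the published SnF protocol. The five skill levels are those listed above. The stage labels A through E are the ones mentioned in the text, and treating every skill level as having the same set of stages is a simplifying assumption (▶ Fig. 8.3 defines the actual structure). The function name `progression_order` is hypothetical.

```python
# Skill levels (x-axis of the hierarchy), as listed in the text.
SKILL_LEVELS = [
    "Level 1: Flow",
    "Level 2: Stretch-and-flow",
    "Level 3: Stretch and voiced flow",
    "Level 4: Reduced stretch with increased flow",
    "Level 5: Reduced air flow (target voice)",
]

# ASSUMPTION: stage labels A-E stand in for the motor complexity stages
# (y-axis); the real set of stages per level is defined by Fig. 8.3.
STAGES = ["A", "B", "C", "D", "E"]


def progression_order(levels, stages):
    """Return (level, stage) pairs in the order they would be worked through.

    All stages within a skill level come before the next skill level begins.
    """
    return [(level, stage) for level in levels for stage in stages]


if __name__ == "__main__":
    for level, stage in progression_order(SKILL_LEVELS, STAGES):
        print(f"{level} / stage {stage}")
```

In practice the clinician may skip or reorder steps (for example, stage 1B is described above as optional), so a fixed list like this is best read as a default path and not as a rule.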


The development of treatment goals for individual patients receiving SnF can be structured with percent correct benchmarks dependent on factors such as the patient’s response to diagnostic probes and their initial response to the treatment techniques. Published research has set mastery criteria at 90% accuracy at each stage within a skill level. 24 It is possible that initial criteria for a particular stage can be set lower, and the goal adjusted as the patient attains increasing degrees of mastery. For example, an initial goal of the program might be “Patient XX will produce level 1 flow productions with continuous voiceless exhalations (stage A) on 7 of 10 (70%) consecutive trials.” When the patient reaches that level of mastery, the criterion can be increased and the goal modified to 9 of 10 consecutive trials or 90% accuracy. When mastery at 90% is reached, the patient moves on to the next stage within that skill level.
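The percent-correct logic described above can be written out as a short Python sketch. It assumes that each trial is scored simply as correct or incorrect and that a block of 10 consecutive trials is judged against the current criterion. The function names (`accuracy`, `next_step`) and the returned wording are hypothetical and are offered only to make the arithmetic explicit.

```python
def accuracy(trials):
    """Fraction of correct productions in a block of trials (True = correct)."""
    if not trials:
        raise ValueError("at least one trial is required")
    return sum(trials) / len(trials)


def next_step(trials, current_criterion, final_criterion=0.90):
    """Suggest what to do after a block of trials.

    current_criterion: the goal in force for this block (e.g., 0.70).
    final_criterion:   the mastery level needed to leave the stage
                       (0.90 in the published research cited in the text).
    """
    if accuracy(trials) < current_criterion:
        return "continue practice at the current criterion"
    if current_criterion < final_criterion:
        return "raise the criterion"
    return "advance to the next stage"


if __name__ == "__main__":
    block = [True] * 7 + [False] * 3           # 7 of 10 correct
    print(accuracy(block))                     # fraction correct for the block
    print(next_step(block, current_criterion=0.70))
```

With a block of 7 correct trials out of 10 and an initial criterion of 70%, the sketch suggests raising the criterion. With 9 of 10 correct at the 90% criterion, it suggests advancing to the next stage within the skill level.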


8.8.2 Frontal Resonance Treatment (FRT)


The concept of directing conscious awareness to resonant sensations in the oral and nasal cavities to facilitate vocal flexibility and target voice quality has been used in singing pedagogy for centuries. Many treatment concepts utilized in vocal rehabilitation emerged from early singing voice instructors, and it is not surprising that techniques used to improve vocal control such as resonant focus are shared among SLPs and teachers of singing. Techniques which attempt to exchange muscular tension at the larynx for increased oral cavity movement, airflow, and vibratory sensations have existed in the profession of SLP since at least the early 1940s as a means of reducing or eliminating vocal hyperfunction. 28 Resonant focus is understood as the conscious perception of vibrotactile sensations in the oral and nasal cavities (e.g., frontal regions of the face) during sound production. These sensations are related to actual tissue and bone vibrations in the facial region which are elicited by vocal tract shaping during resonant voice production. 29 Clinical theory suggests that transferring a patient’s focus of effort away from the larynx and toward the resonant cavities of the upper airway has the effect of reducing excessive laryngeal motor execution, or hyperfunction. 2 This theory has been supported by published evidence demonstrating that glottal adduction during resonant voice production allows for normal voice quality and function with minimal tissue contact, which is beneficial for patients with voice disorders related to hyperkinetic muscular control and/or phonotraumatic behaviors. 11


The framework of many physiologically based resonant voice programs is built on modifications of the work of Arthur Lessac, a renowned voice teacher who developed the “Kinesensic” approach. 30 One of the foundational components of his method is awareness of movement and energy throughout the body. The perception of oral vibratory sensations in the context of perceived easy or effortless phonation is a fundamental target of contemporary resonant voice therapy programs. 1 Two evidence-based resonant approaches used widely by SLPs are the Lessac-Madsen Resonant Voice Therapy (LMRVT) program developed by Katherine Verdolini and the associated program of Resonant Voice Therapy (RVT) developed by Joseph Stemple and colleagues. 10, 31, 32, 33 These treatments or other associated adaptations of resonant focus have been used successfully in patients with vocal hyperfunction and vocal hypofunction, especially when the glottal incompetence of hypofunction is accompanied by compensatory maladaptive strain and effort.


Measurable effects of Stemple’s RVT approach and adaptations of the program have been supported by numerous published investigations. 9, 32, 34, 35 The physiological domains and treatment characteristics of RVT have also been applied to the previously described taxonomy of voice treatment by Van Stan et al. 8 A video example illustrating administration of the RVT approach is provided in Video 8.1 and is further detailed by Roy et al. 32 The following methodology for FRT (related to but distinct from LMRVT and RVT) is derived from the authors’ clinical experiences, and is built upon decades of advancing knowledge of the physiological effects that resonant focus imparts upon the vocal subsystems. Authors and voice therapists who have advanced this knowledge include Emil Froeschels, Friedrich Brodnitz, Ed Stone, Thomas Cleveland, Daniel Boone, Katherine Verdolini, Joseph Stemple, and Kimberly Coker, among others. 36 Additionally, clinicians can use the process described below in a flexible and dynamic manner to fit the needs of the patient.


Fundamental Characteristics of FRT


The perceptual target of FRT is the conscious awareness of focused, oral vibratory sensations in the context of minimal effort. Although the perceptual focus of FRT is on resonance, the physiological requirements for accurate resonant productions require coordination of respiratory, laryngeal, and supralaryngeal muscles. FRT facilitates rehabilitation or enhancement of motor patterns and requires the implementation of motor learning strategies, of which copious repetition of target resonant voice production in varying speech contexts is crucial.




  • Stage 1: Establishing frontal resonance




    • Teaching the foundations of voice: The process of FRT begins with education. The clinician will introduce the patient to the anatomy and physiology of phonation in terms that the patient can understand. An emphasis on the connection between the three subsystems of voice should be provided. This emphasis will help the patient better understand how shaping the supraglottal vocal tract can influence laryngeal muscle function and subsequent phonation.



    • Exploring current function: The purpose of this step is to increase the patient’s conscious awareness of current vocal technique. The clinician begins by asking the patient to inhale slowly and deeply, and then exhale. Close attention is paid to diaphragmatic and rib cage expansion in relation to the upper torso tension. This process is repeated. Inefficient motor patterns for respiration can be identified and brought to the patient’s attention at this point, and subsequently corrected.


The patient is then asked to produce a vowel at comfortable pitch and loudness. The clinician guides the patient through the process of perceptually identifying (1) levels of effort, (2) anatomical regions of muscular tension, and (3) perceptual impressions of voice quality. The patient is also asked to describe the location at which they feel voice effort is focused—such as at the level of the chest, the throat, or higher in the oral cavity.




  • Locating vibrotactile energy: This step is designed to establish a kinesthetic sense of vibration in the facial region. Establishing a physical sensation of orofacial vibrations is key to this step. The clinician begins by leading the patient through different voice gestures to find one or more strategies which lead to orofacial vibratory awareness and improved voice quality. Successful attempts will be characterized by improvements of voice quality within the context of reduced effort (note: it should not be harder for the patient to produce these gestures; it should be easier). Prompts that can be attempted to establish frontal resonance, all of which are preceded by an easy inhalation to support voicing, include the following:




    • A gentle, sustained hum at a constant pitch lasting 3 to 5 seconds—the head should be in a neutral, relaxed position.



    • Gently humming with descending pitch, as in a sigh. 1



    • Gently humming while tucking the chin to the chest.



    • While seated, leaning over with hands on knees, head toward the floor, while gently humming.



    • Turning the head to the left or right while gently humming.



    • Rounding the lips (as if whistling) while gently humming.


It is important to realize that the perceptual responses to these attempts can vary from patient to patient. Some patients will report a strong sense of vibrotactile energy, while others may feel none. Some may report the sense of vibration within the nose, while others may report it at the lips, at the palate, or more generally within the oral cavity. The perception of vibrotactile energy anywhere in the oral and nasal regions is a positive predictor, and subsequent repetitions of the gesture(s) should prompt the patient to consciously focus on the feeling of energy at those regions. Mastery at this stage of the FRT protocol occurs when the patient can consistently produce repeated gestures with the following treatment targets: (1) a physical perception of vibrotactile energy in the oral or nasal region, (2) an auditory-perceptual improvement in sound quality, and (3) the perception of minimal effort when producing the voice gesture.


This stage of FRT can last as little as 10 minutes, or may require multiple treatment sessions with subsequent home practice to achieve mastery. Some patients will have more difficulty establishing the physical sense of vibrotactile energy and progress more slowly during the treatment process. Some modifications for facilitative gestures which might aid those patients can include the following 36:




  • Ask the patient to place a finger on the side of the nose while prompting them to monitor touch sense for nasal vibrations.



  • Ask the patient to gently hum through a straw held between the lips.



  • Ask the patient to produce a comfortable “ahhhh” while transitioning into a gentle hum by closing the lips.



  • Ask the patient to produce comfortable repetitions of “ma-ma-ma” or “na-na-na,” followed by prolongation of the /m/ or /n/ sound after a few repetitions.



  • Ask the patient to produce bilabial raspberries, and transition those into a gentle hum.



  • Ask the patient to gently produce a /z/ or /v/ sound while monitoring the physical sense of vibration at the point of articulation. Shape this into a gentle hum.



  • Ask the patient to mimic agreement with something by producing “Umm Hmm” slowly and repetitively with an upward pitch inflection on the “hmm.”


Mastery of stage 1 necessitates the patient being able to consistently produce the treatment targets accurately in the context of the isolated gesture(s). The clinician can choose the appropriate level of mastery (e.g., 80%, 90%, or higher) prior to advancement based on the individual needs of the patient. Once the established level of mastery is achieved, the patient moves on to the next stage.




  • Stage 2: Voice expansion




    • Gentle humming will be expanded into hum + vowel. While most vowel sounds can be used, the authors typically choose the /a/ vowel as the initial target. The patient is asked to support voice with an easy, efficient inhalation and then produce a hum for a few seconds followed by a vowel. The vowel should transition immediately from the hum (i.e., there should be no silent gap in voicing between them), such as “mmmmmmmmahhhhhhhhhh.” Depending on patient response in the prior stage, different facilitative gestures can be used for the onset stimulus (e.g., UmmmHmmmahhhhh; leaning over while humming; rounding the lips).



    • When the level of mastery for the vowel is achieved, the patient is next asked to expand gentle humming into hum + numbers. These productions are generated as “hmmm … one; hmmm … two; hmmm … three,” etc. The number should transition immediately from the hum. It may help the patient with perceptual focus on the targets (perceptions of effort, tension, and voice quality) if they first generate these productions with chanting (a steady pitch and loudness). When successful at chanting, they can then transition into productions with a descending intonation on the number. An alternative to numbers is to use single-syllable words that begin with a nasal sound (e.g., “hmmm … man”; “hmmmm … no”).



    • When the level of mastery for the numbers or nasal words is achieved, the patient is next asked to produce a gentle hum + nonnasal word. This step is similar to the steps mentioned earlier, with the exception that single-syllable word stimuli are chosen which reflect variable consonant and vowel structures. As mentioned earlier, it can benefit the patient to initially produce these stimuli with chanting, and then transition to typical speech intonation. The clinician should prepare each treatment session with copious word stimuli to ensure practice quantity and variability.



    • When the level of mastery for nonnasal single-syllable words is achieved, the patient is next asked to produce the same stimuli without the preceding hum (i.e., the hum is shaped away). It is important at this step to remind the patient of efficient respiratory support and the focus on effort, sites of tension, and voice quality. The previously identified regions of vibrotactile energy should be monitored by the patient to facilitate resonant productions. The single words produced without the hum should be produced at low loudness and with adequate airflow at the onset of phonation. The clinician should monitor these factors and alert the patient if poor technique resurfaces.



    • When the level of mastery for words without the humming cue is achieved, the patient is next asked to produce sentences. As with previous stimuli, each production is generated with focus on vibrotactile sensations, sense of effort, and voice quality. It can sometimes benefit the patient to first produce stimuli while chanting, and then move to more typical speech intonation once the targets are mastered using the chant context. Sentence stimuli should be varied and numerous to provide the patient with adequate practice opportunities.



  • Stage 3: Generalization




    • Once the patient reaches mastery of the previous stage, they then move to producing the target voice with frontal resonance while reading paragraphs (e.g., from a book, magazine, or preselected passages). As with the previous stages, the patient is encouraged to consciously monitor frontal resonance, perceived effort levels, sites of tension, and voice quality during productions.



    • Once the appropriate level of mastery is achieved, the patient moves on to produce the target frontal resonance voice while engaging the clinician in conversation.



    • To facilitate carryover, the final step is for the clinician to monitor the patient while they engage in conversation with others. This can include conversations on the phone, in person with familiar individuals, and in person with those unfamiliar to the patient. Digital devices have also been developed to facilitate acquisition and maintenance of frontal resonant voice production. 37


8.8.3 Vocal Function Exercises


Voice disorders that result in or from obligatory weakness and subsequent deconditioning (e.g., vocal fold paresis and vocal fold atrophy), 17, 38 as well as functional voice disorders that demonstrate inefficient neuromuscular control and/or primary or compensatory muscular imbalance in the subsystems supporting voice production, may respond to exercises that facilitate neuromuscular adaptation. As with any other skeletal muscle, the laryngeal muscles should respond to the stress of physical rehabilitation by adapting to improve functional capacities. To increase muscular strength, tone, coordination, and/or endurance, a muscle must be exposed to repeated stress (e.g., a load or fatiguing task). In turn, the muscle and its neuromotor pathways will adapt to this stress through a process of physiological change. A classic example is weightlifting, where specific muscle groups are exposed to repeated stress in the form of exercise(s) presenting an overload (heavy weight). With repeated exposure to this overload, the targeted muscles adapt through a process of hypertrophy, where muscle cells gain volume and a greater number of neuromotor pathways and motor units are recruited during contraction. This adaptation increases both the size of the muscle and its ability to generate greater contraction strength. Muscles can also adapt in response to long-duration, repetitive motion. For example, progressive exposure to increased contraction durations is needed to allow the muscular adaptations that permit a runner to complete extended distance runs, in contrast to the lower endurance demands placed on the leg muscles in sprinting. 39 Similar adaptations have been demonstrated in both dysphonic and nondysphonic speakers participating in vocal function exercise (VFE) programs. 40, 41, 42, 43, 44, 45, 46, 47


VFEs constitute a voice treatment program developed by Joseph Stemple that rehabilitates or enhances vocal function through repeated stress applied to the respiratory and laryngeal musculature. The program consists of a four-exercise set which is repeated four times daily, typically two repetitions of the set in the morning and two repetitions in the afternoon. The role of the SLP in administering this treatment includes skilled instruction and regular monitoring of exercise technique and progressive performance outcomes. These elements are critical if VFEs are to challenge the target muscle groups in a manner that is specific to phonation for communication purposes. VFEs have been applied successfully to populations with both vocal hyperfunction and vocal hypofunction.


The exercises comprising the VFE program are illustrated in ▶ Fig. 8.4 and have been thoroughly described in previous publications. 1, 10 The application of VFEs is demonstrated in Video 8.2. Each exercise is designed to target adaptation in laryngeal muscles specific to voice demands during communication and/or vocal performance. VFEs include exercises requiring sustained, controlled contraction of the vocal fold adductor muscles (warm-up), dynamic and controlled contraction of the vocal fold tensor muscles to elongate the vocal folds (stretch), dynamic and controlled contraction of the vocal fold tensor and relaxer muscles to shorten the vocal folds (contract), and coordination of all adductor and tensor muscles during sustained phonation at different fundamental frequencies (power). Instructions for the four exercises are briefly described as follows:




Fig. 8.4 Vocal function exercises.

Posted Feb 25, 2020, in Otolaryngology.
