Voice Disorders

Key Points

  • Evaluation and management of voice disorders in children may be more challenging because of their inability to cooperate, lack of awareness of the problem, and lack of motivation to change.

  • Otolaryngologic evaluation of children with voice disorders begins with a detailed history, physical examination, flexible laryngoscopy, and videostroboscopy.

  • Voice evaluation by a speech pathologist should include perceptual, acoustic, and aerodynamic analyses.

  • Voice therapy plays an essential role in management of pediatric voice disorders and should be used liberally both preoperatively and postoperatively.

  • Organic voice disorders that require primarily surgical management include vocal fold paralysis, laryngeal web, posterior glottic stenosis, recurrent respiratory papillomatosis, and tumors. Vocal fold granulomas typically require medical management; vocal nodules, cysts, polyps, and sulcus are initially managed with voice therapy but may require surgery. Voice disorders may also be functional, without a primary anatomic abnormality.

  • Surgery, especially in very young children, should be undertaken with caution.

Voice production starts with the expiration phase of respiration, which provides the air pressure to vibrate the vocal folds. The lowest periodic component of vocal fold vibration is termed the fundamental frequency, and it is perceived as pitch. Harmonics are integer multiples of fundamental frequency in voiced sounds; the energy or intensity in the harmonic components decreases as frequency increases.
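As a small illustration of the relationship described above, the following sketch lists the harmonics of a given fundamental frequency. The function name and the 220 Hz example value are hypothetical and chosen only for illustration.

```python
def harmonic_frequencies(f0_hz, count):
    """Return the first `count` harmonics of a fundamental frequency in Hz.

    Harmonics are integer multiples of the fundamental, so the n-th
    harmonic is n * f0 (the first harmonic is the fundamental itself).
    """
    if f0_hz <= 0:
        raise ValueError("fundamental frequency must be positive")
    if count < 0:
        raise ValueError("count must not be negative")
    return [n * f0_hz for n in range(1, count + 1)]


# Illustrative value only: a 220 Hz fundamental.
print(harmonic_frequencies(220.0, 4))  # [220.0, 440.0, 660.0, 880.0]
```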

Sound waves generated by vocal fold vibration are modified by the size, shape, and tension of the resonating chamber, which consists of the oropharynx, the nasopharynx, and the nasal cavities. A resonance of the vocal tract is termed a formant. Although an infinite number of formants exist, only the first four formants (F1, F2, F3, and F4) are of clinical interest; the lowest frequency formant is F1, and each formant is characterized by its center frequency and bandwidth. Constriction of the vocal tract near a volume velocity maximum or minimum can, respectively, lower or raise the formant frequency. The resulting sound wave is further modified by the articulators (lips, teeth, tongue) and results in voice and speech production. A normal voice should be pleasing in quality and should have an appropriate balance of oral and nasal resonance, intensity, fundamental frequency level, and prosody (rhythm, stress, and intonation). Voice disorders can result in a voice that is unpleasant for the listener or that interferes with effective communication.

The underlying cause of a voice disorder may be organic or functional. Organic voice disorders result from congenital or acquired anatomic abnormalities. Functional disorders are caused by emotional or psychological problems but can lead to anatomic alterations. However, even when a voice disorder results primarily from an organic cause, often a psychological overlay is present.

Although the reported incidence of voice disorders in children varies greatly, most voice surveys of children show a 6% to 9% incidence of voice disorders. Voice disorders are categorized depending on the area of the problem: voice quality, resonance, loudness, and pitch. This classification is arbitrary, and a voice disorder often has several problem areas. Any anatomic abnormality that involves the free edge of the vocal fold can affect voice quality and result in harshness, breathiness, or hoarseness. Disturbances in resonance may be manifested by hypernasality or hyponasality. Intensity problems occur when a child speaks too loudly or too softly. Deviations in pitch occur with speaking at an abnormal fundamental frequency, narrow pitch range, or excessive pitch breaks. An algorithm for the evaluation and management of voice disorders is outlined in Figure 24-1.


Figure 24-1. Algorithm for evaluation and management of voice disorders in children. CXR, chest radiography study; MRI, magnetic resonance imaging; Rx, therapy.


The evaluation of a voice disorder in a child requires a systematic approach. Additional evaluation by specialists from a variety of disciplines—including a pediatrician, pulmonologist, gastroenterologist, psychologist, and social worker—may be necessary. A detailed history that includes medical, birth, growth and development, and speech and language information is obtained, followed by a detailed voice history to determine the cause of the voice disorder and its contributing factors, potentially including any pulmonary disease. Typically, voice history should include a description of the disorder, time of onset, any known causes, severity of the disorder, exacerbating or alleviating factors, voice use history, and any previous history of voice or speech therapy.

Social and functional impact of voice impairment can be evaluated using rating scales such as the Pediatric Voice Outcomes Survey, Pediatric Voice-Related Quality-of-Life Survey, and Pediatric Voice Handicap Index. These rating scales are designed to provide physicians with the parents’ perception of the severity of the child’s voice disorder and its impact on the child’s daily life; they are used to follow the child’s progress before and after therapy and surgical intervention.

Physical examination should concentrate on the head and neck areas. The ears are examined for evidence of previous and current ear disease. A nasal examination should reveal any deviation of the septum and turbinate abnormalities, and an oropharyngeal examination should focus on the structural integrity and mobility of the soft palate. These are followed by flexible fiberoptic nasopharyngolaryngoscopy, which allows determination of adenoid size, assessment of velopharyngeal function, and evaluation of any supraglottic and glottic pathology.

Stroboscopic examination, which uses a short burst of light in synchrony with vocal fold motion, allows more careful examination of the vocal folds and their motion by seemingly slowing down the movement of the vocal folds. It is particularly useful in differentiating superficial from deep lesions. Laryngeal stroboscopy can delineate vocal fold symmetry, periodicity, vibratory amplitude, mucosal wave, glottal closure, and rigidity, but it is not possible in many children. Bouchayer and Cornut found that stroboscopic examination in children often had to be quick; therefore, their results tended to be inconclusive. Hirschberg and colleagues found that stroboscopy was possible only in children older than 6 or 7 years. McAllister and others completed stroboscopic examination in only half of 60 patients aged 10 years and older; however, the newer digital flexible endoscopes may allow stroboscopic examination in younger children. In our voice clinic, digital flexible stroboscopy has been used successfully in children as young as 3 years. Although stroboscopic examination can be instrumental in accurate diagnosis of a voice disorder, rigid endoscopy under anesthesia may be necessary in some children to establish a diagnosis.

The child’s voice is evaluated by a speech-language pathologist, usually in a voice laboratory. A typical voice evaluation consists of perceptual, acoustic, and aerodynamic analyses carried out as the child performs several speaking tasks. There are no universally accepted standards for speech and voice evaluation, and the speaking tasks, speech parameters measured, and methods of measurement vary greatly.

Several speaking tasks are commonly used. The first is oral reading, which is possible only in older children. Children who can read at third-grade level or better are given a specific passage to read; younger children are allowed to select the reading material. The second task is conversational speech or connected speech of at least a 1-minute duration. The child is asked to tell a story about a picture or talk about a specific topic (e.g., pets, a vacation, a hobby). Based on these two tasks, the speech-language pathologist can perceptually and acoustically analyze the child’s speech for most voice parameters, including prosody.

The remaining speaking tasks are designed to evaluate more isolated aspects of the child’s voice. The third task consists of counting from 1 to 10, 60 to 70, and 90 to 100. The counting is done carefully, once slowly and then as rapidly as possible. This is repeated for three levels of loudness (soft, average, loud) and pitch (low, modal, high). This task reveals any problems with laryngeal tone, resonance, pitch, and loudness.

The fourth speaking task is the production of isolated speech sounds. This consists of sustaining certain vowel sounds (e.g., /a/, /i/) for at least 5 seconds and repeating them three times to determine laryngeal tone, resonance, and fundamental frequency. The child should then sustain /a/ at a comfortable pitch and loudness for as long as possible after a deep inspiration, repeating this three times with a break between attempts, to determine the maximal phonation time and whether the child has adequate respiratory drive to maintain continuous voice. This same procedure is repeated with /s/ and /z/. In a normally functioning larynx, the s/z ratio should be close to 1. However, a lesion in the vocal fold margins (e.g., vocal fold nodules) increases the amount of airflow and decreases the time on /z/, resulting in a ratio of 1.4 or more 95% of the time.
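The s/z comparison can be sketched as a short calculation. This is a minimal illustration, not a clinical tool: it assumes that the longest of the repeated trials is used for each sound (the text does not say how the three attempts are combined), and the function names and sample durations are hypothetical. The 1.4 cutoff is the value quoted above.

```python
def sz_ratio(s_seconds, z_seconds):
    """Ratio of sustained /s/ duration to sustained /z/ duration.

    Assumption: the longest trial of each sound is used. The source text
    does not specify how repeated attempts are combined.
    """
    longest_s = max(s_seconds)
    longest_z = max(z_seconds)
    if longest_z <= 0:
        raise ValueError("/z/ duration must be positive")
    return longest_s / longest_z


def suggests_margin_lesion(ratio, threshold=1.4):
    """True when the ratio reaches the 1.4 cutoff quoted in the text."""
    return ratio >= threshold


# Hypothetical durations in seconds for three trials of each sound.
ratio = sz_ratio([10.0, 12.0, 11.0], [8.0, 7.5, 8.0])
print(ratio, suggests_margin_lesion(ratio))  # 1.5 True
```

A ratio near 1 with short durations for both sounds would not be flagged by this check; the text treats respiratory drive separately through the maximal phonation time.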

Finally, the examiner should instruct the child to repeat certain consonants, words, and short sentences to assess the child’s articulation abilities and speech intelligibility. The words and sentences are typically chosen to bring out specific problems such as velopharyngeal insufficiency.

The speech samples are analyzed, and a voice profile can then be developed using scales such as the Buffalo III voice profile. This voice profile classifies voice abnormalities as follows: 1, normal; 2, mild; 3, moderate; 4, severe; or 5, very severe. It evaluates laryngeal tone, pitch, loudness, nasal resonance, oral resonance, breath supply, muscles, voice abuse, rate, speech anxiety, speech intelligibility, and overall voice rating. Other perceptual scales such as Grade, Roughness, Breathiness, Asthenia, Strain (GRBAS) and the Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V) are also used in the evaluation of pediatric voice.

Aerodynamic analysis provides objective measures of velopharyngeal and vocal fold function. The oral-nasal acoustic ratio and palatal efficiency rating computed instantaneously have been used to evaluate velopharyngeal function. Laryngeal airway resistance can be measured to assess the effective closure of the vocal folds to airflow. This is measured noninvasively using an anesthesia mask, a pressure-sensing catheter, and a flow-sensing pneumotachometer. Values less than 30 cm H2O/L/sec indicate inadequate closure, whereas values greater than 60 cm H2O/L/sec are associated with hyperkinetic voice disorders. Normative aerodynamic measures for children aged 6 to 10 years are also available. These techniques of aerodynamic analysis are being replaced by computer-assisted voice analysis programs.
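The resistance cutoffs can be illustrated with a short sketch. It assumes that resistance is computed as pressure divided by airflow, which is what the units (cm H2O/L/sec) imply but which the text does not state explicitly. The function names, labels, and sample readings are hypothetical; only the 30 and 60 cutoffs come from the text.

```python
def laryngeal_airway_resistance(pressure_cm_h2o, flow_l_per_sec):
    """Resistance in cm H2O/L/sec, assumed here to be pressure / flow."""
    if flow_l_per_sec <= 0:
        raise ValueError("airflow must be positive")
    return pressure_cm_h2o / flow_l_per_sec


def interpret_resistance(resistance):
    """Apply the two cutoffs quoted in the text (30 and 60 cm H2O/L/sec)."""
    if resistance < 30:
        return "inadequate closure"
    if resistance > 60:
        return "associated with hyperkinetic voice disorders"
    return "between the quoted cutoffs"


# Hypothetical readings: 8 cm H2O of pressure at 0.25 L/sec of airflow.
r = laryngeal_airway_resistance(8.0, 0.25)
print(r, interpret_resistance(r))  # 32.0 between the quoted cutoffs
```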

Computer-assisted analysis of voice disorders was introduced in 1990 and is being used with increasing frequency. This technology presents the opportunity to supplement perceptual evaluation of voice disorders and has replaced many traditional methods of evaluation. Evaluation of naturalness and intelligibility of speech still requires the human ear.

Computer-assisted voice analysis can provide the mean fundamental frequency, intensity, and amplitude of voice based on a small voice sample (0.5 second) for comparison with existing normative values for age and sex (Figs. 24-2 and 24-3). Until recently, normative data from adult studies in the literature were used because normative data for children were unavailable; however, a pediatric normative database for computer-assisted voice analysis has since been established.


Figure 24-2. Voice tracing of a normal voice sustaining a prolonged vowel: mean fundamental frequency, 332.5 Hz; jitter, 0.36%; shimmer, 1.9%; harmonic/noise ratio, 12.37 dB.


Figure 24-3. Voice tracing of a hoarse voice (caused by vocal nodules) sustaining a prolonged vowel: mean fundamental frequency, 220.60 Hz; jitter, 2.26%; shimmer, 4.34%; harmonic/noise ratio, 7.84 dB.

Other information provided by computerized voice analysis includes harmonics/noise ratio, amplitude perturbation (shimmer), frequency perturbation (jitter), and electroglottography. Aerodynamic measures can be obtained in less than 60 seconds using the pneumotachographic mask. Parameters that can be measured include subglottal pressure; transglottal airflow; oral pressure; nasal flow; airflow; resistance and efficiency; inspiratory, expiratory, and pause aerodynamics; and nasal and velopharyngeal resistance.
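To make the perturbation measures concrete, the sketch below computes a cycle-to-cycle perturbation percentage. It assumes one common "local" definition (the mean absolute difference between consecutive cycles divided by the mean value); the text does not say which formula the analysis software uses, and commercial programs may differ. Applied to cycle periods it gives a jitter-like value, and applied to cycle peak amplitudes it gives a shimmer-like value. The function name and sample data are hypothetical.

```python
def local_perturbation_percent(values):
    """Mean absolute cycle-to-cycle difference as a percentage of the mean.

    Pass cycle periods for a jitter-like measure or cycle peak amplitudes
    for a shimmer-like measure. This is one common definition, assumed
    here for illustration only.
    """
    if len(values) < 2:
        raise ValueError("at least two cycles are required")
    mean_value = sum(values) / len(values)
    if mean_value <= 0:
        raise ValueError("mean value must be positive")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * (sum(diffs) / len(diffs)) / mean_value


# Hypothetical cycle periods in milliseconds and peak amplitudes.
periods_ms = [3.00, 3.02, 2.99, 3.01, 3.00]
amplitudes = [0.80, 0.82, 0.79, 0.81, 0.80]
print(round(local_perturbation_percent(periods_ms), 2))
print(round(local_perturbation_percent(amplitudes), 2))
```

A perfectly periodic signal yields 0% for both measures, and larger values reflect greater cycle-to-cycle irregularity.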

Voice Therapy

There are important differences between voice therapy in children and in adults. Children often do not have the insight that their voice needs to be modified, which influences the motivation to change. Therefore, the therapist must work to increase the child’s awareness of his or her vocal behaviors that will require change.

Voice therapy entails two stages. The first stage consists of 10 exploratory sessions, each lasting 35 to 40 minutes. This stage helps determine the goals and specific procedures to be used in stage 2, which consists of regular voice therapy sessions for 2 to 5 months. The frequency of therapy is determined by the severity of the dysphonia. Therapy should be supplemented by practice at home to hasten resolution of dysphonia. The total therapy duration is approximately 4 to 5 months; improvement or resolution of dysphonia should be significant.

During the initial phase of voice therapy, the mechanisms of voice production and voice problems are explained to the child in simple terms, and a list of rules regarding good and bad voice use is provided. A main goal of therapy is to eliminate vocal abuse by decreasing the total amount of talking; however, even in highly motivated children, total voice rest may not be feasible. Listening training and auditory feedback are essential in voice therapy; to correct a voice disorder, the patient should learn to differentiate normal and abnormal voices for comparison with his or her own voice.

Problematic muscular tonus, loudness, pitch, and rate require therapy. These areas are often closely interrelated and should be managed simultaneously. The child is given tools to correct a problematic voice parameter, which include 1) knowledge of the correct rules of specific voice parameters; 2) identification of incorrect and correct voice habits in others; 3) recognition of personal use of incorrect voice and modification of this habit; 4) recognition of personal use of correct voice; and 5) recognition of situations that cause personal use of poor voice habits and good voice habits. These result in an increase in the amount of time that correct habits are used, and these steps can be applied to any problematic voice parameter.

Voice production depends on the well-coordinated movement of muscles involved in phonation. Hyperfunction or excessive muscular tonus frequently occurs in children with benign laryngeal pathology, whereas hypofunction with flaccid muscular tonus occurs in those with functional dysphonia. For both problems, control of muscular tonus and proper positioning of the laryngeal, pharyngeal, and oral structure should be taught. To correct hyperfunctional states, posture instruction, breathing exercises, relaxation procedures, muscle tension reduction techniques, chewing methods, muscle stretching exercises, and biofeedback can be used. Wilson found that the chewing method and progressive relaxation are particularly useful in reducing muscular tension. For hypofunctional states, the pushing method increases muscular tension.

Excessive loudness of voice is accompanied by high pitch, rapid rate of speaking, and a hyperfunctional state. Therefore, it is often necessary to manage these problems simultaneously. Eliminating loud speaking is particularly important in children with vocal nodules. Correct training for loudness, pitch, and rate involves teaching the child to listen and to monitor various voice parameters. Often, modifying loudness lowers pitch, and attention to loudness and pitch normalizes the rate of speaking.

Functional Etiology

A functional voice disorder is diagnosed when no anatomic or organic cause can be found. Functional dysphonia is categorized as disturbance of mutation, psychological dysphonia, imitation, or faulty learning. Mutation is the change of voice that occurs during puberty. Pitch lowers in males and, to a lesser extent, in females. Mutation can be delayed, prolonged, or incomplete. High pitch, hoarseness, and voice breaks are characteristic. Mutational voice disorders can result from endocrine pathology.

Functional dysphonia that results from psychological causes seldom occurs in children; only isolated case reports can be found in the literature. The underlying psychological problems are related to or are part of tensional symptoms, adjustment, anxiety, or personality disorders. Functional dysphonia may be a form of conversion hysteria. The disorder may be complete aphonia or partial loss of voice. The dysphonia is often variable, with effortful voice production and easy fatigue. Laryngeal examination may show ventricular band approximation, bowed vocal folds, or hypoadducted vocal folds (hysterical aphonia). Vocal fold movement is normal with inhalation and cough.

Children may also imitate the speech productions of others with speech disorders, for example, those related to cleft palate or hearing impairment. Imitation may occur only in certain contexts, whereas faulty learning implies that the child applies those speech patterns to all communicative contexts. For example, a child may learn to talk louder than normal because of the presence of a hearing-impaired person in the household. In adults, several approaches have been used to manage functional dysphonia, including behavioral therapy, hypnosis, speech therapy, psychotherapy, and a combination of speech therapy and psychotherapy. In children, the optimal management is unknown, but psychotherapy or psychological counseling is often carried out in conjunction with voice therapy.

Organic Etiology

Resonance Disorders

Resonance disorders include hypernasality and hyponasality. Hypernasality is usually caused by velopharyngeal insufficiency from underlying palatal abnormalities. Hyponasality can result from any underlying condition that causes nasal or nasopharyngeal obstruction. The underlying pathology may be choanal atresia, deviated nasal septum, turbinate hypertrophy, nasal polyps, or, most frequently, adenoid hypertrophy. In performing adenoidectomy, particular attention should be paid to the structural integrity of the palate to decrease the incidence of postoperative velopharyngeal incompetence. Accurate diagnosis and appropriate medical and surgical management of velopharyngeal dysfunction are further discussed in Chapter 9 .

Vocal Quality Disorders: Surgical Management

Vocal Fold Paralysis

Congenital vocal fold paralysis results from birth trauma and congenital anomalies of the central nervous system and the heart and great vessels. Any infant or child with vocal fold paralysis should be evaluated with imaging of the chest and central nervous system. Vocal fold paralysis is the second most common cause of congenital stridor in children and represents 10% of congenital anomalies of the larynx. The prognosis for spontaneous recovery is better for acquired, right-sided, and unilateral paralysis.

More than 50% of vocal fold paralysis in children is bilateral. Because arytenoid fixation can be mistaken for bilateral vocal fold paralysis, the cricoarytenoid joint should be palpated at the time of rigid endoscopy. Laryngeal electromyography (EMG) may be the most specific and sensitive test to determine the presence of vocal fold paralysis. In children, laryngeal EMG is usually done intraoperatively. More than 50% of the time, bilateral vocal fold paralysis requires tracheotomy for the establishment of an airway; however, the voice is often normal. Although spontaneous recovery of vocal fold function is possible after 2 to 3 years, late recovery is often incomplete because of laryngeal muscle atrophy, synkinesis, and cricoarytenoid fixation. If vocal fold function does not return spontaneously after 10 to 12 months, surgery to permit decannulation of the child should be considered.

Surgical options to correct bilateral vocal fold paralysis consist of reinnervation with a nerve-muscle flap, cordotomy, lateralization procedures (e.g., arytenoidopexy), arytenoidectomy through a posterolateral external approach, arytenoidectomy through laryngofissure, or endoscopic arytenoidectomy. Reinnervation of the posterior cricoarytenoid muscle is not universally successful, although Tucker reported it to be the management of choice in children. Because of the small size of the laryngeal structure, endoscopic techniques can be more difficult and less successful in children. Cordotomy, a procedure in which the membranous vocal fold is sectioned from the vocal process of the arytenoid, has limited use in children and may be most useful as an adjunct to other procedures. Narcy and colleagues found Woodman’s procedure to have a higher failure rate and recommended arytenoidopexy by an external posterolateral approach. However, Bower and colleagues recommended arytenoidectomy through a laryngofissure because it provided better exposure, better control over the final fold position, and a high rate of success (84%). Although most patients have adequate voice postoperatively, breathiness, hoarseness, and pitch change are seen, and patients may require voice therapy. The resulting voice disorder is inversely proportional to the adequacy of the airway.

Unilateral vocal fold paralysis rarely requires airway intervention and often goes unrecognized until the child is older. The voice in unilateral vocal fold paralysis is hoarse, weak, and breathy. It usually improves spontaneously over 6 to 12 months by contralateral vocal fold compensation; recovery may be hastened by the use of voice therapy. However, in a few patients, persistent problems with dysphonia or aspiration will require surgical intervention. Surgical options include vocal fold injection, surgical medialization, and reinnervation. Surgical interventions should be done in conjunction with preoperative and postoperative voice therapy.

Polytef injection immediately improves voice; however, it is irreversible and changes vocal fold vibratory characteristics and thus results in poor vocal quality. Furthermore, in children, determining the amount of Polytef to inject is difficult because general anesthesia is often necessary, and airway obstruction is possible because of the small size of the larynx. Levine and colleagues recommended injection of an absorbable gelatin sponge, which is similar to Polytef injection except that its effects are temporary. Fat injection appears to be well tolerated by the body, does not cause the vocal fold to become stiff, and is not absorbed extensively. Recently, other injection materials such as AlloDerm and calcium hydroxyapatite have become available, but they have not yet been widely used in children.

Several surgical techniques are available for medialization of the vocal fold. Isshiki type I thyroplasty is theoretically reversible and does not change the vibratory characteristics of the vocal fold; however, it does not restore tensioning capability of the vocal folds and requires external incision and a temporary tracheotomy in most cases. In children, this procedure is technically more difficult and may cause airway compromise. Experience with thyroplasty in children has been limited because of a lack of knowledge about the effect of this technique on thyroid cartilage growth. Gray and colleagues recommended thyroplastic operations only in patients with a mature larynx because of the risk of fixation of the distance between the arytenoid and thyroid cartilages. Link and others have reported their experience with thyroplasty in children aged 2 to 17 years and have recommended a modified surgical approach in children to compensate for the lower position of their vocal folds. Although voice quality is improved on objective measurement, the resulting voice is not ideal.

Selective reinnervation of the adductors of the larynx does not compromise potential spontaneous recovery, nor does it necessitate tracheotomy or preclude eventual use of other techniques; it restores tensioning capability, thereby providing better pitch control. However, it is an open procedure, and motion may not be observed for up to 6 months. Excellent results have been reported by Crumley and Tucker, and Tucker advocated it as the procedure of choice in children. More recently, Sipp and colleagues have reported good results in pediatric patients with injection laryngoplasty, thyroplasty, and reinnervation.

Laryngeal Web

Smith and Catlin reported that glottic web and atresia account for 5% of congenital anomalies; however, the true incidence of congenital web of the larynx is disputed, with some arguing that it is higher and others that it is lower. A web forms when the epithelium that temporarily obliterates the developing laryngotracheal lumen fails to resorb during the eighth week of embryogenesis. Glottic webs are classified depending on their severity. Type I is an anterior web that involves 35% or less of the glottis. The true vocal folds are visible within the web, and little or no subglottic extension is apparent. Although there is usually no airway obstruction, voice dysfunction is common. Type II is an anterior web that involves up to 50% of the glottis (Fig. 24-4). The true vocal folds are usually visible within the web, and subglottic involvement is minimal. Voice disorder is the common presenting symptom. However, airway compromise may occur with upper respiratory tract infections. Type III involves up to 75% of the glottis (Fig. 24-5), and the anterior portion of the web is solid and extends into the subglottis. Most of the true vocal folds are visible within the web. Airway obstruction and voice disorder are moderately severe. Type IV involves up to 90% of the glottis, and the web is uniformly thick and extends into the subglottic area with resulting subglottic stenosis. Infants with this type of web are aphonic with severe airway compromise.
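The percentage thresholds in this classification can be summarized in a short sketch. It is a simplification that uses only the proportion of the glottis involved; the classification described above also depends on web thickness and subglottic extension, which are not modeled here. The function name is hypothetical, and because the text gives no category above 90%, the sketch returns None in that range rather than guessing.

```python
def glottic_web_type(percent_glottis_involved):
    """Map percentage of glottic involvement to web type I through IV.

    Uses only the percentage cutoffs quoted in the text (35, 50, 75, 90).
    Web thickness and subglottic extension are deliberately ignored.
    """
    p = percent_glottis_involved
    if not 0 < p <= 100:
        raise ValueError("percentage must be greater than 0 and at most 100")
    if p <= 35:
        return "I"
    if p <= 50:
        return "II"
    if p <= 75:
        return "III"
    if p <= 90:
        return "IV"
    return None  # the text describes no category above 90%


for pct in (20, 45, 70, 85):
    print(pct, glottic_web_type(pct))
```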


Figure 24-4. Type II laryngeal web.


Figure 24-5. Type III laryngeal web.

In 1985, Cohen and others reviewed 51 cases of children with laryngeal webs and recommended that surgery be tailored according to web severity. A type II web was divided with a knife, microsurgical scissors, or laser followed by dilations. Types III and IV were managed by tracheotomy and insertion of a laryngeal keel through a laryngofissure. The anterior commissure is not adequately reconstructed by any surgical procedure, and the resulting voice continues to be abnormal and requires vocal rehabilitation. An anterior glottic web may be associated with velocardiofacial syndrome.

Recommended management of a thin laryngeal web involves endolaryngeal division of the web with a knife or a CO2 laser with or without temporary placement of a keel to prevent readhesion. Thick glottic webs are approached through a precise midline thyrotomy, a laryngofissure with removal of excess tissue under direct vision using a fiberoptic laryngoscope, and placement of a mucosal graft fixated with fibrin glue or stenting. The use of CO2 laser in cases of a thick web is not recommended. The resulting voice is reportedly satisfactory, but objective postoperative analyses of voice outcomes have not been reported.

When laryngeal web is associated with subglottic stenosis, laryngofissure with anterior cartilage graft and stenting is required. Children who undergo laryngotracheal reconstruction or cricotracheal resection to address their subglottic stenosis, regardless of whether an associated laryngeal web is present, are at risk for poor voice outcome. The severity of subglottic stenosis and glottic involvement influences the voice outcome. Abnormal voice quality is secondary to anatomic changes and is described as dysphonia marked by harshness, whisper, ventricular phonation, and inappropriate pitch.

Posterior Glottic Stenosis and Cricoarytenoid Joint Fixation

Posterior glottic stenosis in children can be congenital (e.g., caused by interarytenoid web or cricoarytenoid fixation). More commonly, posterior glottic stenosis results from airway trauma from intubation (Fig. 24-6). Bogdasarian and Olson classified posterior glottic stenosis into four types. This classification was later modified for the pediatric population by Irving and associates. Type I is vocal process adhesion, type II is posterior commissure or interarytenoid scar, type III is congenital or acquired unilateral cricoarytenoid fixation with or without interarytenoid scar, and type IV is congenital or acquired bilateral cricoarytenoid fixation with or without interarytenoid scar.

Jul 15, 2019 | Posted by in OTOLARYNGOLOGY | Comments Off on Voice Disorders
