The evaluation of the dysphonic patient begins with a complete understanding of the laryngeal anatomy and physiology of voice production. A thorough history must be taken regarding the dysphonia qualities, alarming symptoms, and confounding factors. The complete head and neck examination culminates in a detailed visualization of the vocal folds using image-capturing laryngoscopy as well as stroboscopy or high-speed digital imaging to fully evaluate the viscoelastic properties of the vocal fold cover-body structure and function. Finally, the evaluation leads to the biopsy of any concerning lesions, either under magnification in the operating room or under topical anesthesia in the office.
Key points
• Evaluation of the dysphonic patient begins as soon as the clinician can hear the patient’s voice. This evaluation involves a thorough history, head and neck examination, a perceptual evaluation of the voice, and a detailed assessment of the patient’s laryngeal anatomy and function.
• Dysphonia results from a disruption in the anatomy and function of the vocal folds. Stroboscopy is critical in evaluating the function of the vocal fold vibratory characteristics.
• High-speed digital imaging also can play an important role in patients with aperiodic vocal fold vibration.
• Concerning lesions warrant a biopsy for pathologic diagnosis.
• In the operating room, a telescope or microscope provides optimal visualization, mapping ability, and tactile evaluation of the tissue. Additionally, in-office biopsy is a cost-reducing and effective alternative in select patients for pathologic sampling.
CAPE-V | Consensus Auditory Perceptual Evaluation of Voice |
DL | Direct laryngoscopy |
F0 | Fundamental frequency |
GRBAS | Grade, roughness, breathiness, asthenia, strain |
HSDI | High-speed digital imaging |
NBI | Narrow Band Imaging |
SLP | Speech-Language Pathologist(s) |
VHI | Voice Handicap Index |
V-RQOL | Voice-related quality of life |
Introduction
Dysphonia is defined as an impairment of the speaking or singing voice, and it affects up to one-third of people during their lifetime. The evaluation of dysphonia by general otolaryngologists varies with training background, practice type, and available resources. There are multiple techniques to visualize the larynx, ranging from mirror laryngoscopy to high-speed digital imaging (HSDI). Some practices frequently employ speech-language pathologists (SLPs) to obtain perceptual, aerodynamic, and acoustic measurements and to provide treatment counseling and therapy. This lack of consensus in the approach to dysphonia contributed to the decision by the Academy of Otolaryngology–Head and Neck Surgery to develop clinical practice guidelines on dysphonia. Both benign and malignant conditions can cause dysphonia, and it is concerning that up to 52% of patients with laryngeal cancer thought their hoarseness was harmless, leading to a delay in evaluation and treatment. The guidelines list conditions that should prompt the patient and clinician to suspect a serious underlying cause of the dysphonia ( Box 1 ). This article builds on the invaluable article by Blitzer, elsewhere in this issue, regarding laryngeal anatomy and function, and presents a laryngologist’s focus on the different tools, highlights, and pitfalls in the evaluation of the dysphonic patient.
Box 1. Conditions leading to suspicion of a “serious underlying cause”
Hoarseness with a history of tobacco or alcohol use
Hoarseness with concomitant discovery of a neck mass
Hoarseness after trauma
Hoarseness associated with hemoptysis, dysphagia, odynophagia, otalgia, or airway compromise
Hoarseness with accompanying neurologic symptoms
Hoarseness with unexplained weight loss
Hoarseness that is worsening
Hoarseness in an immunocompromised host
Hoarseness and possible aspiration of a foreign body
Hoarseness in a neonate
Unresolving hoarseness after surgery (intubation or neck surgery)
History
Often clinicians have a referring diagnosis or physical examination finding from a colleague to guide an evaluation. However, referring diagnoses are commonly inaccurate, and a myopic evaluation can overlook findings that may influence treatment. The differential diagnosis for dysphonia is extensive ( Box 2 ), and a thoughtful history can direct testing, treatment, and prognosis counseling. Therefore, the first part of any evaluation is a thorough and complete history. Elements include standard concerning signs and symptoms, such as pain, weight loss, and neck masses. Inquiring about cigarette smoking and alcohol consumption is crucial, as both are well-established risk factors for both laryngeal cancer and dysphonia. With the dysphonic patient, additional details must be investigated. History of prior intubations, neck surgery, radiation, or trauma can all affect vocal quality. Upper aerodigestive tract and airway conditions, such as gastroesophageal reflux disease, asthma, allergies, and sinusitis, as well as inhaler use, can all contribute to dysphonia. A surgeon must consider the larynx in terms of its 3 main roles: voice, breathing, and swallowing. Evaluating one without the others can lead to incomplete decision-making.
Box 2. Differential diagnosis of dysphonia
Laryngitis
Chronic
Viral
Bacterial
Fungal
Allergic
Reflux
Sicca
Benign lesions
Polyp
Varices
Cyst
Pseudocyst
Nodules
Granuloma
Reinke edema
Reactive lesion
Hemorrhage
Laryngocele
Airway stenosis
Web
Scar
Presbylarynges/Atrophy
Muscle tension dysphonia
Neurologic and neuromuscular
Vocal fold paralysis
Vocal fold paresis
Spasmodic dystonia
Tremor
Clonus
Cerebral vascular accident
Parkinson disease
Amyotrophic lateral sclerosis
Myasthenia gravis
Recurrent respiratory papillomatosis
Leukoplakia
Neoplasm
Verrucous
Squamous cell
Granular cell
Metastatic
Systemic disease
Rheumatologic lesions
Systemic lupus erythematosus
Sarcoidosis
Granulomatosis with polyangiitis (Wegener)
Amyloidosis
Tuberculosis
Hypothyroidism
Syphilis
Polychondritis
Laryngeal trauma
Cricoarytenoid dislocation
Cricoarytenoid fixation
Vocal fold tear
Similar to the evaluation of pain, a surgeon must evaluate all of the aspects associated with the dysphonic complaint, including onset, duration, quality, frequency, severity, and alleviating and aggravating factors. Gradual onset can often imply a functional dysphonia or a small growing lesion. Abrupt onset is often associated with a hemorrhagic polyp or an acute injury. An onset associated with endotracheal tube intubation or anterior neck or thoracic operation could implicate an injury to the cricoarytenoid joint or vagus/recurrent laryngeal nerves. Persistent dysphonia after an upper respiratory tract infection could suggest a nerve paresis or residual inflammatory changes.
When a physician asks a patient about his or her voice, the common response is “it’s just hoarse” or “it doesn’t sound like my normal voice.” Asking the patient to describe the nature of the vocal complaints without using the term “hoarse” can elucidate what it is exactly about the voice that is causing distress. Descriptors such as effortful, weak, short of breath, sore, lower pitch, or poor clarity are all part of a medical vernacular that a surgeon can use to narrow the differential diagnoses. A “breathy” or “effortful” voice is often due to incomplete closure of the vocal folds during phonation. Specific examples might include unilateral vocal fold paralysis or paresis, fixation from tumor invasion of the thyroarytenoid muscle or cricoarytenoid joint, vocal fold atrophy in presbylarynges, or a large posterior granuloma impeding closure. A “rough” or “strained” voice might be attributable to irregular vocal fold oscillation during phonation due to glottic asymmetry. Pathologies that could lead to roughness include muscle tension dysphonia, vocal fry, benign vocal lesions, anterior glottic web, neoplasm, leukoplakia, or vocal fold scar.
It is useful to evaluate the variability of the dysphonia and its pattern throughout the day. A voice that is consistently dysphonic with little fluctuation could implicate a constant anatomic abnormality. If a patient describes periods of normal voice interspersed with dysphonia, this could represent a fluctuating nonorganic etiology, such as muscle tension dysphonia. An overall decline raises concern for a progressing neoplasm.
It is important to inquire about voice demands at work as well as previous vocal training. This may uncover compounding factors, such as fume or allergen exposure, or unhealthy vocal demands at a construction site, hair salon, or a large classroom. A patient’s vocal experience also can guide the specificity of pretreatment counseling on vocal expectations, treatment decisions, and the extensiveness of posttreatment voice therapy.
A thorough history includes a review of past medical history, operations, and medications. Progressive neurologic diseases, such as tremor, Parkinson disease, and amyotrophic lateral sclerosis, all involve the larynx and voice in different forms. Upper and lower airway disease could lead to chronic inflammation from postnasal drip, productive pulmonary secretions, or chronic throat clearing. Previous anterior cervical surgeries, such as anterior cervical discectomy and fusion, thyroidectomy, or carotid endarterectomy, all place the recurrent laryngeal nerve and external branch of the superior laryngeal nerve at risk for injury with subsequent dysphonia. Previous intubations can lead to paresis, paralysis, arytenoid immobility, or granuloma. Medications can produce a rough voice, often as a result of drying effects such as those of diuretics. Angiotensin-converting enzyme inhibitors have the well-known side effect of cough, which can lead to chronic irritation and dysphonia.
Office examination
Performer and music teacher Manuel Patricio Garcia of Spain first introduced mirror laryngoscopy. It was subsequently adapted to the medical profession with modifications by Turk and Czermak. A thorough mirror examination combines art and science, and it has long been regarded as a rite of passage and a clinical diagnostic skill exclusive to otolaryngologists. With the tongue drawn forward, the patient phonates “e” to elevate the larynx and protrude the base of tongue anteriorly for optimal viewing of the endolarynx. Although the mirror provides a convenient and inexpensive examination, there are well-known limitations. It lacks magnification for finer lesions, anterior glottic visualization can be difficult, and the inability to record examinations limits patient education and review capabilities. Mirror laryngoscopy also precludes performing most voice tasks during the examination.
Rigid endoscopy is performed by placing a Hopkins rod through the mouth to the posterior oropharynx. Topical anesthesia may be needed to reduce the gag reflex. The Hopkins rod has an alternating air-lens system that permits transmission of the image from the distal to the proximal end with minimal distortion. The distal end of the rod has an angled lens, typically 70° or 90°, to permit a view of the inferiorly situated larynx. The endoscope is coupled to a high-intensity light source while relaying a magnified image to the proximal eyepiece. The rod also can be coupled to a stroboscopic light source or a high-speed digital camera (see later in this article). The naked eye can be used through the eyepiece, or a video-capturing device can be attached to the eyepiece for recording and reviewing of images later. The technique is similar to the mirror examination in that the patient is in the seated sniffing position with the tongue drawn forward. The rigid endoscope enables the physician to acquire high-quality images and detect the subtlest abnormalities. Similar to the mirror examination, the rigid examination limits the voice tasks, such as connected speech, because the device is occupying the oral cavity.
In a direct comparison between mirror laryngoscopy and rigid angled indirect laryngoscopy, the use of the rigid endoscope produced significantly less gagging and pain for the patient. In addition, the rigid endoscope provided a more complete examination when compared with the mirror laryngoscopy, especially of the anterior glottis.
Fiberoptic flexible laryngoscopy involves using a small-diameter cable with optical fibers that transmit both light from the proximal source and the image from the distal target. Similar to the rigid endoscope, the eyepiece may be used with the naked eye or connected to an image-capturing device for recording and review. Unique to the flexible scope is that it passes through the nasal cavity, nasopharynx, and oropharynx, limiting the gag reflex. Topical nasal anesthesia is sometimes used for this examination, although its use has not been demonstrated to significantly reduce patients’ discomfort. Initially, these scopes were plagued with poor light intensity and image quality, making them substandard to rigid endoscopic examination. However, flexible endoscopes with small cameras placed at the distal tip of the scope have enhanced image resolution that is of essentially the same quality as rigid endoscopes. The advantages of flexible laryngoscopy include the ability to evaluate the palatal function, as well as a full range of connected speech, breathing, and singing tasks. The smaller sizes also allow pediatric laryngeal evaluation. Regular white light laryngoscopy is excellent at identifying lesions, but lacks the ability to assess the vibratory characteristics. Fortunately, stroboscopy also can be performed via flexible examination.
Stroboscopy uses a strobe light source synchronized to the vibratory frequency of the vocal folds to provide an image that appears to be still or in slow motion, depending on the settings selected. This relies on two perceptual phenomena: flicker-free perception of light and apparent motion from a sequence of individual images. As mentioned, either rigid or flexible endoscopes can be used to perform stroboscopy. The technique depends on capturing the patient’s fundamental frequency with either a microphone or an electroglottographic transducer and synchronizing the strobe light source to it. Exact frequency synchronization provides what appears as a still image from a single point in the vibratory cycle. Quasi-synchronization (1–2 Hz above the fundamental frequency) illuminates successive points in the vibratory cycle, giving the appearance of a continuous, slow-motion vibratory cycle. This feature allows detailed evaluation of the vibratory cycle and the viscoelastic properties of the vocal fold mucosa, as the body-cover relationship can be inspected for any alteration. Vocal fold vibratory characteristics have been demonstrated to be critical in evaluating voice disorders, as the use of videostroboscopy leads to a change in treatment decisions in 14% to 33% of patients. Because videostroboscopy relies on capturing and synchronizing with the patient’s frequency, it has 2 main limitations. Aperiodic vibratory cycles cannot be synchronized, and voice onset and offset cannot be evaluated effectively, as a small amount of time is needed to synchronize before image production.
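The synchronization principle described above can be illustrated with a short calculation. The following Python sketch is only an idealized model, not a description of any commercial stroboscopy unit: it assumes a perfectly periodic voice, and the function names and the 200 Hz fundamental frequency are hypothetical values chosen for illustration. It shows that a flash rate equal to the fundamental frequency always illuminates the same point in the cycle, whereas a small offset makes each flash land on a slightly different point, so the apparent motion repeats at the difference between the two frequencies.

```python
import math


def apparent_frequency_hz(f0_hz: float, strobe_hz: float) -> float:
    """Rate (Hz) at which the vibratory cycle appears to repeat under the strobe."""
    return abs(f0_hz - strobe_hz)


def apparent_cycle_duration_s(f0_hz: float, strobe_hz: float) -> float:
    """Seconds needed to display one apparent slow-motion cycle."""
    difference = apparent_frequency_hz(f0_hz, strobe_hz)
    # Exact synchronization: the image never advances, so it appears frozen.
    return math.inf if difference == 0 else 1.0 / difference


def sampled_phases(f0_hz: float, strobe_hz: float, n_flashes: int) -> list[float]:
    """Fraction of the vibratory cycle (0 to 1) illuminated by each flash."""
    return [(f0_hz * k / strobe_hz) % 1.0 for k in range(n_flashes)]


# Hypothetical patient phonating at 200 Hz (illustrative value only).
f0 = 200.0
print(sampled_phases(f0, f0, 4))              # exact synchronization
print(sampled_phases(f0, f0 + 1.5, 4))        # quasi-synchronization
print(apparent_cycle_duration_s(f0, f0 + 1.5))
```

In this model, an aperiodic voice has no single fundamental frequency to lock onto, which is consistent with the limitation noted in the text.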
There are different aspects of the vocal folds that can be evaluated with videostroboscopy. One of the most recent and popular rating systems listed here provides a framework by which examinations can be evaluated in a systematic manner ( Fig. 1 ). The rater is asked to evaluate amplitude, vertical level match, mucosal wave, and the nonvibratory portion of the musculomembranous portion of the vocal fold. Supraglottic activity can be measured in a medial-to-lateral and an anterior-to-posterior compression pattern. The vocal fold edge is evaluated by both smoothness and straightness. Phase closure rates the percentage of time the vibratory cycle is in the closed or open phase, whereas phase symmetry rates the percentage of time the mucosal vibration is in symmetric phase. Regularity measures the percentage of time that one vibratory cycle is like the next. Glottal closure describes the quality or characteristic of the glottal closure pattern.
Videostroboscopy allows for a more detailed functional and anatomic evaluation of the vocal folds and their vibratory viscoelastic characteristics. Stroboscopy can reveal benign, premalignant, and malignant epithelial changes; however, it has been unable to consistently differentiate among these lesions based on vibratory characteristics alone.
High-speed digital imaging may have utility beyond stroboscopy in the evaluation of vibratory properties of the vocal folds ( Fig. 2 ). It uses a rigid endoscope similar to rigid videostroboscopy, but instead of creating the illusion of a frame-by-frame view of the glottal cycle, HSDI captures images at a rate of 2000 to 5000 frames per second. This allows examination of voice onset and offset, as well as evaluation of patients with an aperiodic vibratory cycle or frequency fluctuation. Because of computer memory limitations, only approximately 2 to 8 seconds of phonation is recorded; however, this provides thousands of images. The frame rate can be increased, and color imaging can be used instead of black and white; however, both come at the cost of image quality. HSDI can be expensive and time-consuming; however, the quality of images and universal application to aperiodic voice pathologies make it clinically useful in certain scenarios.
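The frame rates and recording durations quoted above translate directly into image counts. The sketch below works through that arithmetic; the function names and the 200 Hz fundamental frequency are assumptions made for illustration and are not taken from any HSDI system.

```python
def total_frames(frames_per_second: int, duration_s: float) -> int:
    """Number of images captured during a recording of the given length."""
    return int(round(frames_per_second * duration_s))


def frames_per_glottal_cycle(frames_per_second: int, f0_hz: float) -> float:
    """Number of real images captured within a single vibratory cycle."""
    return frames_per_second / f0_hz


# Lower and upper ends of the ranges cited in the text.
print(total_frames(2000, 2))
print(total_frames(5000, 8))

# Hypothetical fundamental frequency of 200 Hz (illustrative value only).
print(frames_per_glottal_cycle(2000, 200.0))
print(frames_per_glottal_cycle(5000, 200.0))
```

Because every cycle is sampled directly rather than reconstructed from many cycles, this calculation does not depend on the voice being periodic.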