Flexible fiber-optic pediatric endoscope
Another consideration in preparation for laryngoscopy is the use of an intranasal anesthetic and/or decongestant. Using a combination spray can be beneficial to both examiner and patient: it decreases pain, decreases the duration of the exam, and provides a superior view [3]. After using the spray, it is best to wait several minutes prior to the exam to allow maximal benefit. Anesthetics should be used with caution, however, as they can have unwanted consequences depending on the indication for the endoscopic exam. For example, topical anesthetics are known to increase signs of laryngomalacia [4] and may influence swallow function, although findings on this have been mixed in adults and not extensively studied in children [5–7].
There are several other non-anesthetic considerations that may facilitate a flexible endoscopic exam. These vary by patient age. For a neonate, infant, or toddler, swaddling can help. For a preschool or school-aged child, distracting them during the exam or coaching them through it (if they are amenable to that) may be helpful. Finally, an adolescent should be able to participate more actively in breathing and relaxing techniques. Positioning the patient such that they are sitting up straight, leaning forward, and slightly extending their neck (assuming the sniffing position) is also important.
1. Administer topical anesthetic and position patient as detailed above.
2. Insert the endoscope along the nasal floor, maintaining a straight endoscope to allow for precise manipulation.
3. Once the posterior nasopharynx is encountered, instruct the patient to breathe through their nose (if they are able to follow instructions) to allow passage into the oropharynx.
4. In the oropharynx, have the patient protrude their tongue to allow for better assessment of the tongue base and valleculae.
5. Advance to the hypopharynx. Instruct the patient to insufflate their cheeks to provide a better examination of the pyriform sinuses.
6. Assess the true and false vocal folds. Have the patient produce a sustained /i/ to evaluate mobility. Spontaneous crying will also suffice for this purpose. Instruct the patient to sniff in to elicit posterior cricoarytenoid muscle contraction and consequent vocal fold abduction.
7. Advance the endoscope to the level of the vocal folds to examine the subglottis.
8. Withdraw the endoscope slowly, evaluating the adenoid pad, torus tubarius, and nasal cavity.
Interpretation
More important than the technical ability required to perform flexible laryngoscopy is the interpretation of the exam. Recording the exam is ideal to allow revisiting and comparing across serial exams. The nasal cavity, nasopharynx, oropharynx, hypopharynx, and larynx can all contribute via different mechanisms to alter voice and swallow function.
Moving posteriorly to the palate, palatal mobility and velopharyngeal competence should be evaluated. Velopharyngeal insufficiency can occur in the setting of various craniofacial syndromes or, rarely, after adenotonsillectomy [9, 10]. The adenoid pad should be examined to determine the amount of obstruction. Adenoid hypertrophy can also affect resonance, in addition to its negative consequences for eustachian tube function [11].
The supraglottis, glottis, and subglottis should be evaluated from both a functional and an anatomic/structural perspective. Laryngomalacia, taken as an example for evaluation of the supraglottic airway, is a pathology with both functional (mucosa overlying the arytenoid cartilages prolapsing into the airway) and structural (foreshortened aryepiglottic folds and an omega-shaped epiglottis) components [15]. At the level of the glottis, functional pathologies include incomplete glottic closure, paradoxical vocal fold motion, and vocal fold paralysis, while structural pathologies include benign vocal fold lesions and laryngeal webs/atresia. The subglottis is similar: pathologies such as subglottic hemangiomas or stenosis can contribute to symptoms on the structural side, and tracheomalacia can be a factor on the functional side.
Videostroboscopy
While endoscopy under halogen light can evaluate laryngeal structure, mobility, and tissues, and identify the presence or absence of lesions or masses, it cannot evaluate the vibratory characteristics, pliability, or closure pattern of the vocal folds. The rate of vibration of the vocal folds during phonation is much faster than the human eye can distinguish. Videostroboscopy overcomes this limitation by taking advantage of an optical illusion created by stroboscopic light, which allows the evaluator to assess vibratory features.
Videostroboscopy to evaluate the larynx was well described by Bless, Hirano, and Feder in 1987 [16] and is part of the recommended protocols for instrumental evaluation of the voice set out by the American Speech-Language-Hearing Association (ASHA) expert panel [17]. Videostroboscopy is performed using either a rigid or flexible endoscope (fiber optic or distal chip) attached to a stroboscopic light source and a video recording system [16, 18]. Recommended specifications for equipment are detailed in the recommendations of the ASHA task force [17].
Stroboscopy takes advantage of two phenomena of visual perception: the perception of a flicker-free, uniformly illuminated background (occurring at greater than 50 Hz) and the perception of apparent motion when two objects are displayed in rapid succession [18, 19]. Stroboscopy works by producing a flickering light source at a slightly slower rate than the frequency of vocal fold vibration, so that what is seen is actually a sampling of images across multiple vocal fold vibratory cycles, rather than a single cycle. Because of these two perceptual phenomena, the observer’s eye perceives this as continuous motion, allowing them to assess the vibratory characteristics of the vocal folds. A minimum of three glottic cycles is needed to make valid perceptual judgements, with each cycle consisting of opening, closing, and closed phases [20]. Rating is not reliable with an aperiodic signal, as the light cannot sync appropriately to provide images that appear to be in immediate succession.
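The sampling principle described above can be illustrated with a short sketch. It is an idealized model, not a description of any particular stroboscopy unit: it assumes perfectly periodic vibration and one flash per flash period, and the function names and example frequencies (250 Hz vibration, 248 Hz flash rate) are hypothetical values chosen only for illustration.

```python
def strobe_phases(f0_hz, flash_hz, n_flashes):
    """Fraction of the vibratory cycle (0.0 to 1.0) lit by each successive flash.

    Assumes perfectly periodic vibration at f0_hz (an idealization).
    """
    return [(k * f0_hz / flash_hz) % 1.0 for k in range(n_flashes)]


def apparent_frequency(f0_hz, flash_hz):
    """Rate of the slow-motion cycle the observer perceives, in Hz."""
    return f0_hz - flash_hz


# Hypothetical example values: vocal folds vibrating at 250 Hz,
# light flashing slightly slower at 248 Hz.
f0, flash = 250.0, 248.0
phases = strobe_phases(f0, flash, 5)
print("phase lit by first flashes:", [round(p, 4) for p in phases])
print("apparent frequency (Hz):", apparent_frequency(f0, flash))
```

Each flash lands slightly later in the cycle than the one before, so the sequence of images steps gradually through the opening, closing, and closed phases even though every image comes from a different cycle. If the vibration is aperiodic, the phases no longer advance in an orderly way, which is consistent with the unreliable ratings noted above.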
Instrumentation and Procedures
Stroboscopy can be performed with either a flexible or rigid endoscope. When performing rigid endoscopy, the child should be positioned upright, leaning forward from the waist, with the chin up and tongue out. Very young children often have difficulty participating in rigid endoscopy, as it requires them to sit with their mouth open and their tongue out and to sustain phonation in this position. While we have sometimes had success in performing rigid stroboscopy in children as young as 3 years old, it is more usual for children aged 5 or 6 to be able to participate. Flexible visualization requires less assistance from the child but can be more unpleasant for children because, as stated above, the passage through the nose can be slightly uncomfortable. As with halogen endoscopy, topical anesthetic and decongestant can be applied and often make the procedure more comfortable. For young children, sitting on a parent’s lap can also be comforting and allows the parent to assist with positioning. A laryngeal microphone is positioned on the child’s neck so that the stroboscopic light can sync with their fundamental frequency. Flexible endoscopes can be either fiber optic or distal chip, and imaging advances in recent years have allowed for much smaller diameters of distal chip endoscopes. Improved image quality and a smaller diameter combine to improve both patient participation and the ability to interpret stroboscopy.
Arytenoid mobility – degree of abduction and adduction, symmetry, and speed of movement
Tissue appearance
Supraglottic compression – degree of lateral or anteroposterior compression above the level of the vocal folds
Free edge contour (rated during abduction, each vocal fold rated separately)
Glottal closure (rated during modal pitch) – the degree and configuration of glottic closure during closed phase
Amplitude (rated during modal pitch, with each fold rated separately) – the magnitude of lateral movement of the vocal folds during vibration
Mucosal wave (rated during modal pitch with each vocal fold rated separately) – the magnitude of movement of the mucosa during vibration
Vertical level – the degree to which the vocal folds meet on the same plane (is one higher or lower than the other?)
Adynamic segments – are there portions of the membranous vocal fold that do not vibrate?
Phase closure – whether open or closed phase dominates or if it is equal
Phase symmetry – the degree to which the vocal folds mirror each other during vibration
Regularity/periodicity – the regularity of vibrations
1. Rest breathing – three consecutive cycles
2. Laryngeal diadochokinesis (ʔiʔiʔiʔiʔiʔiʔiʔi)
3. /i/ – sniff or /i/ quick inhale
4. Sustained /i/ at modal pitch, at least three stroboscopic cycles
5. Sustained /i/ at low and high pitch, at least three stroboscopic cycles of each
6. Sustained /i/ at varying loudness levels, at least three stroboscopic cycles of each
7. Any additional tasks individualized to the patient’s voice complaints
Acquisition of these tasks relies heavily on the patient’s willingness to participate, which can be more of a challenge with children than adults. Every attempt should be made to help the child feel comfortable and gain their participation. In pediatric clinics and hospitals, child life specialists can be extremely helpful in making children feel comfortable and relieving some of the potential fear and stress involved.
Interpretation and Evaluation
When an adequate sample can be obtained, stroboscopy has a high level of clinical utility in evaluating the vibratory function of the vocal folds and in differentially diagnosing lesions [23–25]. Successful stroboscopy has been reported in children as young as 3 years old [23]. Detailed evaluation may be more challenging in children than in adults because of multiple factors, including relative difficulty sustaining a pitch for the required number of cycles, difficulty cooperating, and a smaller larynx. Zacharias and colleagues found that clinicians were able to identify vibratory features in 92% of stroboscopic exams in children but could confidently rate those features in only 42% of exams [24]. The researchers found that raters were better able to rate the features when the exam was performed with a rigid endoscope than with a flexible scope and that older children were better able to tolerate the rigid exam than younger children [24]. As stated above, making a child more comfortable with the procedure is important not only for the child’s comfort but also for our ability to make adequate observations. As a visual perceptual measure, ratings of videostroboscopy are by nature subjective and subject to the limitations of any perceptual measure. Relatively few studies using stroboscopy as an outcome measure have reported interrater reliability, and of those that have, many report low values [26, 27]. Ratings are dependent on the skill and experience of the rater, as well as their rigor in applying those skills. Efforts have been made over the years to standardize evaluation procedures and ratings in order to be more consistent across raters and clinics, and there are multiple rating forms available for use in evaluating stroboscopic images [16, 21, 26, 28, 29]. The Voice-Vibratory Assessment with Laryngeal Imaging (VALI) form (Fig. 14.5) provides a rating system for both stroboscopy and high-speed digital laryngeal imaging of the larynx [21].
Consistent use of the same methodology across raters, as well as regular practice and training, should improve reliability and clinical accuracy of ratings.
High-Speed Videoendoscopy
Videostroboscopy, the current gold standard in laryngeal imaging, is designed to evaluate periodic vibrations of any nature [16, 22]. In order to obtain reliable and valid visual perceptual judgments of vocal fold vibratory motion from videostroboscopy, a steady-state phonation of at least 2–3 s [20], from which three consecutive glottal cycles [30] can be viewed, is required. In the pediatric population, it is often difficult to obtain steady-state phonation of a minimum of 2–3 s with either rigid or flexible videostroboscopy because of examination factors such as ease and cooperation. Other factors, such as moderate or severe overall auditory perceptual impairment of voice quality, typically also result in short phonations of less than 2 s, resulting in tracking errors on videostroboscopy [31]. The presence of tracking errors renders the exam clinically invalid for documenting the vibratory features of amplitude, mucosal wave, periodicity, glottal closure, etc. [30]. High-speed videoendoscopic systems are able to capture cycle-to-cycle vocal fold vibratory motion for phonations shorter than 2 s because of their high temporal resolution of up to 8000 frames per second. In contrast, videostroboscopy provides only an averaged vibratory motion at 30 frames per second. The sampling rate of high-speed videoendoscopic systems is fast enough to also capture transient events of oscillatory onset, oscillatory offset, and voice breaks.
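The difference in temporal resolution can be made concrete with a small calculation. The frame rates (8000 and 30 frames per second) and the 2 s phonation come from the text above; the fundamental frequency of 250 Hz is an assumed example value, and the function names are hypothetical.

```python
def frames_per_cycle(frame_rate_fps, f0_hz):
    """Number of video frames captured during one glottal cycle."""
    return frame_rate_fps / f0_hz


def cycles_in_phonation(duration_s, f0_hz):
    """Number of glottal cycles contained in a phonation of a given length."""
    return duration_s * f0_hz


F0_HZ = 250.0  # assumed example fundamental frequency, not a value from the text

print("high-speed frames per cycle:", frames_per_cycle(8000, F0_HZ))
print("standard video frames per cycle:", frames_per_cycle(30, F0_HZ))
print("cycles in a 2 s phonation:", cycles_in_phonation(2.0, F0_HZ))
```

Under these assumptions, a high-speed system records many frames within every cycle, whereas a 30 frames-per-second recording captures less than one frame per cycle and must therefore assemble its picture of vibration from many different cycles.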
Instrumentation and Procedures
Since the first report in 1940 [32], high-speed videoendoscopy systems have undergone substantial modifications, making the once impractical research tool clinically feasible.
High-speed videoendoscopic systems are similar in appearance to videostroboscopy systems but differ substantially in their basic principle and playback capabilities. As with videostroboscopy, simultaneous acoustic and various other signals (e.g., electroglottography, electromyography, etc.) can be captured with high-speed videoendoscopic recordings. However, unlike videostroboscopy, high-speed videoendoscopic recordings do not provide simultaneous playback of the video and audio. Slow video playback rates ranging from 10 to 30 frames per second are required to view and evaluate the high-speed videos captured at high temporal resolutions of up to 8000 frames per second. Because of current technological limitations, playback of audio simultaneously with the slow playback of the high-speed videos is not possible. The spatial resolution of high-speed videoendoscopy is generally lower (512 × 256 pixels) than that of videostroboscopic systems, which can range from 720 × 468 pixels for standard digital videostroboscopic systems to 1920 × 1080 pixels for high-definition videostroboscopic systems. High-definition videostroboscopy is therefore not equivalent to high-speed videoendoscopy: the former has high spatial resolution but still has lower temporal resolution than the latter. Because high-speed videoendoscopic systems allow for the capture of cycle-to-cycle variations of vibratory motion owing to their increased temporal resolution, high-speed videoendoscopy was reported to take less time (2.31 ± 1.92 min) than videostroboscopy (2.95 ± 2.41 min) for evaluation of vocal fold vibratory features in adolescents [25]. Common commercially available high-speed videoendoscopy systems are able to record phonations for up to 10 s, so multiple recordings are required to capture the range of tasks needed to evaluate vocal fold structure and function.
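As a rough illustration of what slow playback means in practice, the sketch below computes how long a recording takes to review when every captured frame is shown. The capture rate (8000 frames per second), playback rates (10 to 30 frames per second), and 10 s maximum recording length are taken from the text; the function name is hypothetical, and the assumption that all frames are reviewed is a simplification, since a clinician may review only a short segment.

```python
def playback_seconds(recording_s, capture_fps, playback_fps):
    """Time needed to play back every captured frame at the slower display rate."""
    total_frames = recording_s * capture_fps
    return total_frames / playback_fps


# A full 10 s recording at 8000 fps, reviewed at the two ends of the
# playback range given in the text.
for rate in (10, 30):
    minutes = playback_seconds(10, 8000, rate) / 60
    print(f"playback at {rate} fps: about {minutes:.0f} min")
```

The large ratio between capture and playback rates is one way to see why recordings are kept short and why review is usually restricted to the segments of interest.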
High-speed videoendoscopic systems also require a strong light source of 300 watts; hence care must be taken to turn the light source down between recordings to prevent any heat-related side effects from overheating of the tip of the endoscope. Because high-speed videoendoscopic systems differ from videostroboscopy in their basic principles, considerable training is required for their use.
Core tasks and measures similar to those for videostroboscopy can be used for clinical examination with high-speed videoendoscopy. The tasks and procedures for videostroboscopy recommended by the American Speech-Language-Hearing Association (ASHA) task force [30] are an ideal place to start, as these tasks can also be used for high-speed videoendoscopy. The basic recommended protocol of rest breathing, laryngeal diadochokinetic tasks /iʔ iʔ iʔ iʔ/, and maximum vocal fold adduction and abduction (/i:/-sniff, /i:/-sniff) can be used for evaluation of vocal fold edges, vocal fold mobility, and the maximum range of vocal fold mobility at the level of the arytenoids [30]. The tasks of sustained phonation of /i:/, sustained /i:/ at varied pitch and loudness levels, and variations in pitch and loudness on sustained /i:/ that elucidate the patient’s problem can be used to evaluate the vocal fold function features of supraglottic compression, regularity, amplitude, mucosal wave, glottal closure, left/right phase symmetry, vertical level, and glottal closure duration [30]. Clinically, high-speed videoendoscopy is often used in conjunction with videostroboscopy rather than in isolation, especially in instances where videostroboscopy results in tracking errors due to short phonation time. When the two are combined, the clinician may choose to limit high-speed videoendoscopy to the evaluation of vibratory function only, thereby reducing the overall time required for the clinical exam.
Evaluation
The vibratory motion obtained from high-speed videoendoscopy can be evaluated both quantitatively and qualitatively. Currently, quantitative tools for evaluating vibratory motion have not attained widespread use, as the custom-developed software systems are not readily available and are often too laborious for routine clinical use. Qualitative visual perceptual evaluation of vocal fold structure and function is routinely used in clinic. The Voice-Vibratory Assessment with Laryngeal Imaging (VALI) form can be used for both videostroboscopy and high-speed videoendoscopy (Fig. 14.5), as it was developed a priori for reliable visual perceptual ratings of vocal fold structure and vibratory characteristics with both techniques [21]. The VALI form has improved graphics and a definition of each parameter to help the clinician rate the laryngeal imaging features of interest more reliably [21].
Summary of differences in vibratory characteristics in typically developing children, adult females, and adult males without dysphonia