of Phoniatrics

Fig. 1.1

Adolph Kussmaul

The new medical field was at first denominated as voice and speech pathology (Stimm- und Sprachheilkunde). The closest approximation, however, to the present term ‘phoniatrics’ was found in the term ‘phoniatros’ (1886), the telegram address of the London laryngologist Morell Mackenzie (1837–1892).

Voice and speech pathology was initially developed from two centres: Berlin and Vienna. Albert Gutzmann (1839–1910), a highly motivated teacher of the deaf in Berlin, also worked with speech/language impairment, particularly stuttering. He organised courses and edited a journal of medicine and pedagogy (Medizinisch-pädagogische Monatsschrift) as of 1891 together with his son Hermann, then a medical student.

In 1905, Hermann Gutzmann (1865–1922) (Fig. 1.2) completed his Ph.D. thesis on ‘Respiratory Movements in their Relation to Speech/Language Disorders’ and gave the probative lecture at the Medical Faculty of the Berlin Kaiser-Wilhelm-University on ‘Speech/Language Disorders as a Topic of Clinical Education’. With his pioneering inauguration, he established medical Voice and Speech Pathology as an academic discipline and made the Berlin Charité Hospital the cradle of phoniatrics.


Fig. 1.2

Hermann Gutzmann Sr.

Students from all over the world flocked to Berlin to study under Hermann Gutzmann. Thirteen books and more than 300 articles offer evidence of his scientific achievement (complete bibliography in Wendler 1980). His main work, ‘Sprachheilkunde’ (Gutzmann 1912), was the standard reference of the discipline for many years. The Berlin school of phoniatrics was based on natural sciences, physiology and phonetics; its students were known as the ‘organists’.

In contrast, the Vienna school, led from 1909 by Gutzmann’s student Emil Fröschels (1884–1972) (Fig. 1.3), emphasised the psychological basis, and its students were known as the ‘psychologists’ (Fröschels 1913). Being a Jewish scientist, Fröschels was expelled from his academic position. He emigrated from Austria to the United States in 1939, where he continued his work very successfully in St. Louis and in New York for many more years, held in high esteem all over the world owing to his outstanding achievements.


Fig. 1.3

Emil Fröschels. With kind permission from Josephinum, Ethics, Collections and History of Medicine, MedUni Vienna

The internist Kussmaul had demonstrated multiple close relations between speech and language disorders with neurology and psychiatry and detailed the cerebral origins of language and speech. Both Gutzmann and Fröschels attached their departments to otolaryngology with the more peripheral structures and functions in focus, covering the fields of voice, speech/language and hearing, without ignoring the central functions. This latter tradition is still alive in several areas and corresponds to a communicative approach.

1.1.3 After the Second World War

After the Second World War, with large areas of Europe in ruins, Prague assumed the leadership in phoniatrics. Miloslav Seeman (1892–1975) (Fig. 1.4), a student of Gutzmann, succeeded here in 1967 in establishing the first University-Clinic for Phoniatrics, and young students from across the world met there for advanced studies in the field. These students included many Germans of the post-war generation who rediscovered their nation’s contributions to the field and were able to re-establish phoniatric competence in Germany: good reason for them to be very grateful for the guidance and friendship offered by the colleagues of the Prague school under Miloslav Seeman and Eva Sedláčková (1913–1976). In 1958, phoniatrics was established as an official subspeciality to ENT in Czechoslovakia, later a model for further development in Europe. Seeman’s textbook Poruchy dětské řeči (Language Disorders in Children), 1955, with seven editions and translations into German, French and Russian, contributed essentially to shaping the phoniatric profile in post-war Europe (Seeman 1955). The same is true for Richard Luchsinger (1900–1993) and Gottfried (Godfrey) Arnold (1914–1989), whose textbook of 1948 and 1959 was extended in 1970 to two volumes as Handbuch der Stimm- und Sprachheilkunde (Luchsinger and Arnold 1970) and also appeared in English. All of them were students of Hermann Gutzmann, and they followed his ideas in the same way that Karl Wilhelm Weinberg (1862–1935) did in Sweden, where phoniatrics achieved recognition as a medical speciality in its own right as early as 1931 owing to the activities of Bertil Borg (1894–1931) and Bertil Kågen (1905–1978), supported by the holder of the first professorial chair in ORL in Sweden, Gunnar Holmgren (1875–1954).
In Finland (where phoniatrics became an independent speciality in 1948), it was Rauha Hammar (1878–1964) together with Lennart Sjöström; in Switzerland, Max Nadoleczny (1874–1940); and in Poland, Władysław Ołtuszewski (1855–1922) (Wendler 1980). Besides this so-called German-speaking group, there was a very active ‘francophone group’ led by Jean Tarneaud (1888–1972), France, and completed by Bernard Vallancien (1907–1980), France; Jean-Claude Lafon (1922–1998), France; Jordi Perelló (1918–1999), Spain; Lucio Croatto (1920–2001), Italy; and André Muller (1918–2015), Switzerland (Perelló 1977). Regrettably, there was little if any contact between the two groups, even after Luchsinger, Seeman and Tarneaud began editing Folia phoniatrica, the pioneering international journal of phoniatrics, in 1947, with contributions in English, French and German. In the meantime, quite a number of phoniatric textbooks have appeared in several European languages; only a few of them can be quoted here (Böhme 2001, 2003; Friedrich et al. 2013; Hirschberg et al. 2013; Obrębowski and Tarkowski 2003; Pruszewicz 1992; Schindler and Schindler 2001; De Vincentiis 2001; Vasilenko 2002; Wendler et al. 2005). With his ‘Lexicón de Comunicologia’ (Perelló 1977), Perelló provided a multilingual dictionary comprising relevant terms of the discipline in Spanish, French, English, German, Catalan, Italian and Latin as well as biographical essentials of outstanding historical personalities.


Fig. 1.4

Miloslav Seeman (from Sedláček E, Sedláček K (1973) Zum 80. Geburtstag von Prof. Dr. Miloslav Seeman. Folia Phoniatr Logop 25:1–8 with permission from S. Karger AG, Basel)

In post-war Germany, it was Peter Biesalski (1915–2001) who in 1969 opened in Mainz the first German University-Clinic for Communication Disorders. His domain was pedaudiology. Together with Gerhard Kittel (1925–2011) (Erlangen), Oskar Schindler (Torino) and Dušan Cvejić (1923–1998) (Belgrade), he founded in 1971 the Union of the European Phoniatricians, UEP (Fig. 1.5).


Fig. 1.5

The initiators of the UEP. Left to right: Gerhard Kittel, Peter Biesalski, Oskar Schindler, Dušan Cvejić

This became, mainly owing to the untiring efforts of Biesalski and Kittel, an extremely effective organisation, bringing together not only the two groups mentioned above but offering a channel for permanent contacts among people, even from the two sides of Europe divided by the iron curtain and the cold war. Annual congresses were organised, the venues of which alternated regularly between Western and Eastern Europe with a special highlight: the Gutzmann Anniversary in East Berlin in 1980 under the heading ‘75 Years of Phoniatrics’. In a Festschrift, the history and the present state of phoniatrics from 21 countries could be presented (Wendler 1980), and a Gutzmann-Medal was awarded to internationally leading personalities for the first time (Fig. 1.6).


Fig. 1.6

Awarding the Gutzmann-Medal, Berlin 1980, the laureates. Left to right: N.M. Kotby (Egypt), N. Isshiki (Japan), J. Hirschberg (Hungary), M. Hirano (Japan), L. Handzel (Poland), B. Fritzell (Sweden), T. Frint (Hungary), F. Frank (Austria), L. Dmitriev (Soviet Union), D. Cvejić (Yugoslavia), O. Caprez (Switzerland), L. Croatto (Italy), O. von Arentsschild (Western Germany), P. Biesalski (Western Germany), C.I.E. Jansen (the Netherlands, hidden by J. Wendler, at the desk, laudator), G. Kittel (Western Germany), I. Maximov (Bulgaria), J. Perelló (Spain), E. Loebell (Western Germany), A. Pruszewicz (Poland), K. Sedláček (Czechoslovakia), C. Siegert (Eastern Germany), A. Sonninen (Finland), F. Šram (Czechoslovakia), R. Tostmann (Eastern Germany), H. Lindholm (Sweden). Not in the picture: H. von Leden (USA), W. Pfau (Eastern Germany)

The structure and content of the field of phoniatrics were defined and determined through close cooperation among several partners. Of especial importance are the European Union of Medical Specialists (UEMS), with Willy Wellens representing the UEP in the beginning, and the International Federation of Oto-Rhino-Laryngological Societies (IFOS).

The UEP has launched numerous programmes to shape and define phoniatrics further as the medical speciality for communication disorders and to develop programmes to train and educate competent phoniatricians. A first draft was published in 1983 (Wendler and Wellens 1983). Within the EU, the harmonisation of such programmes is continuously advancing. Following Christiane Neuschaefer-Rube (Germany), Tamer Abou-Elsaad (Egypt) and Tadeus Nawka (Germany) currently represent phoniatrics within the framework of UEMS with a well-elaborated training programme and logbook (Vilkman et al. 2010, updated 2018), and the European concept of phoniatrics attracts increasing attention worldwide.

Under the Standing IFOS Committee on Phoniatrics and Voice Care (Chair J. Wendler), the special profile of phoniatrics has been generally acknowledged (International Federation of Oto-Rhino-Laryngological Societies 1986). This Committee can be traced back to the Committee on the Care of Voice established in 1969 by the pioneer of phonosurgery, Hans von Leden. In 1993, IFOS recommended that selected phoniatric topics be included in postgraduate ENT training programmes as a basic requirement for their completion (International Federation of Oto-Rhino-Laryngological Societies 1993).

An interdisciplinary organisation, the International Association of Logopedics and Phoniatrics, had been founded in Vienna on the initiative of Emil Fröschels as early as 1924 (Perelló 1982). He originally named the medical field of speech/language pathology ‘Logopedics’. Hugo Stern and Miloslav Seeman later introduced the term Phoniatrics, which is in common use today to describe communication medicine, whereas the term Logopedics denotes the corresponding non-medical speciality.

1.1.4 Present and Future

Since the 1960s, phoniatrics has extended its scope from the above-outlined concept of physiological and psychological aspects of voice, speech/language and hearing to an all-encompassing perspective of communication including all input, central and output functions as well as sociocultural and ecological dimensions. As the primary function of the articulatory system, swallowing has also been included in the competence of the field. Regarding aetiological studies, molecular genetics has already contributed essential insights, particularly in the field of hearing and developmental language disorders, and as far as stuttering is concerned, genetic factors are being explored with encouraging perspectives. Neurosciences, especially in terms of neurolinguistics, are opening up new ways to the understanding and management of central language processing by means of functional imaging technologies. As the medical speciality for communication disorders, phoniatrics is a worldwide issue today, although with significant geographical differences. The status of phoniatrics varies, in a global view, from an independent speciality on its own to a rather unknown peculiarity, whereas in continental Europe, the cradle of phoniatrics, the speciality is generally well established.

According to an international inquiry in 2012 (Wendler 2012), there were some 1200 specialists in the field, including 300 in Italy, 290 in Germany, 210 in Poland and 96 in Czechoslovakia, and altogether some 100 university departments. A survey from 2016 (am Zehnhoff-Dinnesen et al. 2016) provided data on colleagues active in phoniatrics in the following countries: 40 in Austria, 10 in Belarus, a couple of dozen in Belgium, 120 in the Czech Republic, many hundreds in Egypt, 23 in Finland, 319 in Germany, 23 in Hungary, 150 in Mexico, more than 200 in Poland, 150 in Russia, 13 in Saudi Arabia, about 100 in Spain, 32 in Switzerland, about 20 in the Netherlands, 15 in Turkey and 135 in Venezuela, in total more than 1650.

According to that survey, phoniatrics is an independent speciality in Finland, Germany, Italy, Poland, Egypt, Mexico and Venezuela. It is an officially recognised subspeciality to ENT in many other countries. In several countries, hearing-impaired children are cared for through pedaudiology as an integrated part of phoniatrics. In others, this is a special area of audiology. Proposals to bring phoniatrics and audiology together as a speciality of ‘communication medicine’ are being discussed.

For the near future, when rules and regulations for medical specialisation regarding professional profiles and official recognition can be expected to be continuously under discussion, successful cooperation between the UEP and the phoniatric representatives within UEMS is of the greatest importance. The UEP’s untiring past president, Antoinette am Zehnhoff-Dinnesen (Germany), was appointed honorary president in 2018 because of her outstanding merits in rebuilding and further developing the UEP; her inspiring successor is Ahmed Geneid (Finland). An eminent milestone on the way towards a high-level standard of the discipline in all of Europe was the foundation of the European Academy of Phoniatrics in 2013, initiated and finally well established after sustained efforts by Antoinette am Zehnhoff-Dinnesen as the founding director. Christiane Neuschaefer-Rube was elected first president of the academy, followed in the meantime by Tadeus Nawka (Germany).

In spite of differing concepts of formal professional formats and independently from systematic orders, the medical challenges of the information age require the general adoption of a recognised special medical field with encompassing competence for communication disorders, and that is phoniatrics.

1.2 Developmental and Anatomical Background of Communication and Swallowing Disorders

Rolf Dierichs

1.2.1 Embryology

Cranium and Face

Normal Craniofacial Development (Figs. 1.7, 1.8, 1.9 and 1.10)

(Kliegman and Nelson 2007) The human skull comprises three components of different origin: the chondrocranium, which forms from three parasagittal cartilages and three sensory capsules via endochondral ossification; the membrane (dermal) bones, ossifying directly from mesenchyme of the skin; and the branchial skeleton of the pharyngeal arches, forming via endochondral ossification. The parasagittal cartilages form the base and median elements of the skull, and the primitive sensory capsules are the origins for elements of the nose, orbit and temporal bone.


Fig. 1.7

Facial development, day 24, day 33, day 48 (from Moore and Persaud 2003, courtesy of Elsevier)


Fig. 1.8

Sagittal section through the head; the nasal septum has been removed. Week 5, week 6, week 7 (from Moore and Persaud 2003, courtesy of Elsevier)


Fig. 1.9

Frontal section through the head, weeks 6–12, fusing of the maxillary shelves with the nasal septum (from Moore and Persaud 2003, courtesy of Elsevier)


Fig. 1.10

Roof of the oral cavity, weeks 6–12, demonstrating the developing palate (from Moore and Persaud 2003, courtesy of Elsevier)

The membrane bones of the human skull include the cranial vault (calvaria) and the bones of the face. The bones of the calvaria are separate at birth but will fuse to form sutures, and the fontanelles between these bones will join later after the brain has finished growing.

From week 4 to week 10, the face develops from five facial swellings: paired maxillary swellings, paired mandibular swellings and an unpaired medial frontonasal process. The maxillary swellings enlarge in the fifth week; they lengthen medially and form the primordia of the cheeks and the lateral portions of the upper lip. The lateral portions of the maxillary and mandibular swellings fuse to produce the final shape of the mouth. The mandibular swellings enlarge to form the primordia of the lower lip and jaw in the fourth and fifth weeks. The buccopharyngeal membrane, which separates the ectodermal stomodeum from the endodermal foregut, breaks down on day 24.

In the fifth week, ectodermal thickenings, called nasal placodes, appear on the frontonasal process, which will give rise to the nose and philtrum. Each placode develops a nasal pit in its centre. In the sixth week, its lateral edge, the lateral nasal process, will form the sides of the nose; its medial rim, the medial nasal process, will fuse with its contralateral partner to form the bridge of the nose. During the seventh week, the inferior portion of the fused material forms the intermaxillary process that will join the maxillary swellings to form the philtrum of the upper lip.

The nasal pits enlarge and fuse to form the nasal sac, with the nasal fin developing from its floor to separate the nasal and oral cavities. The nasal fin thins to form the oronasal membrane. It finally ruptures, forming an opening into the oral cavity, called the primitive choana. The primary palate grows posteriorly from the intermaxillary process as a ridge to form the floor of the primitive nasal cavity.

In the eighth week, a pair of palatine shelves initially grows inferiorly from the maxillary swellings into the oral cavity, on either side of the tongue. The shelves rotate horizontally in the ninth week and fuse medially to form the secondary palate. The anterior portion of the secondary palate ossifies to form the hard palate, while muscles of the soft palate develop in its posterior portion. Meanwhile, the nasal septum grows inferiorly from the roof of the nasal cavity, fusing with the top of the hard palate to form two nasal passages that communicate with the pharynx through the definitive choanae.

Malformations of Lips and Palate (Figs. 1.11, 1.12 and 1.13)

Cleft lips occur about once in 1000 births; 60–80% of affected infants are male. The clefts vary from small notches in the red of the lip to larger gaps, including the floor of the nose and the alveolar process of the maxilla. They may appear uni- or bilaterally and are caused by a failure of the maxillary swelling and the nasal prominence to merge. The median cleft lip (Fig. 1.13a) is an extremely rare malformation, probably induced by a deficiency of mesenchyme and an incomplete fusion of the medial nasal processes.


Fig. 1.11

Development of a cleft lip: (a, c, e, g): week 5, week 6, week 7, week 10, foetus with a complete unilateral cleft lip. (b, d, f, h): Horizontal section through the upper lip (from Moore and Persaud 2003, courtesy of Elsevier)


Fig. 1.12

Various forms of split palate: (a) Normal development. (b) Split uvula. (c) Unilateral cleft of the secondary palate (posterior cleft palate). (d) Bilateral cleft palate. (e) Complete unilateral cleft of the lip, the maxillary process and cleft between the primary and secondary palate. (f) Complete bilateral cleft of lip and maxillary process with continuation between the primary and secondary palate. (g) Complete bilateral cleft lip, cleft between primary and secondary palate and unilateral cleft of the secondary palate. (h) Complete bilateral cleft lip, cleft between primary and secondary palate and bilateral cleft of posterior palate (from Moore and Persaud 2003, courtesy of Elsevier)


Fig. 1.13

Rare congenital anomalies of the face: (a) Median cleft of the upper lip. (b) Median cleft of the lower lip. (c) Bilateral oblique facial clefts and complete bilateral cleft lip. (d) Macrostomia. (e) Microstomia and singular nostril. (f) Split nose and incomplete cleft lip (from Moore and Persaud 2003, courtesy of Elsevier)

A split palate may occur in isolation or combined with a cleft lip. The cleft may be limited to the uvula or may extend across the soft and hard palate. The cause is an insufficient generation of mesenchyme, resulting in a disturbed fusion of the lateral maxillary shelves with the nasal septum and the posterior edge of the primary palate.

Pharyngeal Arches, Clefts and Pouches

During early development, five pharyngeal (branchial) arches are generated, which appear as bar-like ridges on the ventrolateral surface of the head and neck region. They are covered by ectoderm and are separated from each other by invaginations called pharyngeal clefts. The pharyngeal clefts have counterparts on the interior in the form of endoderm-lined pharyngeal pouches. Ectoderm and endoderm are isolated by a mesodermal core. Pharyngeal membranes separate the clefts from the pouches (Graham 2001).

The pharyngeal arches are numbered 1, 2, 3, 4 and 6; they develop in cranio-caudal sequence with the first pair appearing on day 22, the second and third pairs on day 24 and the fourth and sixth pairs on day 29. Each pharyngeal arch contains an arch cartilage, an arch artery, a mesodermal component as precursor for muscles and a specific cranial nerve.

The first branchial arch is divided into a maxillary and a mandibular process; the former develops to the palatopterygoquadrate bar cartilage, which will become the greater wing of the sphenoid and the incus; the latter contains Meckel’s cartilage, a precursor of the malleus and the fibrous core of the mandible. The jaws mainly consist of membrane bones formed by direct ossification; the maxillary process gives rise to the upper jaw, the maxilla, the zygomatic and the temporal squama, and the mandibular process generates the lower jaw.

The second arch cartilage, Reichert’s cartilage, forms the stapes, styloid process, stylohyoid ligament and parts of the hyoid. The third arch cartilage also contributes to the hyoid; the fourth and sixth arch cartilages form the larynx; and the epiglottis arises in the location of the fourth arch (Fig. 1.14).


Fig. 1.14

Branchial arches, their innervation by cranial nerves and the definitive structures to which they develop (from Moore and Persaud 2003, courtesy of Elsevier)

The first arch is innervated by the trigeminal nerve, the maxillary swelling by V2 and the mandibular swelling by V3. The second arch is innervated by the facial nerve (VII) and the third arch by the glossopharyngeal nerve (IX). The fourth arch is innervated by the superior laryngeal branch and the sixth arch by the recurrent laryngeal branch of the vagus nerve (X).

The following Table 1.1 summarises the derivatives of the five pharyngeal arches:

Table 1.1

Branchial arches and their derivatives

First/V: Muscles of mastication; Tensor tympani; Tensor v. palatini; Ant. belly of digastric; Ant. ligament of the malleus; Sphenomandibular ligament; Auditory tube; Tympanic cavity

Second/VII: Styloid process; Hyoid bone, minor horn, upper part of body; Mimic muscle system; Post. belly of digastric; Stylohyoid ligament; Lining (crypts) of the palatine tonsils (lymphatic follicles have mesodermal origin)

Third/IX: Hyoid, major horn, lower part of the body; Lower parathyroid

Fourth/X (sup.): Cartilages of the larynx; All muscles of the pharynx (except the stylopharyngeus); All muscles of the soft palate (except the tensor v. palatini); Upper parathyroid gland; Telopharyngeal body; C cells of thyroid

Sixth/X (rec.): Cartilages of the larynx; All intrinsic muscles of the larynx except the cricothyroid

Development of the Larynx

Normal Development (Fig. 1.15)

The respiratory system is an outgrowth of the primitive pharynx. Between the 20th and the 26th days of gestation, a ventral laryngotracheal groove in the primitive foregut differentiates into the laryngeal sulcus and the respiratory primordium. The tracheo-oesophageal folds between these tubular hollows later fuse to form the tracheoesophageal septum, separating the laryngotracheal groove from the foregut. From now on the foregut is divided into the ventral laryngotracheal tube and the dorsal oesophagus.


Fig. 1.15

Stages of laryngeal development. (a) Week 4. (b) Week 5. (c) Week 6. (d) Week 10 (from Moore and Persaud 2003, courtesy of Elsevier)

The larynx develops from the fourth and sixth branchial arches. The laryngotracheal opening lies between these two arches. The internal lining of the larynx originates from endoderm, whereas cartilages and muscles emanate from mesenchyme. The mesenchyme proliferates rapidly, and the sagittal slit of the laryngeal orifice changes into a T-shaped opening by the growth of three tissue masses: one is the hypobranchial eminence, which later becomes the epiglottis. The second and third growths are two arytenoid precursors. They grow between the fifth and seventh week, resulting in a temporary occlusion of the lumen. Recanalisation occurs by the tenth week and produces a pair of lateral recesses, the laryngeal ventricles that are bounded by folds of tissue that differentiate into the false and true vocal cords. Failure to recanalise may result in atresia, stenosis or web formation in the larynx.

The development of the larynx begins with the appearance of the mesenchymal-arytenoid swellings from the sixth branchial arches on the 32nd day of gestation on both sides of the opening of the laryngotracheal tube. These swellings approach each other in the midline and converge at the caudal end of the hypobranchial eminence to convert the vertical laryngotracheal opening into a T-shaped aditus. Midline compression of the tube by these swellings results in the fusion of the epithelial lamina, thereby closing the tube from the pharynx. If the closing does not occur, a posterior laryngeal cleft can result leading to severe aspiration in the newborn. The arytenoid swellings differentiate into the arytenoid and corniculate cartilages and the primitive aryepiglottal folds.

The epiglottal and cuneiform cartilages are formed by the hypobranchial eminence. Chondrification of both fourth branchial arches gives rise to the thyroid cartilage, whereas the cricoid cartilage derives from the chondral tissue of the sixth branchial arch. The laryngeal lumen obliterates to give rise to the epithelial lamina. The larynx recanalises by the tenth week of gestation.

The intrinsic muscles have gained their shapes and positions by the 40th day of gestation, and by the end of the eighth week, all components of the larynx are present including innervation and blood supply.

During the foetal period, the vocal processes develop from the arytenoids, and the thyroid cartilage laminae fuse in the midline. The epiglottal cartilage matures between the fifth and seventh months. During this period, the corniculate and cuneiform cartilages become evident. The foetal period ends with the cricoid cartilage changing from interstitial to perichondrial growth.

Malformations of the Larynx (Figs. 1.16, 1.17 and 1.18)

From the location of laryngeal malformations (Sidrah et al. 2007), one discriminates between supraglottal, glottal and subglottal anomalies.


Fig. 1.16

Laryngomalacia: (a) anterior prolapse. (b) Posterior prolapse (from Rutter and Dickson 2014, courtesy of Elsevier)


Fig. 1.17

(a) Differences between laryngoceles (bd) and saccular cysts (e, f) (from Rutter and Dickson 2014, courtesy of Elsevier). (b) Computerised tomographic view of a patient with combined laryngocele. Prof. Dr. Haldun Oguz, personal archive photo, with permission


Fig. 1.18

Four types of laryngeal cleft: supraglottal interarytenoid cleft, partial cricoid cleft, total cricoid cleft and laryngo-oesophageal cleft (from left to right) (from Rutter and Dickson 2014, courtesy of Elsevier)

The most common congenital anomaly of the larynx is laryngomalacia, accounting for more than half of all cases (Ahmad and Soliman 2007). The ratio between males and females is about 2:1. It is classified as Type 1, Type 2 or Type 3 on the basis of patterns of supraglottal collapse. In Type 1 laryngomalacia, redundant supraglottal mucosa prolapses; Type 2 is characterised by shortened aryepiglottic folds; and Type 3 displays posterior displacement of the epiglottis coincident with a deformation, due to an imbalance in its development. The epiglottis develops from the cartilages of the third and fourth branchial arches, and an overgrowth of the third arch portion results in an omega-shaped organ. In addition, an arytenoid prolapse may result from immature neuromuscular control.

The second most common congenital laryngeal disorder, accounting for about 15–20% of all congenital anomalies, affects vocal fold movement. It may occur unilaterally or, less frequently, bilaterally. Unilateral paralysis is usually idiopathic but may be secondary to peripheral nerve pathology. Strain injuries to the recurrent laryngeal nerve during birth may be one of the causes.

The glottal sulcus (or sulcus vocalis) is characterised by dysphonia due to hampered movement of the mucous membrane, absence of Reinke’s space and adhesion of the epithelium to the vocal ligament or the vocal muscle itself.

Congenital subglottal stenosis takes third place among laryngeal anomalies, with approximately 15% of the cases, and occurs twice as often in boys as in girls. It may be subdivided into two types. The more common is membranous congenital subglottal stenosis, due to submucosal hypertrophy. The second, cartilaginous congenital subglottal stenosis, results from an abnormal growth of the cricoid cartilage.

Subglottal haemangioma accounts for 1.5% of congenital anomalies of the larynx and occurs twice as often in girls as in boys. It results from a malformation of the mesenchymal vascular precursors.

Laryngoceles are rare congenital anomalies of the supraglottal larynx. They form as a result of air- or fluid-filled dilations of the laryngeal ventricle communicating with the laryngeal lumen. They may occur internally or externally or both.

About 25% of all laryngeal cysts are saccular cysts. In contrast to the laryngoceles, they do not communicate with the laryngeal lumen.

Laryngeal webs are rare congenital anomalies. They are due to an incomplete recanalisation of the laryngotracheal tube, which occurs in the third month of gestation. They appear mostly at the anterior level of the vocal folds.

Laryngeal or laryngotracheo-oesophageal clefts are posterior fusion defects between the airway and oesophagus during embryogenesis. These clefts may be minor and short or may even extend beyond the carina. They are classified according to their anatomical extent.

Laryngeal atresia is considered to be the rarest of the congenital anomalies of the larynx. It occurs when the recanalisation of the laryngotracheal tube during the third month of gestation fails.

Tongue Development

Normal Development (Figs. 1.19, 1.20 and 1.21)

Tongue development starts with a triangular elevation in the floor of the first pharyngeal arch during the end of the fourth week of gestation, which is called the median tongue bud (tuberculum impar). A pair of mesenchymal swellings in the ventromedial areas of the first pharyngeal arch forms the distal tongue buds (lateral lingual swellings) on either side of the tongue. They are covered by epithelium of ectodermal origin, overgrow the median tongue bud and fuse medially to form the midline sulcus. Sensory innervation of this part is by the lingual branch of the mandibular division of the trigeminal nerve, the nerve of the first pharyngeal arch.


Fig. 1.19

Tongue development, early phase from the fourth week on (from Moore and Persaud 2003, courtesy of Elsevier)


Fig. 1.20

Tongue development, later stage, fourth to fifth month (from Moore and Persaud 2003, courtesy of Elsevier)


Fig. 1.21

Adult tongue, indicating the derivatives of the branchial arches (from Moore and Persaud 2003, courtesy of Elsevier)

Behind the foramen cecum, the second pharyngeal arch develops the copula in the midline. A second elevation, arising from the third and partly the fourth pharyngeal arch, forms the hypobranchial eminence, which will become the pharyngeal part of the tongue.

The copula is overgrown by the hypobranchial eminence in the fifth and sixth weeks. The eminence then fuses anteriorly with the distal tongue buds, thereby creating the terminal sulcus.

The median and pharyngeal sections of the organ then become joined at the terminal sulcus. This posterior compartment of the tongue is innervated by the glossopharyngeal nerve, the nerve of the third pharyngeal arch, whereas the chorda tympani from the cranial nerve VII supplies the taste buds on the anterior two thirds. The growing tongue extends out into the oral cavity; its anterior part is covered by a layer of ectodermal epithelium. In contrast, the root of the tongue is covered with endodermal epithelium.

So far, only the epithelial and mucosal tissues of the tongue have been considered, which develop from the four pharyngeal swellings as described above. The muscular compartment of the tongue descends from myoblasts that differentiate after migrating from the myotomes of the occipital cervical somites. Following these myoblasts is the hypoglossal nerve, which generates the nerve supply for the tongue musculature.

Tongue Abnormalities

The tongue may vary in size from microglossia, an abnormally small tongue, which occurs very rarely, to macroglossia, an abnormally large tongue, which is more common.

Ankyloglossia affects the frenulum of the tongue, which develops short and thick and either fixes the tongue to the floor of the mouth (tongue-tie) or at least restricts its movement.

A cleft or bifid tongue has a cleft running vertically right across it. Complete clefting is extremely rare and occurs as a result of lack of developmental forces that push both halves of the tongue towards each other. Partial clefting presents as a deep groove in the middle of the tongue.

When the two lateral parts of the tongue fail to overgrow the tuberculum impar, a bald patch appears in the centre of the tongue, known as median rhomboid glossitis.

Development of the Ear

Inner Ear, Normal Development (Figs. 1.22 and 1.23)

In the third week of embryonic development, the ectoderm on both sides of the rhombencephalon (hindbrain) begins to thicken and form the otic placodes. They shift caudally to the level of the second pharyngeal arch and invaginate during the fourth week to form the otic pits. The pits separate from the surface to form the otic vesicles, which are the precursors of the membranous labyrinth.


Fig. 1.22

Development of the ear: week 4, 5 (top left, right) and two later stages (bottom left, right) (from Moore and Persaud 2003, courtesy of Elsevier)


Fig. 1.23

Development of the otic vesicle, weeks 5–8 (from Moore and Persaud 2003, courtesy of Elsevier)

Each otic vesicle differentiates into three parts: a dorsomedial, elongated endolymphatic extension, origin of the endolymphatic duct and, at its distal end, the endolymphatic sac; a central partition, which will expand to form the utricle and the three semicircular ducts, arising from utricular diverticula; and a ventral, conical saccular region, which forms the saccule and the cochlear duct, as well as the ductus reuniens joining the saccule and cochlear duct. The duct elongates in the fifth week and starts to coil, with the spiral organ of Corti differentiating in the seventh week. By this time, the organ of Corti is innervated by the cochlear ganglion, which will elongate and wind up together with the organ of Corti.

At the end of the ninth month, the auditory pathway is completed; myelinisation, however, has not yet taken place, and axo-dendritic synapses are not yet established.

Malformations of the Inner Ear

Malformations in otic vesicle development result in anomalies of the membranous labyrinth and its bony envelope as well. In descending order of intensity and time course of appearance during development, they are complete labyrinthine aplasia; cochlear aplasia; common cavity (single cystic cavity of coalesced cochlea and vestibulum); cochlear hypoplasia; incomplete partition Type I, II or III; and enlargement of the vestibular or cochlear aqueduct.

Tympanic Cavity, Normal Development

The first pharyngeal pouch elongates to form the tubotympanic recess, which will give rise to the tympanic cavity and the auditory tube. By the seventh week, the auditory ossicles begin to condense within the mesenchyme of the first and second pharyngeal arches, whereas the muscles of the middle ear begin to form in the ninth week. The cartilages of the malleus and incus develop within the first pharyngeal arch, whose mesoderm also gives rise to the tensor tympani muscle; this muscle is innervated by the nerve of the first pharyngeal arch, the mandibular nerve (CN V/3). The cartilage of the stapes and the stapedius muscle form within the second pharyngeal arch. The stapedius is therefore innervated by the facial nerve (CN VII), the nerve of the second pharyngeal arch.
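The arch-by-arch relationships in this paragraph can be summarised as a small lookup table. The following Python sketch only restates the paragraph above; the dictionary name and the helper function are hypothetical and chosen purely for illustration.

```python
# Illustrative summary of the middle-ear derivatives of the first two
# pharyngeal arches, restating the paragraph above.
# All identifiers are hypothetical and exist only for this sketch.
MIDDLE_EAR_ARCH_DERIVATIVES = {
    1: {
        "ossicles": ("malleus", "incus"),
        "muscle": "tensor tympani",
        "nerve": "mandibular nerve (CN V/3)",
    },
    2: {
        "ossicles": ("stapes",),
        "muscle": "stapedius",
        "nerve": "facial nerve (CN VII)",
    },
}


def nerve_for_muscle(muscle: str) -> str:
    """Return the nerve of the arch from which the given middle-ear muscle arises."""
    for derivatives in MIDDLE_EAR_ARCH_DERIVATIVES.values():
        if derivatives["muscle"] == muscle:
            return derivatives["nerve"]
    raise KeyError(f"unknown middle-ear muscle: {muscle}")


def arch_of_ossicle(ossicle: str) -> int:
    """Return the number of the pharyngeal arch in which the ossicle's cartilage forms."""
    for arch, derivatives in MIDDLE_EAR_ARCH_DERIVATIVES.items():
        if ossicle in derivatives["ossicles"]:
            return arch
    raise KeyError(f"unknown ossicle: {ossicle}")


print(nerve_for_muscle("stapedius"))
print(arch_of_ossicle("incus"))
```

Grouping the entries by arch makes the underlying rule visible: a muscle is supplied by the nerve of the arch in which it forms.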

The first pharyngeal cleft develops into the external acoustic meatus, and the membrane separating the first pharyngeal cleft from the first pharyngeal pouch becomes the tympanic membrane, which consists of three layers: an outer covering of ectoderm, a mesodermal layer (the fibrous stratum) and an inner lining of endoderm.

In the ninth month, the ossicles assume their functional relationships, with the malleus attaching to the eardrum and the stapes attaching to the oval window. Sound vibrations can now be transmitted from the eardrum to the cochlea via the ossicles and oval window and then transduced into neural impulses via the organ of Corti.

Malformations of the Middle Ear

The close relationship of the external ear canal and the tympanic cavity gave rise to the classification of a common malformation termed atresia auris congenita:

  • First-degree malformations are characterised by moderate deformations of the external ear canal, a normal or slightly hypoplastic tympanic cavity, deformed ossicles and normal pneumatisation of the mastoid.

  • The second-degree malformation exhibits intermediate deformities including an absence of the external ear canal or its blind ending, a narrow tympanic cavity, deformations and fixations of the ossicles and reduced mastoid pneumatisation.

  • Third-degree malformations include the absence of an external ear canal, hypoplastic tympanic cavity, severely deformed ossicles and a failure in mastoid pneumatisation.

External Ear, Normal Development (Fig. 1.24)

Each of the adjacent ectodermal parts of the first and second pharyngeal arches differentiates into three auricular hillocks. They arise in the fifth week. In the seventh week, the auricular hillocks begin to enlarge, differentiate and fuse, producing the final shape of the ear, which is gradually translocated from the side of the neck to a more cranial and lateral site. The first pharyngeal arch gives rise to the tragus, the helix and the cymba conchae; the second pharyngeal arch forms the antitragus, the antihelix and the concha.


Fig. 1.24

Development of the external ear (from Paulsen et al. 2010, courtesy of Elsevier)

Anomalies of the External Ear

Malformations of the external ear are caused by faulty development of one or several auricular hillocks. They result in deformities of three grades of severity. Dysplasia grade I represents only a slight deformation; most elements of a normal pinna are present. Dysplasia grade II comprises moderate deformations; only some structures of a normal ear are identifiable. Dysplasia grade III is characterised by severe deformations; nothing of a normal pinna is recognisable.

Malformations may be further classified according to the size of the auricle (macrotia, microtia, anotia), the shape of the ear (cup-shaped, lop ear, ear dysplasia, elfin (pointed) ear, lobe malformations), the position of the ears (melotia, low set ears, synotia) and other malformations such as auricular fistulas or appendages.

1.2.2 Anatomy

The Palate

The hard palate is formed by two bones, which are covered by a mucous membrane: the palatine processes of the maxillae and the horizontal parts of the palatine bones (Fig. 1.25). These bones continue into the soft palate, which contains a membranous aponeurosis. The soft palate, also called velum palatinum, is a movable, fibromuscular fold that is attached to the posterior edge of the hard palate. It separates the superior nasopharynx from the inferior oropharynx. Laterally, the soft palate is continuous with the wall of the pharynx and is joined to the tongue and pharynx by the palatoglossal and palatopharyngeal folds.


Fig. 1.25

Aspect of the mouth and palate (from Paulsen et al. 2010, courtesy of Elsevier)

The muscles of the soft palate are as follows:

The levator veli palatini, extending from the cartilage of the auditory tube and petrous part of temporal bone to the palatine aponeurosis. It elevates the soft palate, drawing it superiorly and posteriorly and also opens the auditory tube to regulate air pressure in the middle ear. It is innervated by a pharyngeal branch of the vagus via the pharyngeal plexus.

The tensor veli palatini, extending from the scaphoid fossa of the medial pterygoid plate, the spine of the sphenoid bone and the cartilage of the auditory tube to the palatine aponeurosis. It tenses the soft palate by using the hamulus as a pulley. It also acts on the membranous portion of the auditory tube in the same sense as the levator muscle. Innervation is through the medial pterygoid nerve (a branch of the mandibular nerve).

The musculus uvulae, which arises from the posterior nasal spine and palatine aponeurosis and inserts into the mucosa of the uvula. When the muscle contracts, it shortens the uvula and pulls it upwards. The pharyngeal branch of the vagus innervates the muscle via the pharyngeal plexus.

The palatoglossus muscle between the palatine aponeurosis and the side of tongue. The mucous membrane covering the muscle forms the palatoglossal arch. The muscle elevates the posterior part of the tongue and draws the soft palate downwards onto the tongue.

The palatopharyngeus muscle, extending from the hard palate and palatine aponeurosis to the lateral wall of pharynx. Its mucous membrane forms the palatopharyngeal arch. The muscle tenses the soft palate and pulls the walls of the pharynx upwards, forwards and medially during swallowing. Both muscles are supplied by the cranial part of accessory nerve (CN XI) joining with the pharyngeal branch of vagus via the pharyngeal plexus.

The sensory nerves of the palate, which are branches of the pterygopalatine ganglion, are the greater (major) and lesser (minor) palatine nerves (Fig. 1.26). They accompany the arteries through the greater and lesser palatine foramina, respectively.


Fig. 1.26

Sensory innervation of the soft palate

The palate has an abundant blood supply from branches of the maxillary artery.

The Pharynx

The pharynx is a fibromuscular tube that spans vertically from the base of the skull to the oesophagus. Being situated posterior to the nasal and oral cavities and posterior to the larynx, it is therefore divisible into the nasopharynx, oropharynx and laryngopharynx, which ends at the inferior border of the cricoid cartilage, where it becomes continuous with the oesophagus.

The anterior part of the nasopharynx communicates through the choanae with the nasal cavities. Its lateral walls contain the pharyngeal ostia of the auditory tube, bounded behind by the torus tubarius, a prominence of the mucous membrane caused by the medial end of the cartilage of the tube. On the posterior wall of the nasopharynx, an assembly of lymphatic tissue is located, known as the pharyngeal tonsil.

The oropharynx, or mesopharynx, lies behind the oral cavity, extending from the uvula to the level of the hyoid bone. It opens anteriorly, through the isthmus faucium, into the mouth. The anterior wall consists of the base of the tongue; the superior wall consists of the inferior surface of the soft palate and the uvula. Its entrance, the isthmus faucium, is formed by the palatoglossal and palatopharyngeal arches on each side of the oral cavity; the palatine tonsil lies between them.

The laryngopharynx extends from the superior border of the epiglottis to the inferior border of the cricoid cartilage, where it becomes continuous with the oesophagus. Its anterior wall is the rear of the epiglottis and the posterior aspects of the arytenoid and cricoid cartilages. The piriform recess is part of the cavity of the laryngopharynx, situated on each side of the inlet of the larynx.

The lateral and posterior walls (Fig. 1.27) of the three parts of the pharynx are formed by various muscles: two of them, the palatopharyngeal muscle with its origin in the soft palate and the salpingopharyngeal muscle, originating at the auditory tube, are longitudinally orientated and form the innermost muscular layer. They are covered by the three constrictors, the upper (superior), middle and lower (inferior) constrictor muscle.


Fig. 1.27

Dorsal aspect of the pharynx (from Benninghoff and Drenckhahn 2003, courtesy of Elsevier)

These three pharyngeal constrictors originate from antero-laterally placed structures:

  • The superior constrictor emanates from the pterygomandibular raphe, the pterygoid hamulus and the buccinator ridge of the mandible. The right and left muscles run posteriorly and superiorly. Their superior attachment is to the pharyngeal tubercle on the base of the skull, and the largest part of the muscle meets its companion muscle from the opposite side to form a midline pharyngeal raphe.

  • The middle constrictor originates from the hyoid bone and the stylohyoid ligament and meets its partner at the pharyngeal raphe.

  • The inferior constrictor originates from the oblique line of the thyroid cartilage and from the cricoid cartilage. It meets its partner to contribute to the midline posterior raphe.

Finally, the stylopharyngeus muscle, beginning at the styloid process, runs into a gap between the upper and the middle constrictor and ends at the thyroid cartilage. All of these muscles are of the striated type.

Altogether, the tubulo-muscular wall of the pharynx consists of four layers: a mucous membrane, the pharyngeal aponeurosis, the muscle layer and the buccopharyngeal fascia.

The motor nervous and most of the sensory nervous supply to the pharynx is by way of the pharyngeal plexus, which, situated mainly on the middle constrictor, is formed by the pharyngeal branches of the vagus and glossopharyngeal nerves and also by sympathetic nerve fibres.

Blood supply of the pharynx is ensured by pharyngeal branches of the ascending pharyngeal artery, ascending palatine artery, descending palatine artery and pharyngeal branches of inferior thyroid artery. Veins collect the blood into the pharyngeal plexus.

The Larynx

The larynx is located in the anterior neck, ventral to cervical vertebrae 3–6. It connects the pharynx with the trachea, regulates the flow of air to and from the lungs for respiration and vocalisation, and guards the air passages against the entry of food and liquids. Its ventral prominence is called the Adam’s apple. The larynx extends from the tip of the epiglottis to the inferior border of the cricoid cartilage. Its interior can be divided into three parts, the supraglottis, the transglottis and the subglottis (see below).

The skeleton of the larynx is composed of nine cartilages, three unpaired and three paired (Fig. 1.28):


Fig. 1.28

Skeleton of the larynx, the thyroid cartilage has been dissected (from Paulsen et al. 2010, courtesy of Elsevier)

First is the thyroid cartilage, which is hyaline. Its superior margin and its superior horn are attached to the hyoid bone by the thyrohyoid membrane, which is thickened centrally and laterally to form the median and lateral thyrohyoid ligaments. Its inferior horn connects to the cricoid cartilage and takes part in the cricothyroid articulation.

The hyaline cricoid cartilage is situated below the thyroid cartilage. It is the only one that encircles the entire larynx. It is attached to the thyroid cartilage via the median cricothyroid ligament and to the first ring of the trachea via the cricotracheal ligament.

Two mostly hyaline arytenoid cartilages of pyramidal shape are positioned dorsally on the superior margin of the cricoid cartilage. They are connected to the vocal ligaments by their vocal process, and their muscular process serves for muscular attachment. Each of them has an elastic corniculate cartilage on its top. The latter connect to the cricoid cartilage via the posterior cricoarytenoid ligament.

Behind the thyroid cartilage protrudes the epiglottis, a spoon-shaped elastic cartilage, which is connected to the thyroid cartilage by the thyroepiglottic ligament. It contacts the arytenoid cartilages via the quadrangular membrane, into which two elastic cuneiform cartilages are embedded.

The most prominent and most important ligaments of the larynx are the vocal ligaments, converging from the vocal processes of the arytenoids to the posterior surface of the thyroid. They serve as a margin for the conus elasticus, extending downwards to the cricoid cartilage.

Two pairs of joints affect the vocal ligaments. The cricothyroid joints allow tilting and, to a small extent, gliding between the thyroid and cricoid cartilages, thereby stretching or loosening the vocal ligaments. The cartilages are moved by the straight and oblique parts of the external cricothyroid muscle (Fig. 1.29). The muscle is innervated by the superior laryngeal nerve, which branches from the main trunk of the vagus nerve.


Fig. 1.29

Outer muscles of the larynx (from Paulsen et al. 2010, courtesy of Elsevier)

The cricoarytenoid joints permit gliding and rotation of the arytenoid cartilages, thus changing the positions of the vocal ligaments. Adductors of the vocal ligaments are the lateral cricoarytenoid muscle, the oblique and transverse arytenoid muscles and the vocalis muscle (by increasing its diameter in isometric contraction). The posterior cricoarytenoid muscle acts as an abductor of the vocal ligaments, whereas the aryepiglottic, thyroarytenoid and vocalis muscles are effective in reducing the tension of the vocal ligaments (Fig. 1.30). All of these muscles are innervated by the recurrent laryngeal nerve, a branch of the vagus nerve.
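The muscle actions described in the two paragraphs above can be collected in one small table. The Python sketch below is only an illustrative restatement of the text; the action labels, the dictionary and the helper function are hypothetical names introduced for this example.

```python
# Illustrative restatement of the laryngeal muscle actions described above.
# The action labels and all identifiers are hypothetical and exist only
# for this sketch.
RECURRENT = "recurrent laryngeal nerve"
SUPERIOR = "superior laryngeal nerve"

# muscle -> (set of actions on the vocal ligaments, nerve supply)
LARYNGEAL_MUSCLES = {
    "cricothyroid": ({"stretching or loosening"}, SUPERIOR),
    "lateral cricoarytenoid": ({"adduction"}, RECURRENT),
    "oblique arytenoid": ({"adduction"}, RECURRENT),
    "transverse arytenoid": ({"adduction"}, RECURRENT),
    "vocalis": ({"adduction", "tension reduction"}, RECURRENT),
    "posterior cricoarytenoid": ({"abduction"}, RECURRENT),
    "aryepiglottic": ({"tension reduction"}, RECURRENT),
    "thyroarytenoid": ({"tension reduction"}, RECURRENT),
}


def muscles_with_action(action: str) -> list[str]:
    """Return, in alphabetical order, the muscles listed with the given action."""
    return sorted(
        name
        for name, (actions, _nerve) in LARYNGEAL_MUSCLES.items()
        if action in actions
    )


print(muscles_with_action("abduction"))
print(muscles_with_action("adduction"))
```

Listing the muscles this way highlights that, in this account, the posterior cricoarytenoid is the only abductor and the cricothyroid the only muscle not supplied by the recurrent laryngeal nerve.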


Fig. 1.30

Inner muscles of the larynx (from Paulsen et al. 2010, courtesy of Elsevier)

The internal cavity of the larynx (Fig. 1.31) is divided into three parts. It starts with the vestibule of the larynx (supraglottis), the laryngeal inlet, which extends from the upper border of the epiglottis down to the ventricular folds. These are mucous membrane folds forming the lower free edge of the quadrangular membrane; they run from the thyroid cartilage above the vocal ligament to the arytenoid cartilages. They contain large sero-mucous glands, which serve to moisten the vocal folds. The vestibule continues into the ventricle of the larynx (transglottis), which extends between the vestibular and vocal folds.


Fig. 1.31

Interior of the larynx seen from the dorsal aspect (from Paulsen et al. 2010, courtesy of Elsevier)

The vocal folds extend from the angle of the thyroid to the vocal processes of the arytenoid cartilages. They are important for phonation because they control the stream of air through the rima glottidis, the variable gap between them. They alter the shape and size of the wedge-shaped rima glottidis by movement of the arytenoids to ensure respiration or phonation.

Below the vocal folds, the subglottal space extends to the lower border of the cricoid cartilage.

On both sides of the laryngeal inlet, the piriform recesses ensure continuity between the pharynx and the beginning of the oesophagus, where they meet. They are bordered medially by the aryepiglottic fold, laterally by the thyroid cartilage and the thyrohyoid membrane (Fig. 1.32).


Fig. 1.32

Horizontal section of the larynx above the vocal ligaments (from Paulsen et al. 2010, courtesy of Elsevier)

The epithelium of the vocal fold (Fig. 1.33) is of the non-keratinised stratified squamous type. It changes to a ciliated pseudostratified epithelium on the posterior glottis, ventricular folds and trachea. The lamina propria may be divided into three layers according to its histological composition. The superficial layer, also called Reinke’s space, is pliable and flexible and consists of loose connective tissue. It is densely interwoven with the epithelium by digital projections containing small vessels. It continues into an intermediate and a deep layer. There is an increase in the presence of fibrous and interstitial proteins in the intermediate and, to a greater extent, in the deep layer of the lamina propria, both making up the vocal ligament. The intermediate layer is marked by a distinct elevation in the relative amount of elastin and collagen, and this increases even more within the deep layer (Jette and Thibeault 2011). The deep layer is adjacent to the vocal muscle, which may be considered as a medial part of the thyroarytenoid muscle.


Fig. 1.33

Histology of the vocal fold (from Klinger and Schramm 2001, courtesy of Prof. Klinger)

The larynx as a whole is embedded in the vertically running muscles of the anterior neck (Fig. 1.34). Two longer muscles, the sternohyoid and the omohyoid, ventrally cover a group of shorter muscles that insert directly at the thyroid. These extrinsic muscles may be divided into two groups. The elevators of the larynx (mainly suprahyoid muscles) comprise muscles that pull the hyoid upwards (the digastric, the mylohyoid, the genioglossus, the stylohyoid and the stylopharyngeus) and the thyrohyoid muscle, which acts directly on the thyroid. The depressors of the larynx are the sternothyroid, the omohyoid and the sternohyoid muscles, the latter two acting by pulling the hyoid downwards.


Fig. 1.34

Muscles of the anterior neck (from Paulsen et al. 2010, courtesy of Elsevier)

The entire larynx is innervated by the vagus nerve. The vagus gives off a superior branch that leaves the main trunk high in the neck. Approximately at the level of the hyoid bone, this superior laryngeal nerve divides into an external and an internal branch. The only function of the external branch is the motor innervation of the cricothyroid muscle.

The internal branch passes through a foramen in the thyrohyoid membrane together with the superior laryngeal artery and vein. It provides general sensation, including pain, touch and temperature for the tissue superior to the vocal folds.

The lower part of the larynx is supplied by the recurrent laryngeal nerve. It contains motor fibres to innervate all the intrinsic muscles of the larynx—except for the cricothyroid muscle—as well as both sensory and secretory fibres to the glottis, subglottis and trachea. The right recurrent laryngeal nerve leaves the vagus nerve, which parallels the internal jugular vein, near the point where the brachiocephalic trunk divides. The left recurrent laryngeal nerve emanates from the vagus nerve near the aortic arch. Both branches cross dorsally below the adjacent vessel and ascend laterally next to the trachea. They often terminate in forming an anastomosis with the ipsilateral internal branch of the superior laryngeal nerve.

The larynx receives its arterial supply from the superior laryngeal artery, a branch of the superior thyroid artery that accompanies the internal laryngeal nerve, and from the inferior laryngeal artery, a branch of the inferior thyroid artery that runs parallel to the recurrent laryngeal nerve.

The Tongue

The relaxed tongue takes up most of the space inside the oral cavity. It basically comprises muscles surrounded by a mucous membrane. The posterior one third of the tongue, the root, is attached to the floor of the oral cavity. The mobile anterior two thirds of the tongue is called the body, and the tip is the apex.

The surface, or dorsum, contains numerous projections of the mucous membrane called papillae. They contain taste buds, which can sense five types of sensations: sweet, salty, sour, bitter and umami, which is a savoury meaty flavour. In addition, serous glands of the mucosa secrete some of the fluid of the saliva.

The inferior surface of the tongue is covered by a thin transparent membrane. A large fold of mucosa, called the frenulum, runs down the midline. The ducts of the submandibular salivary glands open at the base of the frenulum.

The muscles of the tongue are divided by the lingual septum. Four pairs are intrinsic, and four pairs are extrinsic (Table 1.2, Fig. 1.35).

Table 1.2

Internal and external muscles of the tongue

| Muscle | Origin | Insertion | Function |
|---|---|---|---|
| Internal muscles | | | |
| Superior longitudinal muscle | Submucosal fibrous layer and septum | Margins of the tongue and mucous membrane | Curls the tongue upwards and shortens it |
| Inferior longitudinal muscle | Root of the tongue and hyoid bone | | Curls the tongue downwards and shortens it |
| Transverse muscle | Septum of the tongue | Lateral margins of the tongue | Narrows and protrudes the tongue |
| Vertical muscle | Submucosal fibrous layer of the dorsum of the tongue | Inferior surfaces of the borders of the tongue | Flattens and broadens the tongue |
| External muscles | | | |
| Genioglossus muscle | | Entire dorsum of the tongue and hyoid bone | Protrudes the tongue and assists with other movement |
| Hyoglossus muscle | Hyoid bone | Inferior and lateral parts of the tongue | Depresses and shortens the tongue |
| Styloglossus muscle | Styloid process of temporal bone | Posterior parts of the tongue | Retracts the tongue and curls its sides |
| Palatoglossus muscle | Palatine aponeurosis | Posterolateral parts of the tongue | Elevates the posterior part of the tongue and depresses the soft palate |


Fig. 1.35

External muscles of the tongue (from Paulsen et al. 2010, courtesy of Elsevier)

All muscles of the tongue, except for the palatoglossus, are innervated by the hypoglossal nerve (CN XII). The palatoglossus is innervated by the pharyngeal plexus (CN X).

The somatosensory innervation of the anterior two thirds of the tongue is provided by the mandibular nerve via the lingual nerve; the visceral sensory innervation is by the facial nerve via the chorda tympani. The posterior third of the tongue has somatosensory and visceral sensory innervation from the glossopharyngeal nerve. The somatosensory innervation of the root is by the vagus nerve (Fig. 1.36).
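The motor and sensory supply described in the two paragraphs above can be written down as two lookup tables. The Python sketch below is only an illustrative restatement of the text; all identifiers and the helper function are hypothetical, and a missing entry means simply that the text does not name a nerve.

```python
# Illustrative restatement of the tongue innervation described above.
# All identifiers are hypothetical and exist only for this sketch.
HYPOGLOSSAL = "hypoglossal nerve (CN XII)"

# Motor supply of the eight muscle pairs listed in Table 1.2.
MOTOR_SUPPLY = {
    "superior longitudinal": HYPOGLOSSAL,
    "inferior longitudinal": HYPOGLOSSAL,
    "transverse": HYPOGLOSSAL,
    "vertical": HYPOGLOSSAL,
    "genioglossus": HYPOGLOSSAL,
    "hyoglossus": HYPOGLOSSAL,
    "styloglossus": HYPOGLOSSAL,
    "palatoglossus": "pharyngeal plexus (CN X)",
}

# Sensory supply: region -> modality -> nerve.
SENSORY_SUPPLY = {
    "anterior two thirds": {
        "somatosensory": "lingual nerve (mandibular nerve)",
        "visceral sensory": "chorda tympani (facial nerve)",
    },
    "posterior third": {
        "somatosensory": "glossopharyngeal nerve",
        "visceral sensory": "glossopharyngeal nerve",
    },
    "root": {
        "somatosensory": "vagus nerve",
    },
}


def sensory_nerve(region: str, modality: str):
    """Return the nerve named in the text, or None if the text names none."""
    return SENSORY_SUPPLY.get(region, {}).get(modality)


# Muscles that are the exception to the hypoglossal rule.
non_hypoglossal = [m for m, nerve in MOTOR_SUPPLY.items() if nerve != HYPOGLOSSAL]

print(non_hypoglossal)
print(sensory_nerve("anterior two thirds", "visceral sensory"))
```

Keeping motor and sensory supply in separate tables mirrors the text, which treats the muscles by name and the sensory supply by region.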


Fig. 1.36

Sensory innervation of the tongue

The tongue receives its blood supply from the lingual artery, a branch of the external carotid artery. It is drained by the lingual veins, which empty into the internal jugular vein.

Swallowing

Swallowing is a complex series of sequential neuromuscular events that are integrated into a smooth and continuous process, which is divided into three stages: oral, pharyngeal and oesophageal.

The oral phase of swallowing can be further subdivided into the oral preparatory and the oral transport phase. In the oral preparatory phase, the lips, tongue, mandible, palate and cheeks act together with salivary flow to bring food into a consistency and position appropriate for the subsequent phases of swallowing. Once the food bolus is prepared, the oral transport phase occurs, as the musculature of the lips and cheeks contracts, followed by tongue contraction against the hard palate. The soft palate elevates as a consequence of contraction of the tensor veli palatini, levator veli palatini and palatopharyngeus muscles. This prevents reflux of food into the nasal cavity.

The anterior two thirds of the tongue are critical in the oral phase of deglutition. The posterior one third of the tongue, the tongue base, plays an important role in propelling a food bolus posteriorly towards the pharynx.

The nerves involved so far are the trigeminal nerve (CN V) to control general sensation to the face and motor supply to the muscles of mastication, the facial nerve (CN VII) to supply taste to the anterior two thirds of the tongue and motor function to the lips, the glossopharyngeal nerve (CN IX) to provide general sensation to the posterior third of the tongue and the hypoglossal nerve (CN XII) to enable movements of the tongue.

Once the food bolus touches the palatoglossal folds, the pharyngeal phase of swallowing reflexively begins.

When the swallowing reflex is initiated, the following reactions take place. Velopharyngeal closure prevents reflux of material into the posterior choanae. This is effected by contraction of the levator veli palatini muscles, which elevate the soft palate against the posterior nasopharyngeal wall. Medial contraction of the lateral pharyngeal wall musculature and a slight anterior movement of the posterior pharyngeal wall create Passavant’s ridge, against which the velum is approximated during the initiation of the pharyngeal phase of swallowing. The pharyngeal constrictor muscles contract in a superior-to-inferior direction. The epiglottis inverts to cover the larynx and prevent aspiration of contents into the airway. This retroversion of the epiglottis directs the food bolus laterally towards the pyriform sinuses. The vocal folds adduct to prevent aspiration.

With contraction of the superior pharyngeal constrictor muscle, laryngeal elevation occurs. The larynx elevates following the anterior movement of the hyoid bone and tongue base owing to contraction of the mylohyoid, geniohyoid, stylohyoid and anterior digastric muscles. This anterior movement of the larynx combined with the contraction of the middle and inferior constrictor muscles forces the food bolus inferiorly, initiating the final portion of the pharyngeal phase, which is the entry of the food bolus into the cervical oesophagus.

The mylohyoid nerve, a branch of CN V3, supplies the mylohyoid and the anterior digastric muscles. The stylohyoid muscle is innervated by branches of the facial nerve (CN VII), and the geniohyoid muscle receives fibres from the first cervical nerve, which joins the hypoglossal nerve. The pharyngeal constrictors have their nervous supply through the glossopharyngeal (CN IX) and the vagus (CN X) nerves.

The Ear

Outer Ear

The pinna or auricle (Fig. 1.37) is a prominent skin-covered flap located on the side of the head and is the external visible part of the ear. It is shaped and supported by cartilage except for the earlobe. The outer verge of the ear is called the helix, and the inner elevated rim is the antihelix, which originates from the fusion of two crura, between which is a triangular depression, the fossa triangularis. The deepest depression, which leads to the ear canal, is known as the concha. It is overlapped by the tragus, a small cartilaginous flap that can be pushed down to block the opening to the ear canal.


Fig. 1.37

The ear, overview (from Benninghoff and Drenckhahn 2003, courtesy of Elsevier)

The pinna collects sound waves and directs them to the external ear canal. Its shape also partially shields sound waves that approach the ear from the rear, therefore enabling a person to tell whether a sound is coming directly from the front or the back.

The external auditory canal (meatus acusticus externus) begins at the bottom of the concha and ends at the tympanic membrane. It is approximately 2.5–3 cm long and slightly S-curved. It is supported by cartilage in its outer third and by bone for the rest of its length. It exhibits two narrowings, one near the inner end of the cartilaginous portion and another, the isthmus, within the osseous part. The whole tube is lined by skin and contains glands whose secretions mix with dead skin cells to form cerumen (earwax).

Middle Ear

The outer ear ends, and the middle ear (Fig. 1.38) begins, at the tympanic membrane, commonly known as the eardrum. It lies in the tympanic cavity within the temporal bone. The cavity connects to the nasal part of the pharynx via the auditory tube. It is therefore filled with air, normally of the same atmospheric pressure as the outer ear. This ensures that the tympanic membrane can swing freely between both ear parts. Within the tympanic cavity, the vibrations of the tympanic membrane are transmitted to the oval window, the beginning of the inner ear, by a chain of ossicles: the malleus, which has a handle that attaches to the inner surface of the eardrum and a head that is suspended from the wall of the tympanic cavity; the incus, which is connected by its body to the head of the malleus and by its long arm to the stapes. Both its body and its short arm are fixed to the wall of the tympanic cavity by ligaments. The third ossicle, the stapes, has an arch and a footplate. The arch connects to the incus, whereas the footplate is held by a ring-like piece of tissue in the oval window, which is the entrance into the inner ear.


Fig. 1.38

The middle ear (from Zilles and Tillmann 2010, courtesy of Springer)

Two muscles influence the movements of the middle ear bones. The tensor tympani (innervated via a branch of the mandibular nerve), whose tendon inserts on the medial part of the malleus, pulls the malleus medially, tensing the tympanic membrane, damping its vibration and thereby reducing the amplitude of transmitted sounds. The stapedius muscle (innervated by a branch of the facial nerve) inserts into the posterior neck of the stapes and reflexively lessens its vibrations by pulling its head backwards.

Inner Ear

The oval window, where the footplate of the stapes is fixed, is the beginning of the inner ear, a system of osseous cavities within the petrosal part of the temporal bone, called the labyrinth (Fig. 1.39). It consists of three components, the vestibule (vestibulum), the three semicircular canals and the cochlea.


Fig. 1.39

The labyrinth of the inner ear (from Paulsen et al. 2010, courtesy of Elsevier)

The Organs of Equilibrium: Vestibulum and Semicircular Canals

The organ of equilibrium (Fig. 1.40) consists of five compartments, two saccular organs, the utriculus and the sacculus, and three semicircular canals, orientated at right angles to each other, corresponding to the three dimensions of space (Speckmann et al. 2013). The inner membranous compartments of these cavities contain endolymph and do not directly contact their osseous walls but are separated by a small space filled with perilymph. Via the ductus endolymphaticus, the five membranous compartments are continuous to the endolymphatic sac, which in turn has contacts to the dura mater. Both of them have absorptive and secretory functions and regulate the volume and composition of the endolymph. This endolymph is not identical with the endolymph of the cochlear duct (see below) but differs in its ion content.


Fig. 1.40

Components of the organ of equilibrium (green) and their sensory areas (from Speckmann et al. 2013, courtesy of Elsevier)

The sensory element of the sacculus, as well as of the utriculus, is named the macula (Fig. 1.41). It is a flat area containing roughly 16,000 or 30,000 hair cells, respectively. They are covered by a gelatinous layer, in which otoliths, small crystals of calcium carbonate, are embedded. This layer has a higher density than the endolymph. When the otolith membrane is displaced, either by tilting the head or by linear acceleration or deceleration, the ciliary hairs of the sensory cells are deflected and evoke depolarisations or hyperpolarisations that modulate the spontaneous activity of the vestibular nerve fibres. The macula utriculi is nearly horizontally orientated and reacts mainly to horizontal accelerations (e.g. a car speeding up or slowing down); the macula sacculi, on the other hand, is in an approximately vertical position and therefore registers vertical accelerations (e.g. the movements of a lift).


Fig. 1.41

The macula organ. Resting position (left), macula in action (right) (from Speckmann et al. 2013, courtesy of Elsevier)

The semicircular ducts (Fig. 1.42), the membranous components of the semicircular canals, start and end within the recessus ellipticus of the vestibulum. One limb of each duct is enlarged at its end to form an ampulla, which houses the sensory organ. Each ampulla contains a so-called crista ampullaris, a specialised connective tissue with approximately 7000 hair cells. They are covered by a dome of jelly-like material (cupula ampullaris), which does not touch the cells directly: only the sensory hairs of the cells contact the cupula through a small gap. The cupula has nearly the same density as the endolymph, so it exerts no direct pressure on the sensory cells. When the head turns, the inertia of the endolymph and the cupula makes them lag behind, producing a relative movement within the semicircular ducts opposite to the head rotation. This induces a shear force on the hairs of the sensory cells, which causes depolarisation or hyperpolarisation, depending on the direction of turning.


Fig. 1.42

The organ of the semicircular ducts. Resting position (left), activation by turning of the head (right) (from Speckmann et al. 2013, courtesy of Elsevier)

The Cochlea (Fig. 1.43)

The vestibulum continues into the cochlea, which winds 2½ turns around a section of spongy bone called the modiolus. The modiolus is shaped like a screw whose threads, the lamina spiralis ossea, form a spiral platform that supports the membranous parts of the cochlea. The cochlea measures about 35 mm in length. It contains three fluid-filled chambers separated by membranes. The upper chamber, scala vestibuli, and the bottom chamber, scala tympani, are filled with perilymph. They communicate with each other at the top of the modiolus through the helicotrema. The scala tympani ends at the round window, which opens towards the middle ear but is closed by the secondary tympanic membrane. Between these two perilymphatic ducts lies the triangular scala media or ductus cochlearis. Its basilar membrane extends laterally from the osseous spiral lamina to the outer wall of the cochlea. It separates the scala media from the scala tympani and carries the organ of Corti. Its inclining roof, a more delicate membrane, is called Reissner’s membrane and serves as the boundary against the scala vestibuli. The lateral wall of the scala media consists of a specialised epithelium, the stria vascularis (as it contains blood vessels); it generates the endolymph, with which the scala media is filled.


Fig. 1.43

Cross-section of cochlear spiral canal (from Paulsen et al. 2010, courtesy of Elsevier)

The basilar membrane and its medial continuation, the osseous spiral lamina, carry the organ of Corti, which changes pressure waves into nervous impulses.

The organ of Corti (Fig. 1.44) is composed of a series of epithelial structures. As a central part, the inner and outer rods or pillars of Corti flank a triangular tunnel, the tunnel of Corti. On both sides of the tunnel, the inner and outer hair cells are located. These are short columnar cells; their free ends are level with the heads of Corti’s rods. The approximately 3500 inner hair cells are arranged in a single row on the medial side of the inner rods. They do not reach the basilar membrane but are positioned within supporting phalangeal cells. This facilitates abundant contacts of nerve endings with the cell bodies. The name ‘hair cells’ originates from ciliary structures on their tips, which contact the tectorial membrane. The outer hair cells (approx. 12,000) are nearly twice as long as the inner ones. They are supported by outer phalangeal cells, the cells of Deiters. A space exists between the outer rods of Corti and the adjacent hair cells; this is called the space of Nuel. The Deiters’ cells are neighboured by five or six rows of columnar cells, the supporting cells of Hensen, followed by another group of columnar cells, the cells of Claudius. Covering the sulcus spiralis internus and the spiral organ of Corti is the tectorial membrane, which is attached to the limbus laminae spiralis close to the inner edge of the vestibular membrane. It has contact with the cilia of the inner as well as of the outer hair cells.


Fig. 1.44

The organ of Corti (from Benninghoff and Drenckhahn 2004, courtesy of Elsevier)

The so-called cochlear transduction process of sound, from air waves to nervous impulses, is the result of several steps.

Vibrations of the eardrum are transmitted through the three ossicles of the middle ear, which induce pressure waves within the scala vestibuli. These pressure waves generate a travelling wave on the basilar membrane of the scala media. The basilar membrane vibrations cause shear movements between the tectorial membrane and the stereocilia of both types of hair cells, resulting in deflection of the stereocilia and activation of ion channels. Thereby the mechanical stimulus is transduced, and receptor potentials are generated. In inner hair cells, these receptor potentials induce neurotransmitter release and action potential generation in the synapses of auditory nerve fibres.

The receptor potentials of the outer hair cells, however, initiate a contraction in the longitudinal axis of the cells, which influences the basilar membrane’s motions (Fettiplace and Hackney 2006; Ashmore 2008). The cells thus act as a ‘cochlear amplifier’, a mechanism that increases both the amplitude and frequency selectivity of basilar membrane vibration for low-level sounds. These activities of the outer hair cells can be measured as otoacoustic emissions (OAE). Efferent nerve fibres contact the outer hair cells by crossing the tunnel of Corti medially as radial fibres. They are thought to adjust the resting membrane potential of these cells, thereby regulating the amount of feedback provided to the basilar membrane.

Auditory Pathway and Vestibular Tracts

Auditory Pathway

About 90% of the afferent nerve fibres that leave the organ of Corti come from the inner hair cells. They are Type I nerve fibres, i.e. thick, myelinated and fast conducting. The remaining 10% are afferent fibres from the outer hair cells; they are slow-conducting Type II fibres and cross the tunnel of Corti as basilar tunnel fibres. The cell bodies of all of these fibres form the spiral (or cochlear) ganglion, embedded within the central part of the modiolus. Their axons continue as the acoustic nerve (pars cochlearis of the vestibulocochlear nerve, cranial nerve VIII) and enter the brainstem (Fig. 1.45) (ten Donkelaar 2011).


Fig. 1.45

The auditory pathway (from Benninghoff and Drenckhahn 2004, courtesy of Elsevier)

Each of the nerve fibres divides: one branch projects rostrally to the dorsal cochlear nucleus, and the other projects caudally to the ventral cochlear nucleus. The cochlear nuclei contain second-order neurons, which generally project to higher centres by an ipsilateral or, after decussating, by a contralateral pathway.

The ventral cochlear nucleus projects to the superior olivary complex, whereas fibres of the dorsal cochlear nucleus bypass the superior olivary complex and directly enter the lateral lemniscus to reach the inferior colliculus.

The superior olivary complex consists of the medial nucleus of the superior olive, the lateral nucleus of the superior olive and the medial nucleus of the trapezoid body. Both nuclei of the superior olive receive fibres from the ipsilateral and contralateral ventral cochlear nucleus. The medial nucleus analyses time differences of neuronal signals; the lateral nucleus evaluates differences in intensities. On their way to the nuclei of the superior olive, the contralateral fibres pass through the trapezoid body as a passive relay. Its nuclei, however, are thought to play a role in distinguishing inter-aural intensity differences in humans as in other mammals.

The superior olivary complex sends outputs to the cranial nerves V and VII for reflex contractions of the tensor tympani and stapedius muscles to dampen loud sounds.

The fibres connecting the olivary complex with the inferior colliculus form a lateral tract in the brainstem, called the lateral lemniscus. Within this tract a mass of grey matter is embedded, called the nuclei lemnisci lateralis dorsalis and ventralis. They serve as synaptic relay stations for some of the fibres of the lateral lemniscus.

The end point of the lateral lemniscus is the inferior colliculus. It serves as an auditory relay and reflex centre, where information derived directly from the dorsal cochlear nucleus and from the olivary complex may be compared. Moreover, it receives inputs from the somatosensory system. The inferior colliculi of both sides contact each other by commissural fibres.

The inferior colliculus projects to the medial geniculate nucleus or medial geniculate body. This nucleus is part of the thalamus and serves as a thalamic relay on the way to the auditory cortex. From their neuronal morphology, a number of subdivisions can be distinguished. They have different afferent and efferent connections, and they are thought to be involved together in the direction and maintenance of attention. The fibres leaving the medial geniculate body join the internal capsule as the radiatio acustica and terminate in the primary auditory cortex, the Brodmann’s areas 41 and 42 within the superior temporal gyrus.

Nervous activation of individual parts of the auditory pathway may be recorded as the auditory brainstem response (ABR) upon a stimulus that evokes a response of the cochlea. The first two waves correspond to true action potentials of the cochlear nerve. Later waves may reflect postsynaptic activities (afferent and efferent) in brainstem auditory centres:

  • Wave I is a response of the auditory nerve action potential in the distal portion of the cochlear nerve (cranial nerve VIII).

  • Wave II is generated by the proximal part of the cochlear nerve as it enters the brainstem.

  • Wave III arises from second-order neuron activities in or near the cochlear nucleus.

  • Wave IV is said to arise from the superior olivary complex, but additional contributions may come from the cochlear nucleus and nucleus of lateral lemniscus.

  • Wave V is believed to originate from the vicinity of the inferior colliculus.

  • Waves VI and VII are suggested to have their origins in the medial geniculate body of the thalamus.

Beginning in the auditory cortex, several descending pathways exist. The centrifugal fibres run close to, but usually not within, the tracts containing the auditory afferent pathways. They project to the medial geniculate bodies, the inferior colliculi and other midbrain nuclei; from the inferior colliculi, they descend to the superior olivary complex. From here the olivo-cochlear bundles travel down to the inner and outer hair cells of the cochlea. They carry information from the superior olivary complex, the nuclei of the lateral lemniscus, the reticular formation and the inferior colliculus. These bundles have two compartments: a lateral one, containing unmyelinated fibres, which end on the afferent fibres of the inner hair cells, and a medial one with myelinated fibres, which medially crosses the tunnel of Corti and terminates directly on the bodies of the outer hair cells.

The pathway of the acoustic startle reflex is part of the auditory pathway from the ear up to the nucleus of the inferior colliculus. Mainly from there but also from the dorsal nucleus of the lateral lemniscus and from parts of the superior olivary complex, it follows connections to nuclei of cranial nerves and motor centres in the reticular formation. The motoneurons of the ventral root are activated, and moving of the body is initiated, by the reticulospinal tract in the anterolateral part of the spinal cord. The inferior colliculi also project to the superior colliculi, important relay stations in the visual pathway, coordinating optical reflexes.

Learned reflexes (e.g. turning upon hearing one’s name) are conditioned reflexes. They include the auditory pathway and other complex parts of the central nervous system.

The Vestibular Tracts

The axons emanating from the ganglia of the vestibular nerves (Fig. 1.46) project to the ipsilateral vestibular nuclei, a group of four major and several minor nuclei in the rhombencephalon (hindbrain). There they join afferent fibres from the cerebellum, from proprioceptors of the deeper neck region, as well as afferent fibres from the optical system via the superior colliculi. One of their projections is to the ventral posterior nucleus (the somatosensory nucleus) of the thalamus. These projections are bilateral (crossed and not crossed). The thalamus forwards the vestibular information to vestibular areas in the cerebral cortex. They are not as precisely circumscribed as the areas of the other sensory systems and tend to overlap each other.


Fig. 1.46

Ascending and descending connections of the vestibular nuclei, brainstem, dorsal view (modified from Nieuwenhuys et al. 2008, courtesy of Springer)

Ascending fibres of the vestibular nuclei join the medial longitudinal fascicle and enter the interstitial nucleus of Cajal (Nieuwenhuys et al. 2008), which serves as a coordination centre for eye and head movements. Over this route, the vestibular nuclei also project to ipsilateral and contralateral nuclei of the ocular muscles: the nuclei of the oculomotor, abducens and trochlear nerves. These nerves supply the six external muscles of the eye, located in three perpendicular planes that are roughly parallel to the planes of the semicircular canals. Excitatory subunits connect a single semicircular canal to the eye muscles, initiating a compensatory eye movement in the plane of the semicircular canal, whereas their antagonists are inhibited.

The vestibulo-collic reflex produces head movements in the planes of the stimulated canals. More than 30 neck muscles are innervated from the upper cervical cord and receive excitatory or inhibitory inputs, or both, from all six semicircular canals. The motoneurons are contacted by the vestibulospinal system, including lateral, medial and crossed tracts.

Descending fibres form the vestibulospinal tracts and join the reticulospinal tract, by which they reach the motor neurons of skeletal muscles, predominantly activating the extensor and inhibiting the flexor muscles.

One main target of the vestibular tracts is the cerebellum. By mossy fibres, they reach a special region, the vestibular cerebellum, mainly consisting of the flocculonodular lobe, combining the nodulus and the flocculus. The efferent fibres of this region act synergistically on the oculomotor and spinal motor systems to maintain balance in upright movements (Trepel 2004).

Neuroanatomical Basics of Language

The cerebrum is divided into four main parts. The frontal lobe is separated from the parietal lobe by the central sulcus and houses capabilities such as planning, motivation, working memory and motor functions including speech production. The parietal lobe is the main sensory part of the brain. The gyri immediately adjoining the central sulcus are the precentral gyrus, origin of the pyramidal tract, on the frontal (motor) side and the postcentral gyrus, the primary sensory centre, on the parietal side. The parietooccipital sulcus separates the parietal lobe from the occipital lobe, which houses, among others, the visual areas. The lateral sulcus (or fissura sylvii) separates the temporal lobe from the frontal and parietal lobes. The auditory centre is located in the temporal lobe, where the acoustic radiation, emanating from the medial geniculate body, terminates.

The whole cerebrum has been mapped by Korbinian Brodmann (1868–1918) according to the cytoarchitectonic patterns of its various parts. These areas are widely used to describe functional compartments of the brain.

An important part involved in language (Fig. 1.47) is Wernicke’s area, located in the posterior part of the superior temporal gyrus in Brodmann’s area 22 and part of area 39. It is predominant in the left hemisphere of right-handed persons and plays a key role in understanding words. Spoken words arrive at Wernicke’s area directly via the auditory cortex in areas 41 and 42 (Shalom and Poeppel 2008). Visual inputs reach this area from the eye via the visual cortices in Brodmann’s areas 17, 18 and 19 and the ‘read and write centre’ in the angular gyrus (Brodmann’s area 39). Wernicke’s area projects to Broca’s area and has connections to the prefrontal cortex by longitudinal association fibres (Friederici 2009).


Fig. 1.47

The brain surface; areas active in language (modified from Paulsen et al. 2010, courtesy of Elsevier)

Vocalising language requires motor outputs from the lower regions of the precentral primary motor cortex (motor areas for the face, mouth, tongue and larynx). The planning and coordination of these motor outputs originate in Broca’s area, which lies in Brodmann’s areas 44 and 45. It adjoins the lower part of the precentral cortex, receives inputs from Wernicke’s area via the arcuate fasciculus and projects to the above-mentioned corresponding primary motor areas of both hemispheres (to the contralateral side via the corpus callosum), as well as to the basal ganglia, which are thought to play a role in giving timing cues for the correct motor activity during speech (Fig. 1.47).

Recently, Hickok and Poeppel proposed a dual pathway of language processing (Fig. 1.48): on the basis of the findings that speech recognition is bilaterally organised in the superior temporal gyrus, they postulate a bilateral ventral stream comprising the primary auditory area, Heschl’s gyrus, and neighbouring areas of the superior temporal gyrus (including Wernicke’s area), adjacent areas of the middle and the inferior temporal gyrus. These areas are important in speech perception. The speech production is generated by a unilateral dorsal stream in the left hemisphere including interconnections between the supramarginal gyrus and the dorsal premotor portion (Brodmann’s area 6) as well as the area of Broca together with the lower motor gyrus of Brodmann’s area 4. The Broca region also connects to the anterior temporal lobe (anterior part of Brodmann’s area 21), which is supposed to play a role in processing complex elements of language (Hickok 2009).


Fig. 1.48

Scheme of the functional anatomy of language processing. Two broad processing streams are depicted, a ventral stream for speech comprehension that is largely bilaterally organised and that flows into the temporal lobe and a dorsal stream for sensory-motor integration that is left-dominant and that involves structures at the parietal-temporal junction and frontal lobe. ATL anterior temporal lobe; Aud auditory cortex (early processing stages); BA 45/44/6 Brodmann’s areas 45, 44 and 6; MTG/ITG middle temporal gyrus, inferior temporal gyrus; PM premotor, dorsal portion; SMG supramarginal gyrus; Spt Sylvian parietal-temporal region (left only); STG superior temporal gyrus; red line Sylvian fissure; yellow line superior temporal sulcus (STS) (from Hickok 2009, courtesy of Elsevier)

1.3 Physics, Acoustics, Psychoacoustics

Kurt Stephan

As phoniatrics is strongly linked to the phenomena of sound and sound perception, a thorough understanding of the nature of sound is essential for interpretation of diagnostic results in phoniatrics as well as for the application of therapeutic approaches to hearing and speech pathologies.

1.3.1 Sound Waves

In contrast to light waves or other electromagnetic waves present in our everyday environment, sound waves are mechanical vibrations of small particles (typically molecules) of a medium (e.g. air, water). Sound waves are created by a vibrating object, called the sound source. The vibration of a sound source results in a periodic or aperiodic compression and rarefaction of the particles of the medium, causing local fluctuations of the density that propagate through the medium. Hence, a sound wave can be thought of as a periodic or aperiodic displacement of interacting particles that travels through the medium from one location to another, resulting in a transport of energy. If the particles vibrate along the axis of sound propagation, the wave is called longitudinal; if they vibrate perpendicular to the axis of propagation, the wave is called transverse. The medium can be thought of as the material through which the wave propagates, which can be a gas, a fluid or a solid substance. Without a medium, sound can be neither produced nor propagated.

Sound waves in air and water are always longitudinal, and they travel through the medium at a speed which is specific for the medium. In air the speed of sound at sea level is 340 m/s; in water it is more than four times higher, i.e. 1484 m/s, and in solids it is about 15 times higher. Most often the perception of sound in humans is based on the propagation of sound waves in air.

1.3.2 Physical Properties of Sound

Sound is primarily characterised by wavelength, amplitude and frequency. The wavelength (λ, lambda) is defined as the distance between two successive wave crests or troughs of a sound wave, and the amplitude (A) as the maximum deviation from the resting pressure. The frequency (f), which determines the pitch of a tone, is defined as the number of pressure fluctuations per unit time. Frequency, wavelength and sound velocity (c) are related to each other by the following equation (Eq. 1.1):

$$ f\left[\mathrm{Hz}\right]=c\left[\mathrm{m}/\mathrm{s}\right]/\lambda \left[\mathrm{m}\right] $$
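
As a small numerical illustration of Eq. 1.1, the following Python sketch rearranges the relation to λ = c/f and evaluates it for a few frequencies. The sound speeds for air and water are the values given in the text; the function name `wavelength_m` and the chosen example frequencies are assumptions made here for illustration only.

```python
def wavelength_m(frequency_hz, speed_m_s):
    """Wavelength in metres from Eq. 1.1 rearranged to lambda = c / f."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return speed_m_s / frequency_hz


C_AIR = 340.0     # speed of sound in air at sea level, m/s (value from the text)
C_WATER = 1484.0  # speed of sound in water, m/s (value from the text)

# Example frequencies (illustrative choice): a very low tone, 440 Hz, a very high tone
for f in (16.0, 440.0, 20000.0):
    in_air = wavelength_m(f, C_AIR)
    in_water = wavelength_m(f, C_WATER)
    print(f"{f:8.0f} Hz: {in_air:8.3f} m in air, {in_water:8.3f} m in water")
```

For the same frequency, the wavelength in water is longer than in air by the ratio of the two sound speeds, because the frequency is fixed by the source while the speed is fixed by the medium.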


Everyday sounds do not contain only one particular frequency but are rather a mixture or superposition of frequencies and amplitudes with different temporal variation (Fig. 1.49). Only a continuous sine-shaped oscillation is regarded as a pure tone. Sound generated by a musical instrument typically consists of a fundamental frequency (i.e. the lowest tone) and a series of resonant frequencies called overtones. The combination of the fundamental and resonant frequencies creates the typical timbre of the instrument, which allows the listener to recognise it. When two tones with very similar frequencies and amplitudes are superimposed, the amplitude of the resulting sound fluctuates periodically at a much lower frequency, equal to the difference between the two tone frequencies (beating).
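
The superposition of two tones of similar frequency can be written out with the standard sum-to-product identity. The symbols A (common amplitude), f₁ and f₂ (the two frequencies) and t (time) are introduced here for illustration and do not appear in the original text; equal amplitudes and zero phase offset are simplifying assumptions.

$$ A\sin \left(2\pi {f}_1t\right)+A\sin \left(2\pi {f}_2t\right)=2A\cos \left(2\pi \frac{f_1-{f}_2}{2}t\right)\sin \left(2\pi \frac{f_1+{f}_2}{2}t\right) $$

Read this way, the sum behaves like a tone at the mean frequency (f₁ + f₂)/2 whose amplitude rises and falls |f₁ − f₂| times per second. For tones of 440 and 444 Hz, for instance, this gives a 442 Hz tone that swells and fades four times per second.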


Fig. 1.49

Time course (upper graphs) and frequency spectra (lower graphs) of typical sound waves: pure tone, 440 Hz, complex tones: superposition of three tones (440, 880 and 1100 Hz), and noise. λ = wavelength. For the noise a wavelength cannot be assigned

The relevant measurement for characterising the strength of sound waves is sound pressure (p). It is defined as the local pressure deviation from the ambient atmospheric pressure caused by a sound wave. Its unit is the Pascal [Pa]. Pressure deviations caused by sound waves are very small relative to atmospheric pressure. The human ear is able to perceive them in the range between about 20 μPa (20 × 10⁻⁶ Pa) and 20 Pa, where 20 μPa approximately corresponds to the sound pressure at hearing threshold of a 1000 Hz tone. Owing to the huge range of our ear’s auditory sensitivity, measurements and calculations in Pa are laborious. For this reason, a more convenient measure has been introduced: the sound pressure level (SPL) with the unit decibel [dB SPL]. The sound pressure level is calculated by a logarithmic transformation of the measured sound pressure according to the formula (Eq. 1.2):

$$ L\left[\mathrm{dB}\ \mathrm{SPL}\right]=20\times {\log}_{10}\left(p/{p}_0\right)\quad \left(\mathrm{with}\ {p}_0=20\ \mu \mathrm{Pa}\right) $$


According to this formula, the hearing threshold for a 1000 Hz tone lies at 0 dB SPL (as measured by a sound level meter). Every increase in sound pressure level by 20 dB indicates an increase in sound pressure by the factor 10. Owing to the logarithmic nature of the dB scale, its values cannot simply be added or subtracted but must be retransformed into Pa before arithmetic operations can be made. By doing so, one finds that doubling the sound pressure of a sound source leads to an increase in SPL of 6 dB.
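
The conversion in Eq. 1.2 and its inverse can be sketched in a few lines of Python. The reference pressure of 20 μPa and the limits of the audible pressure range come from the text; the function names `spl_from_pressure` and `pressure_from_spl` are hypothetical names chosen for this illustration.

```python
import math

P0 = 20e-6  # reference sound pressure p0 in Pa (20 micropascal, as in Eq. 1.2)


def spl_from_pressure(p_pa):
    """Sound pressure level in dB SPL for a sound pressure given in Pa (Eq. 1.2)."""
    if p_pa <= 0:
        raise ValueError("sound pressure must be positive")
    return 20.0 * math.log10(p_pa / P0)


def pressure_from_spl(level_db):
    """Inverse of Eq. 1.2: sound pressure in Pa for a level in dB SPL."""
    return P0 * 10.0 ** (level_db / 20.0)


# Lower and upper limits of the audible pressure range quoted in the text
print(spl_from_pressure(20e-6))  # hearing threshold at 1000 Hz
print(spl_from_pressure(20.0))   # upper limit of the range

# Doubling the sound pressure: convert to Pa, double, convert back
p = pressure_from_spl(60.0)
print(spl_from_pressure(2.0 * p) - 60.0)  # increase in dB caused by doubling
```

The last lines follow the procedure described above: the level is first retransformed into Pa, the arithmetic is carried out on the pressure, and the result is converted back to dB SPL.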

1.3.3 Propagation of Sound

In gases and liquids, sound propagates as longitudinal waves, whereas in solids transverse propagation of sound waves is also found. It should be noted that propagation of sound does not mean the movement of particles but only transport of energy. During propagation, the waves can be reflected, diffracted or attenuated by the medium. The modalities of sound propagation depend on the properties of the medium through which the sound travels. Additional factors that influence sound propagation are obstacles along the wave’s pathway and changes of the type of medium.

The fact that sounds can be heard around corners and barriers involves the mechanisms of diffraction and reflection. When diffraction occurs, the sound can be thought of as being ‘bent around’ an obstacle. This effect occurs mainly with low frequencies (for which the wavelength is of the order of the size of the obstacle or larger) and is the reason we hear low frequencies better than high frequencies behind obstacles. With high frequencies (with wavelengths much smaller than the size of the obstacle) the so-called shadow effect occurs, meaning that they appear attenuated behind the obstacle. The shadow effect caused by our head helps our auditory system to recognise the direction from which high-frequency sounds come.

For recognising the direction whence low-frequency sounds originate, our auditory system evaluates the time difference between a sound’s arrival at the ipsilateral and the contralateral ear (inter-aural time difference).
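
To give a feeling for the size of the inter-aural time difference, the following Python sketch uses a deliberately simplified geometric model that is not part of the original text: the extra path to the far ear is taken as d·sin(azimuth), where d is the distance between the ears. The ear distance of 0.18 m is an illustrative assumption, and the model ignores the curvature of the head.

```python
import math

SPEED_OF_SOUND_AIR = 340.0  # m/s, value given in the text
EAR_DISTANCE_M = 0.18       # ASSUMPTION: illustrative distance between the two ears


def interaural_time_difference_s(azimuth_deg,
                                 ear_distance_m=EAR_DISTANCE_M,
                                 c=SPEED_OF_SOUND_AIR):
    """Simplified ITD in seconds: extra path d*sin(azimuth) divided by c.

    azimuth_deg = 0 means the source is straight ahead,
    azimuth_deg = 90 means it is directly to one side.
    """
    extra_path_m = ear_distance_m * math.sin(math.radians(azimuth_deg))
    return extra_path_m / c


for angle in (0, 30, 60, 90):
    itd_us = interaural_time_difference_s(angle) * 1e6
    print(f"azimuth {angle:2d} deg: ITD = {itd_us:6.1f} microseconds")
```

Under these assumptions the time difference is zero for a source straight ahead and remains well below one millisecond even for a source directly to one side.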

Another property of a medium that is important for sound propagation is the characteristic acoustic impedance (Z). Z determines how much sound pressure (p) is generated by a vibration of molecules of a particular acoustic medium at a given frequency. It is defined as

$$ Z=p\left[\mathrm{Pa}\right]/v\left[\mathrm{m}/\mathrm{s}\right] $$

where v is the particle velocity, i.e. the velocity at which the particles of the medium vibrate, which must not be confused with the speed of sound in the medium (Eq. 1.3). The acoustic impedance is particularly relevant for the propagation of sound waves when the medium through which the sound travels is changing, e.g. from air to water. As the acoustic impedance of water is about 3500 times higher than that of air, a large amount of the sound will be reflected at the surface of the water. A similar mismatch of impedances occurs in the human ear when the sound is travelling from the ear canal to the inner ear.
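
The consequence of the air-to-water impedance mismatch can be estimated with a short Python sketch. It relies on two standard textbook relations that are not stated in the original text and are therefore assumptions here: for a plane wave the characteristic impedance equals density times speed of sound (Z = ρ·c), and at normal incidence the transmitted fraction of sound energy is 4·Z₁·Z₂/(Z₁ + Z₂)². The densities are approximate illustrative values; the sound speeds are those given in the text.

```python
import math


def characteristic_impedance(density_kg_m3, speed_m_s):
    """Plane-wave characteristic impedance Z = rho * c (assumed standard relation)."""
    return density_kg_m3 * speed_m_s


def transmitted_energy_fraction(z1, z2):
    """Fraction of sound energy transmitted across a boundary at normal incidence."""
    return 4.0 * z1 * z2 / (z1 + z2) ** 2


# Densities are ASSUMED approximate values; sound speeds are taken from the text
Z_AIR = characteristic_impedance(1.2, 340.0)
Z_WATER = characteristic_impedance(1000.0, 1484.0)

ratio = Z_WATER / Z_AIR
transmitted = transmitted_energy_fraction(Z_AIR, Z_WATER)
loss_db = -10.0 * math.log10(transmitted)

print(f"impedance ratio water/air: {ratio:.0f}")
print(f"transmitted energy fraction: {transmitted:.4f}")
print(f"reflected energy fraction: {1.0 - transmitted:.4f}")
print(f"transmission loss: {loss_db:.1f} dB")
```

With these inputs the impedance ratio falls in the range quoted above, and only about one thousandth of the incident energy crosses the boundary, which corresponds to a loss of roughly 30 dB.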

To overcome the huge difference in impedances, impedance-matching is accomplished by the ossicles of the middle ear. This matching, interpreted as ‘mechanical amplification with no input of additional energy’, ensures that the sound is effectively transmitted from the ear canal to the fluids of the inner ear.

1.3.4 Perception of Sound and Psychoacoustics

The way sound is perceived in humans is described by psychoacoustics (Fastl and Zwicker 2007; Moore 1997). More specifically, psychoacoustics is the scientific field concerned with the psychological and physiological responses associated with sound, including speech and musical stimuli.

The range of frequencies perceived by the human ear reaches from 16 to 20,000 Hz in normal-hearing young individuals. The upper frequency limit of hearing declines with age, an effect called presbyacusis.

The human ear is not equally sensitive to all frequencies. Its highest sensitivity lies between 2000 and 5000 Hz where sound pressure levels even below 0 dB SPL can be perceived. As shown in Fig. 1.50, the hearing threshold is strongly dependent on the frequency of sound. In our daily acoustic environment, sound, and also speech in a conversational situation, does not occur at threshold levels but rather at suprathreshold intensities. For such sounds the experienced loudness is a very important subjective quantity.


Fig. 1.50

Range of hearing and equal loudness contours for 20, 40, 60, 80 and 100 phon (data from Suzuki and Takeshima 2004, processed and graphic generated by the author). Lowest curve (lower red line): hearing threshold. Also shown is the range of conversational speech (green area) and the threshold of pain (upper red line). The sound pressure level in dB SPL equals the subjective sound level in phon at 1000 Hz

1.3.5 Loudness

As the human ear is not equally sensitive to all frequencies, two tones of different frequency are generally not perceived as equally loud when presented at the same sound pressure level. For instance, a 1000 Hz tone presented at 20 dB SPL is experienced as being as loud as a 63 Hz tone presented at 60 dB SPL. Subjectively rated equal loudness levels of different tones are reported in specific curves called isophones or equal loudness contours. The unit of the loudness level of isophones is the phon, where the phon values at 1000 Hz equal the values in dB SPL. Whereas isophones exhibit a very steep dependence on frequency at low sound pressure levels, they show a much flatter dependence at high sound pressure levels.

Another way to judge the intensity of the perceived loudness is by using a ‘categorical’ scale that encompasses the whole range of categories by which we determine the loudness of everyday sounds. The loudness categories are defined in terms such as ‘inaudible’, ‘very soft’, ‘soft’, ‘medium’, ‘loud’ and ‘too loud’ as a judgement of loudness to stimuli presented at different sound pressure levels. Such scaling is particularly useful for the diagnosis of recruitment (pathological reduction of the auditory dynamic range) and for the fitting of hearing aids and hearing implants.

To account for the different sensitivity of the human ear to different frequencies, especially at low sound levels, an additional scale has been developed, the so-called dB (A) scale. The dB (A) scale weights the sound levels by a filtering function (A-filter), which approximates the sensitivity of the human ear at the isophone curve at 20–40 phon (see Fig. 1.50). By applying this filter function to a measured sound level, the low frequencies and the high frequencies contribute less to the total level of the weighted sound thereby mimicking the loudness perception of the human ear. The dB (A) scale is particularly used in quantifying levels of background noise for acoustic testing and for determining levels of noise exposure in natural or working environments.
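The effect of the A-filter on individual frequencies can be sketched numerically. The text does not give the filter function; the example below assumes the commonly published analytic form of the A-weighting curve (as used in sound level meter standards), and the function name `a_weighting_db` is hypothetical.

```python
import math


def a_weighting_db(f_hz: float) -> float:
    """A-weighting correction in dB at frequency f_hz.

    Assumption: commonly published analytic form of the A-weighting curve,
    normalised so that the correction is about 0 dB at 1000 Hz.
    """
    f2 = f_hz ** 2
    ra = (12194.0 ** 2 * f2 ** 2) / (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    return 20.0 * math.log10(ra) + 2.00


# Corrections that would be added to the unweighted level at each frequency.
for f in (63, 250, 1000, 4000, 16000):
    print(f"{f:>6} Hz: {a_weighting_db(f):6.1f} dB")
```

The strongly negative values at low frequencies and the moderately negative values at very high frequencies reflect the statement above that these regions contribute less to the weighted level.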

1.3.6 Masking of Sound

The perception of sound signals, e.g. tones or speech, is also influenced by the presence of competing noise, an effect called masking. Masking plays an important role in everyday listening and in audiometry. Generally, the presence of a masking noise (called a masker) raises the hearing threshold of a tone. The masking effect is largest when the frequencies of the signal and of the masker are similar, and it decreases with increasing distance between their frequencies. However, the decrease is not symmetrical with respect to masker frequency: masking is more effective when a low-frequency sound masks a high-frequency signal than vice versa. This asymmetry is called upward spread of masking. For a high-frequency sound to mask a low-frequency signal, the masker must be presented at an intense sound level. Owing to the nature of the human cochlea, upward spread of masking is the more prominent effect and is important for understanding speech in noisy environments. It has to be particularly considered for the fitting of hearing aids and hearing implants.

Depending on the masker’s frequency spectrum and on its temporal properties (e.g. gaps between sounds), the perception of signals can be influenced in a variety of ways. Masking can also occur in a different temporal relationship from that of the signal: as either simultaneous masking, forward masking or backward masking.

1.3.7 Binaural Hearing

Compared with monaural hearing, binaural hearing allows the auditory system to make better use of the information contained in sounds. Main advantages of binaural hearing are (1) the ability to localise sound sources and (2) the ability to achieve improved speech intelligibility in noise.

Localisation

The ability to identify the location or origin of a sound is called sound localisation (Blauert 1996). Mechanisms involved in sound source localisation include the evaluation of:

  • Differences in arrival time of sounds between the right and the left ear (inter-aural time difference, ITD).

  • Differences in the levels of high-frequency sounds between the right and the left ear (inter-aural level difference, ILD).

  • Asymmetrical spectral reflections from various parts of the body, including torso, shoulders and pinnae.

For detecting the spatial origin of low-frequency sound (below about 800 Hz), the auditory system evaluates inter-aural time differences (ITD), e.g. a sound coming from the right side arrives earlier at the right ear than at the left ear. In contrast, for high frequencies (above about 1600 Hz), the brain predominantly evaluates the inter-aural level differences (ILD). The ILDs are mainly due to the head shadow effect: a sound from the right produces a higher sound pressure level at the right ear than at the left ear.
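To give a feeling for the size of inter-aural time differences, the following sketch estimates the ITD for a distant source. It is not taken from the text: it assumes Woodworth's rigid-sphere approximation, ITD = (r/c)·(θ + sin θ), a head radius of 8.75 cm and a speed of sound of 343 m/s, all of which are illustrative assumptions.

```python
import math


def itd_woodworth(azimuth_deg: float,
                  head_radius_m: float = 0.0875,
                  speed_of_sound: float = 343.0) -> float:
    """Inter-aural time difference in seconds for a distant source.

    Assumption: Woodworth's rigid-sphere approximation, valid for
    azimuths between 0 (straight ahead) and 90 degrees (fully lateral).
    """
    theta = math.radians(azimuth_deg)
    return head_radius_m / speed_of_sound * (theta + math.sin(theta))


for azimuth in (0, 15, 45, 90):
    print(f"{azimuth:>3} deg: {itd_woodworth(azimuth) * 1e6:6.0f} microseconds")
```

Under these assumptions the ITD stays below about 0.7 ms even for a fully lateral source, which indicates how fine the temporal resolution of the binaural system has to be.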

Localisation in the median plane is possible as the human outer ear (pinna and ear canal) forms acoustic filters that are sensitive to the direction of the incoming sound. Different resonances of these filters impose direction-specific patterns on the frequency responses of the ears, which are evaluated by the brain. The combination of these patterns with other direction-selective reflections at the head, shoulders and torso forms the so-called head-related transfer functions (HRTF), which are specific for each individual.

In closed rooms, not only the direct sound from a sound source reaches the ears but also sound that has been reflected from the walls. When localising a sound source under such conditions, the auditory system analyses only the original direct sound (which arrives first at the ear), not the reflected sound (which arrives later). This feature is called the ‘law of the first wave front’, as the analysis of the reflected sound is suppressed.

Speech Intelligibility in Noise

Hearing in noise is one of the more challenging tasks in today’s acoustic environment, as background noise is nearly always present in daily listening situations.

The presence of noise may cause substantial masking of acoustic signals, in particular of speech; the most familiar example is listening at a cocktail party, where one talker must be followed among many competing voices (the cocktail party effect). For speech understanding in noisy environments, binaural hearing and proper functioning of the inner ear are crucial. The brain’s comparison of the right and left auditory inputs then results in a significant internal noise reduction called binaural release from masking.

Speech intelligibility in noise is most often quantified by the signal-to-noise ratio, i.e. the ratio between the sound level of the signal (speech) and that of the noise at the point where the listener achieves 50% speech intelligibility. The signal-to-noise ratio is an important measure for assessing the benefit of hearing aids or cochlear implants. Besides advantages of localisation abilities, the improvement of hearing in noise is the main argument for a binaural supply with hearing prostheses.
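The signal-to-noise ratio at 50% intelligibility described above can be estimated from a handful of measured points. The sketch below is only illustrative: it assumes simple linear interpolation between neighbouring measurements (clinical procedures typically use adaptive tracking or a fitted psychometric function instead), and the scores are hypothetical, not measured data.

```python
def snr_at_half_intelligibility(points):
    """Return the SNR (dB) at which intelligibility crosses 50%.

    points: list of (snr_db, percent_correct) tuples sorted by rising SNR.
    Assumption: linear interpolation between the two bracketing points.
    """
    for (snr_lo, score_lo), (snr_hi, score_hi) in zip(points, points[1:]):
        if score_lo <= 50.0 <= score_hi and score_hi > score_lo:
            fraction = (50.0 - score_lo) / (score_hi - score_lo)
            return snr_lo + fraction * (snr_hi - snr_lo)
    raise ValueError("the scores do not bracket 50% intelligibility")


# Hypothetical scores: SNR = speech level minus noise level, both in dB.
hypothetical_scores = [(-10.0, 10.0), (-5.0, 40.0), (0.0, 80.0)]
result = snr_at_half_intelligibility(hypothetical_scores)
print(f"SNR at 50% intelligibility: {result:.2f} dB")
```

A lower (more negative) value means that the listener tolerates more noise, so a hearing aid or implant that shifts this value downwards provides a measurable benefit in noise.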

1.4 Concise Overview of the Physiology of Hearing

Ad Snik

1.4.1 Anatomy and Physiology of the Hearing Organ

The hearing organ, schematically presented in Fig. 1.37 taken from Sect. 1.2, can be subdivided into four parts:

  • The external part, comprising the auricle and external auditory canal. This part of the ear collects the airborne sounds and transports these sounds to the entrance of the middle ear, the tympanic membrane.

  • The middle ear, an air-filled cavity that comprises the tympanic membrane and, coupled to it, the middle ear ossicles. The middle ear system transforms the sound waves into mechanical vibrations that are led to the entrance of the inner ear, the oval window.

  • The inner ear or cochlea, a fluid-filled, spiral-shaped organ that comprises the sensory cells. The mechanical vibrations of the middle ear system are propagated as vibrations through the cochlear fluids. These vibrations activate the sensory cells in the cochlea. In fact, the vestibular system and the cochlea form one organ often referred to as the audiovestibular system. The vestibular system is addressed in Sects. 1.2 and 16.15.

  • The auditory neural system that transports the action potentials generated by the activated sensory cells to the brainstem and (sub)cortical auditory areas.

Looking in more detail (Fig. 1.37 from Sect. 1.2), and starting with the outer ear, the auricle or pinna is characterised by certain folds, which play an important part in the processing of the high frequencies that we use for localisation of sound sources, in particular in the vertical plane. Furthermore, the auricle is important for separating frontal and rear sources because its shape causes sounds from the front or rear to be diffracted differently, resulting in slight but detectable differences in the frequency spectrum. The tympanic membrane at the end of the external canal has a relatively low impedance, comparable to the acoustic impedance of air. As a result, airborne sound waves efficiently vibrate the tympanic membrane. Its low-power but relatively high-amplitude vibrations are transformed by the ossicular chain into more powerful vibrations with low amplitude. Thus, this system works as an impedance-matching device that effectively transforms the acoustic energy of the airborne sound wave into mechanical energy of the fluid wave in the cochlea, a mechanical amplification that requires no additional energy. The ossicular system is connected to the tympanic membrane via the malleus at one end and to the cochlea at the other. The last ossicle, the stapes, moves piston-wise in the oval window of the cochlea (Slis and Snik 1997a).

The cochlea comprises a spiral-shaped canal with approximately 2.5 turns. Figure 1.51 presents the unrolled cochlea schematically. In fact, three different parallel canals can be distinguished, called the scala tympani, the scala media and the scala vestibuli (see Fig. 1.51). Stapes-induced vibrations of the oval window, at the base of the cochlea, set the fluid in the scala vestibuli into motion, resulting in a longitudinal fluid wave. That motion reaches the top of the cochlea, or the apex, where the scala vestibuli and the scala tympani are connected (illustrated in Fig. 1.51). The longitudinal wave reverses its direction and travels through the scala tympani towards the second window in the base of the cochlea, the round window, where the energy of the fluid wave is dissipated. The third scala, the scala media, is a separate canal in the middle, not connected to the other two scalae. It contains a fluid, the endolymph, with different ion concentrations from those of the fluid in the other two scalae, the perilymph. The scala vestibuli is separated from the scala media by a thin, highly mobile membrane, the membrane of Reissner. For the longitudinal waves in the scala vestibuli, this membrane is mechanically transparent. Consequently, the longitudinal waves on either side of the (stiffer) basilar membrane, which separates the scala media from the scala tympani, create pressure differences across this membrane, causing it to move up and down. These movements look like a transverse travelling wave. The basilar membrane carries the sensory elements that are activated by the transverse movements of the membrane. These elements are called the organ of Corti (Slis and Snik 1997b).


Fig. 1.51

The unravelled cochlea comprising three compartments, the scala vestibuli, the scala tympani and, in between, the scala media with the sensory elements

At the base of the cochlea, the basilar membrane is relatively thick and narrow. At the apex, the basilar membrane is relatively thin and broad. Between these two extremes, the width of the membrane and its thickness change rather linearly. As a result, the stiffness of the membrane is much higher at the base than at the apex. For high frequencies, the basilar membrane is highly mobile near the base, while for low frequencies, the highest mobility is found near the apex. As a consequence, the mobility of the basilar membrane is frequency-dependent from the base to the apex; when stimulated, the amplitude of the travelling wave will peak at a certain distance from the base, where the distance is determined by the frequency of the sound. This is referred to as tonotopic organisation of the basilar membrane.
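The tonotopic organisation described above is often summarised by a place-to-frequency map. The text does not give such a map; the sketch below assumes Greenwood's empirical function with the parameter values commonly quoted for the human cochlea (A = 165.4, a = 2.1, k = 0.88), and the function name is hypothetical.

```python
def greenwood_frequency_hz(x_from_apex: float) -> float:
    """Characteristic frequency (Hz) at a relative place on the basilar membrane.

    x_from_apex: 0.0 at the apex, 1.0 at the base (fraction of membrane length).
    Assumption: Greenwood's empirical map with commonly quoted human parameters.
    """
    if not 0.0 <= x_from_apex <= 1.0:
        raise ValueError("x_from_apex must lie between 0 and 1")
    return 165.4 * (10.0 ** (2.1 * x_from_apex) - 0.88)


for x in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"x = {x:4.2f}: {greenwood_frequency_hz(x):8.0f} Hz")
```

With these parameters the map runs from roughly 20 Hz at the apex to roughly 20 kHz at the base, and equal distances along the membrane correspond approximately to equal frequency ratios over most of its length.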

The organ of Corti is situated on top of the basilar membrane (see Fig. 1.44 taken from Sect. 1.2), close to the modiolus. Along the full length of the basilar membrane, this organ contains four parallel rows of hair cells, one row of inner hair cells (closest to the modiolus) and three rows of outer hair cells.

Inner and outer hair cells are situated on either side of a rather solid ridge, the tunnel of Corti. Each hair cell is connected to neurons; the inner hair cells mainly to afferent neurons and the outer hair cells mainly to efferent neurons. The differences in anatomy and neural connections suggest that inner hair cells and outer hair cells have different functions. Above the hair cells, the tectorial membrane is situated, which is (only) connected to the cochlear wall near the modiolus. The tips of the cilia of the hair cells are connected to the tectorial membrane, such that they bend when the basilar membrane moves relative to the stiff tectorial membrane. The connection of the cilia to the tectorial membrane is tighter for the outer hair cells than for the inner hair cells (Green 1976).

The endolymph, in the scala media, and the perilymph, in the other two scalae, have different ion concentrations. This causes a potential difference of about 80 mV over the membranes that separate the scala media from the other two scalae, the basilar membrane and the membrane of Reissner. This difference in potential is kept constant by ion pumps in the stria vascularis, which covers the outer wall of the scala media. Positively charged ions are pumped from the perilymph to the endolymph. When the cilia of hair cells bend, an influx of positive ions causes depolarisation of the hair cell, which subsequently leads to an action potential in the afferent neuron connected to that hair cell. Inner hair cells are the main sensors. However, only loud sounds stimulate the cilia of the inner hair cells forcefully enough to cause depolarisation. Outer hair cells have another function; apart from sensory properties, they also have motor properties. These cells can amplify the relatively small movements of the basilar membrane caused by soft sounds (the ‘active mechanism’ or ‘cochlear amplifier’, which requires the input of additional energy via the efferent neurons), so that the resulting movements at the inner hair cells are strong enough for depolarisation. These motor actions of the outer hair cells are controlled by neural feedback loops (Moore 2008) and are the main loci of the non-linearity of the cochlear response.

1.4.2 Perception of Sounds

Fundamental Concepts

Our perception of sound is based on logarithmic scales. Concerning loudness, this has led to the introduction of the decibel scale, which is a logarithmic measure of physical sound pressure levels. In pitch perception, we use the octave scale, which is a logarithmic measure of frequency.
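Both logarithmic measures can be written down in a few lines. This sketch assumes the conventional reference pressure of 20 µPa for dB SPL, which is not stated in this subsection; the function names are hypothetical.

```python
import math

P_REF_PA = 20e-6  # assumed conventional reference pressure for dB SPL (20 micropascals)


def spl_db(pressure_pa: float) -> float:
    """Sound pressure level in dB SPL for an RMS sound pressure in pascals."""
    return 20.0 * math.log10(pressure_pa / P_REF_PA)


def octaves_between(f_low_hz: float, f_high_hz: float) -> float:
    """Number of octaves (frequency doublings) from f_low_hz to f_high_hz."""
    return math.log2(f_high_hz / f_low_hz)


print(f"{spl_db(200e-6):.1f} dB SPL")  # a tenfold pressure increase adds 20 dB
print(f"{octaves_between(250.0, 2000.0):.1f} octaves")  # 250 -> 500 -> 1000 -> 2000 Hz
```

In both cases equal ratios of the physical quantity map onto equal steps on the perceptual scale, which is what makes the logarithmic representation convenient.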

Our ‘hearing range’ is often visualised in a graph of sound pressure level in decibels (dB) versus frequency in Hertz (Hz), with the frequency axis organised into octaves (an interval of one octave represents a doubling of frequency, so that 250 Hz, 500 Hz, 1 kHz and 2 kHz, for example, are octave steps). Fig. 1.50 in Sect. 1.3 shows the hearing range of a normal-hearing subject, which is bounded by the frequency-dependent threshold of hearing (in dB SPL, dB sound pressure level; lower limit) and the loudness discomfort level (also in dB SPL; the blue line marked 100). Adolescents have the best hearing, in a frequency range from 20 Hz to 20 kHz, with a hearing range at 1 kHz from approximately 0 to 110–125 dB SPL. Normal thresholds of hearing differ at each frequency and are worse at low and high frequencies, as shown by the standardised normal threshold in Fig. 1.50 (the lower red line). For this reason, the hearing thresholds of an individual subject are usually not plotted in dB SPL but in dB HL, or dB Hearing Level, which is relative to the standardised normal-hearing threshold. This standardised normal threshold is 0 dB HL at all frequencies. Thus, hearing thresholds in dB HL are 0 for a normal-hearing person. In clinical practice, however, a normal range of <20 dB HL is typically used. On the dB HL scale, the loudness discomfort level equals approximately 100 dB HL at all frequencies. A further decibel scale, dB SL, or dB Sensation Level, is sometimes used, especially in research, to present sounds at a set level above each subject’s threshold. Any individual’s threshold (in dB SPL or dB HL) can be represented as 0 dB SL; 20 dB SL is therefore 20 dB above the individual’s threshold. For diagnostic purposes, hearing thresholds and loudness discomfort levels are only measured in the frequency range from 0.25 to 8 kHz. The resulting graph, called an audiogram, is displayed in Fig. 1.52. Note that in an audiogram, hearing levels are presented upside-down.

For counselling purposes, the speech dynamic range is indicated, which is the mean sound pressure level of normal conversational speech as a function of frequency, together with its range of 30 dB, expressed in dB HL. During normal running speech, the speech sound levels will be within this area 90% of the time.
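The relation between dB HL and dB SL, and the clinical normal range of <20 dB HL mentioned above, can be expressed in a few lines. The sketch is illustrative only; the function names and the audiogram values are hypothetical and do not correspond to any patient data.

```python
def sensation_level_db(presentation_db_hl: float, threshold_db_hl: float) -> float:
    """Level of a sound in dB SL: its level above the individual's own threshold."""
    return presentation_db_hl - threshold_db_hl


def within_normal_range(threshold_db_hl: float, limit_db_hl: float = 20.0) -> bool:
    """True if a threshold lies inside the clinical normal range (< 20 dB HL)."""
    return threshold_db_hl < limit_db_hl


# Hypothetical audiogram: frequency in kHz mapped to threshold in dB HL.
hypothetical_thresholds = {0.25: 10.0, 0.5: 15.0, 1.0: 25.0, 2.0: 40.0, 4.0: 55.0, 8.0: 60.0}

for freq_khz, threshold in hypothetical_thresholds.items():
    level_sl = sensation_level_db(65.0, threshold)  # tone presented at 65 dB HL
    status = "normal" if within_normal_range(threshold) else "elevated"
    print(f"{freq_khz:>5} kHz: threshold {threshold:4.0f} dB HL ({status}), "
          f"65 dB HL tone = {level_sl:4.0f} dB SL")
```

The same presentation level in dB HL therefore corresponds to a different sensation level at each frequency whenever the individual thresholds differ.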


Fig. 1.52

Audiogram format with the speech area indicated in grey (according to Mueller and Killion 1990). The red symbols (open dots) are the hearing thresholds of a hearing-impaired patient; the T symbols refer to the patient’s loudness discomfort levels


Apr 26, 2020 | Posted by in OTOLARYNGOLOGY | Comments Off on of Phoniatrics