The new medical field was at first called voice and speech pathology (Stimm- und Sprachheilkunde). The closest approximation to the present term ‘phoniatrics’, however, was ‘phoniatros’ (1886), the telegraphic address of the London laryngologist Morell Mackenzie (1837–1892).
Voice and speech pathology initially developed from two centres: Berlin and Vienna. Albert Gutzmann (1839–1910), a highly motivated teacher of the deaf in Berlin, also worked with speech/language impairment, particularly stuttering. He organised courses and, from 1891, edited a journal of medicine and pedagogy (Medizinisch-pädagogische Monatsschrift) together with his son Hermann, then a medical student.
Students from around the world flocked to Berlin to study under Hermann Gutzmann. Thirteen books and more than 300 articles offer evidence of his scientific achievement (complete bibliography in Wendler 1980). His main work, ‘Sprachheilkunde’ (Gutzmann 1912), was the standard reference of the discipline for many years. The Berlin school of phoniatrics was based on natural sciences, physiology and phonetics; its students were known as the ‘organists’.
The internist Kussmaul had demonstrated multiple close relations between speech and language disorders on the one hand and neurology and psychiatry on the other, and had detailed the cerebral origins of language and speech. Both Gutzmann and Fröschels attached their departments to otolaryngology, with the more peripheral structures and functions in focus, covering the fields of voice, speech/language and hearing without ignoring the central functions. This latter tradition is still alive in several areas and corresponds to a communicative approach.
1.1.3 After the Second World War
The structure and content of the field of phoniatrics were defined and determined through close cooperation among several partners. Of particular importance are the European Union of Medical Specialists (UEMS), with Willy Wellens representing the UEP in the beginning, and the International Federation of Oto-Rhino-Laryngological Societies (IFOS).
The UEP has launched numerous programmes to shape and define phoniatrics further as the medical speciality for communication disorders and to develop programmes to train and educate competent phoniatricians. A first draft was published in 1983 (Wendler and Wellens 1983). Within the EU, the harmonisation of such programmes is continuously advancing. Following Christiane Neuschaefer-Rube (Germany), phoniatrics is currently represented within the framework of UEMS by Tamer Abou-Elsaad (Egypt) and Tadeus Nawka (Germany), with a well-elaborated training programme and logbook (Vilkman et al. 2010, updated 2018). The European concept of phoniatrics attracts increasing attention worldwide.
Under the Standing IFOS Committee on Phoniatrics and Voice Care (Chair J. Wendler), the special profile of phoniatrics has been generally acknowledged (International Federation of Oto-Rhino-Laryngological Societies 1986). This Committee can be traced back to the Committee on the Care of Voice established in 1969 by the pioneer of phonosurgery, Hans von Leden. In 1993, IFOS recommended that selected phoniatric topics be included in postgraduate ENT training programmes as a basic requirement for their completion (International Federation of Oto-Rhino-Laryngological Societies 1993).
An interdisciplinary organisation, the International Association of Logopedics and Phoniatrics, had been founded in Vienna on the initiative of Emil Fröschels as early as 1924 (Perelló 1982). He originally named the medical field of speech/language pathology ‘Logopedics’. Hugo Stern and Miloslav Seeman later introduced the term ‘Phoniatrics’, which is in common use today to describe communication medicine, whereas the term ‘Logopedics’ denotes the corresponding non-medical speciality.
1.1.4 Present and Future
Since the 1960s, phoniatrics has extended its scope from the above-outlined concept of physiological and psychological aspects of voice, speech/language and hearing to an all-encompassing perspective of communication including all input, central and output functions as well as sociocultural and ecological dimensions. As the primary function of the articulatory system, swallowing has also been included in the competence of the field. Regarding aetiological studies, molecular genetics has already contributed essential insights, particularly in the field of hearing and developmental language disorders, and as far as stuttering is concerned, genetic factors are being explored with encouraging perspectives. Neurosciences, especially in terms of neurolinguistics, are opening up new ways to the understanding and management of central language processing by means of functional imaging technologies. As the medical speciality for communication disorders, phoniatrics is a worldwide issue today, although with significant geographical differences. The status of phoniatrics varies, in a global view, from an independent speciality on its own to a rather unknown peculiarity, whereas in continental Europe, the cradle of phoniatrics, the speciality is generally well established.
According to an international inquiry in 2012 (Wendler 2012), there were some 1200 specialists in the field (300 in Italy, 290 in Germany, 210 in Poland and 96 in Czechoslovakia) and altogether some 100 university departments. A survey from 2016 (am Zehnhoff-Dinnesen et al. 2016) provided data on colleagues active in phoniatrics in the following countries: 40 in Austria, 10 in Belarus, a couple of dozen in Belgium, 120 in the Czech Republic, many hundreds in Egypt, 23 in Finland, 319 in Germany, 23 in Hungary, 150 in Mexico, more than 200 in Poland, 150 in Russia, 13 in Saudi Arabia, about 100 in Spain, 32 in Switzerland, about 20 in the Netherlands, 15 in Turkey and 135 in Venezuela, in total more than 1650.
According to that survey, phoniatrics is an independent speciality in Finland, Germany, Italy, Poland, Egypt, Mexico and Venezuela. It is an officially recognised subspeciality of ENT in many other countries. In several countries, hearing-impaired children are cared for through pedaudiology as an integrated part of phoniatrics. In others, this is a special area of audiology. Proposals to bring phoniatrics and audiology together in a speciality of ‘communication medicine’ are being discussed.
For the near future, when rules and regulations for medical specialisation regarding professional profiles and official recognition can be expected to remain under discussion, successful cooperation between the UEP and the phoniatric representatives within UEMS is of the greatest importance. The untiring past president of the UEP, Antoinette am Zehnhoff-Dinnesen (Germany), was appointed honorary president in 2018 because of her outstanding merits in rebuilding and further developing the UEP; her inspiring successor is Ahmed Geneid (Finland). An eminent milestone on the way towards a high standard of the discipline in all of Europe was the foundation of the European Academy of Phoniatrics in 2013, initiated and finally well established after sustained efforts by Antoinette am Zehnhoff-Dinnesen as the founding director. Christiane Neuschaefer-Rube was elected first president of the academy, followed in the meantime by Tadeus Nawka (Germany).
In spite of differing concepts of formal professional formats, and independently of systematic orders, the medical challenges of the information age require the general adoption of a recognised special medical field with encompassing competence for communication disorders, and that is phoniatrics.
1.2 Developmental and Anatomical Background of Communication and Swallowing Disorders
Cranium and Face
The membrane bones of the human skull include the cranial vault (calvaria) and the bones of the face. The bones of the calvaria are separate at birth but will fuse to form sutures, and the fontanelles between these bones will join later after the brain has finished growing.
From week 4 to week 10, the face develops from five facial swellings: paired maxillary swellings, paired mandibular swellings and an unpaired medial frontonasal process. The maxillary swellings enlarge in the fifth week; they lengthen medially and form the primordia of the cheeks and the lateral portions of the upper lip. The lateral portions of the maxillary and mandibular swellings fuse to produce the final shape of the mouth. The mandibular swellings enlarge to form the primordia of the lower lip and jaw in the fourth and fifth weeks. The buccopharyngeal membrane, which separates the ectodermal stomodeum from the endodermal foregut, breaks down on day 24.
In the fifth week, ectodermal thickenings, called nasal placodes, appear on the frontonasal process, which will give rise to the nose and philtrum. Each placode develops a nasal pit in its centre. In the sixth week, its lateral edge, the lateral nasal process, will form the sides of the nose; its medial rim, the medial nasal process, will fuse with its contralateral partner to form the bridge of the nose. During the seventh week, the inferior portion of the fused material forms the intermaxillary process that will join the maxillary swellings to form the philtrum of the upper lip.
The nasal pits enlarge and fuse to form the nasal sac, with the nasal fin developing from its floor to separate the nasal and oral cavities. The nasal fin thins to form the oronasal membrane. It finally ruptures, forming an opening into the oral cavity, called the primitive choana. The primary palate grows posteriorly from the intermaxillary process as a ridge to form the floor of the primitive nasal cavity.
In the eighth week, a pair of palatine shelves initially grows inferiorly from the maxillary swellings into the oral cavity, on either side of the tongue. The shelves rotate horizontally in the ninth week and fuse medially to form the secondary palate. The anterior portion of the secondary palate ossifies to form the hard palate, while muscles of the soft palate develop in its posterior portion. Meanwhile, the nasal septum grows inferiorly from the roof of the nasal cavity, fusing with the top of the hard palate to form two nasal passages that communicate with the pharynx through the definitive choanae.
A cleft palate may occur alone or combined with a cleft lip. The cleft may be limited to the uvula but may extend across the soft and hard palate. The cause lies in an insufficient generation of mesenchyme, resulting in a disturbed fusion of the lateral maxillary shelves with the nasal septum and the posterior edge of the primary palate.
Pharyngeal Arches, Clefts and Pouches
During early development, five pharyngeal (branchial) arches are generated, which appear as bar-like ridges on the ventrolateral surface of the head and neck region. They are covered by ectoderm and are separated from each other by invaginations called pharyngeal clefts. The pharyngeal clefts have counterparts on the interior in the form of endoderm-lined pharyngeal pouches. Ectoderm and endoderm are isolated by a mesodermal core. Pharyngeal membranes separate the clefts from the pouches (Graham 2001).
The pharyngeal arches are numbered 1, 2, 3, 4 and 6; they develop in cranio-caudal sequence with the first pair appearing on day 22, the second and third pairs on day 24 and the fourth and sixth pairs on day 29. Each pharyngeal arch contains an arch cartilage, an arch artery, a mesodermal component as precursor for muscles and a specific cranial nerve.
The first branchial arch is divided into a maxillary and a mandibular process; the former develops into the palatopterygoquadrate bar cartilage, which will become the greater wing of the sphenoid and the incus; the latter contains Meckel’s cartilage, a precursor of the malleus and the fibrous core of the mandible. The jaws mainly consist of membrane bones formed by direct ossification; the maxillary process gives rise to the upper jaw (maxilla), the zygomatic bone and the temporal squama, and the mandibular process generates the lower jaw.
The first arch is innervated by the trigeminal nerve, the maxillary swelling by V2 and the mandibular swelling by V3. The second arch is innervated by the facial nerve (VII) and the third arch by the glossopharyngeal nerve (IX). The fourth arch is innervated by the superior laryngeal branch and the sixth arch by the recurrent laryngeal branch of the vagus nerve (X).
Branchial arches and their derivatives
Muscles of mastication
Tensor v. palatini
Ant. belly of digastric
Ant. ligament of the malleus
Hyoid bone, minor horn, upper part of body
Mimic muscle system
Post. belly of digastric
Lining (crypts) of the palatine tonsils (lymphatic follicles have mesodermal origin)
Hyoid, major horn, lower part of the body
Cartilages of the larynx
All muscles of the pharynx (except the stylopharyngeus)
All muscles of the soft palate (except the tensor v. palatini)
Upper parathyroid gland
C cells of thyroid
Cartilages of the larynx
All intrinsic muscles of the larynx except the cricothyroid
Development of the Larynx
Normal Development (Fig. 1.15)
The larynx develops from the fourth and sixth branchial arches. The laryngotracheal opening lies between these two arches. The internal lining of the larynx originates from endoderm, whereas cartilages and muscles emanate from mesenchyme. The mesenchyme proliferates rapidly, and the sagittal slit of the laryngeal orifice changes into a T-shaped opening by the growth of three tissue masses: one is the hypobranchial eminence, which later becomes the epiglottis. The second and third growths are two arytenoid precursors. They grow between the fifth and seventh week, resulting in a temporary occlusion of the lumen. Recanalisation occurs by the tenth week and produces a pair of lateral recesses, the laryngeal ventricles that are bounded by folds of tissue that differentiate into the false and true vocal cords. Failure to recanalise may result in atresia, stenosis or web formation in the larynx.
The development of the larynx begins with the appearance of the mesenchymal-arytenoid swellings from the sixth branchial arches on the 32nd day of gestation on both sides of the opening of the laryngotracheal tube. These swellings approach each other in the midline and converge at the caudal end of the hypobranchial eminence to convert the vertical laryngotracheal opening into a T-shaped aditus. Midline compression of the tube by these swellings results in the fusion of the epithelial lamina, thereby closing the tube from the pharynx. If the closing does not occur, a posterior laryngeal cleft can result leading to severe aspiration in the newborn. The arytenoid swellings differentiate into the arytenoid and corniculate cartilages and the primitive aryepiglottal folds.
The epiglottal and cuneiform cartilages are formed by the hypobranchial eminence. Chondrification of both fourth branchial arches gives rise to the thyroid cartilage, whereas the cricoid cartilage derives from the chondral tissue of the sixth branchial arch. The laryngeal lumen obliterates to give rise to the epithelial lamina. The larynx recanalises by the tenth week of gestation.
The intrinsic muscles have gained their shapes and positions by the 40th day of gestation, and by the end of the eighth week, all components of the larynx are present including innervation and blood supply.
During the foetal period, the vocal processes develop from the arytenoids, and the thyroid cartilage laminae fuse in the midline. The epiglottal cartilage matures between the fifth and seventh months. During this period, the corniculate and cuneiform cartilages become evident. The foetal period ends with the cricoid cartilage changing from interstitial to perichondrial growth.
The most common congenital anomaly of the larynx is laryngomalacia, accounting for more than half of all cases (Ahmad and Soliman 2007). The ratio of males to females is about 2:1. It is classified as Type 1, Type 2 or Type 3 on the basis of patterns of supraglottal collapse. In Type 1 laryngomalacia, redundant supraglottal mucosa prolapses; Type 2 is characterised by shortened aryepiglottic folds; and Type 3 displays posterior displacement of the epiglottis coincident with a deformation, due to an imbalance in its development. The epiglottis develops from the cartilages of the third and fourth branchial arches, and an overgrowth of the third arch portion results in an omega-shaped organ. In addition, an arytenoid prolapse may result from immature neuromuscular control.
The second-most common congenital laryngeal disorder, in about 15–20% of all congenital anomalies, affects vocal fold movement. It may occur unilaterally or, less frequently, bilaterally. Unilateral paralysis is usually idiopathic but may be secondary to peripheral nerve pathology. Strain injuries to the recurrent laryngeal nerve during birth may be one of the causes.
The glottal sulcus (or sulcus vocalis) is characterised by dysphonia due to hampered movement of the mucous membrane, absence of Reinke’s space and adhesion of the epithelium to the vocal ligament or the vocal muscle itself.
Congenital subglottal stenosis takes third place among laryngeal anomalies with approximately 15% of the cases and occurs twice as often in boys as in girls. It may be subdivided into two types. The more common is membranous congenital subglottal stenosis, due to submucosal hypertrophy. The second, cartilaginous congenital subglottal stenosis, results from an abnormal growth of the cricoid cartilage.
Subglottal haemangioma accounts for 1.5% of congenital anomalies of the larynx and occurs twice as often in girls as in boys. It results from a malformation of the mesenchymal vascular precursors.
Laryngoceles are rare congenital anomalies of the supraglottal larynx. They form as a result of air- or fluid-filled dilations of the laryngeal ventricle communicating with the laryngeal lumen. They may occur internally or externally or both.
About 25% of all laryngeal cysts are saccular cysts. In contrast to the laryngoceles, they do not communicate with the laryngeal lumen.
Laryngeal webs are rare congenital anomalies. They are due to an incomplete recanalisation of the laryngotracheal tube, which occurs in the third month of gestation. They appear mostly at the anterior level of the vocal folds.
Laryngeal or laryngotracheo-oesophageal clefts are posterior fusion defects between the airway and oesophagus during embryogenesis. These clefts may be minor and short or may even extend beyond the carina. They are classified according to their anatomical extent.
Laryngeal atresia is considered to be the rarest of the congenital anomalies of the larynx. It occurs when the recanalisation of the laryngotracheal tube during the third month of gestation fails.
Tongue Development
Behind the foramen cecum, the second pharyngeal arch develops the copula in the midline. A second elevation, arising from the third and partly the fourth pharyngeal arch, forms the hypobranchial eminence, which will become the pharyngeal part of the tongue.
The copula is overgrown by the hypobranchial eminence in the fifth and sixth week. It will fuse anteriorly with the distal tongue buds, thereby creating the terminal sulcus.
The median and pharyngeal sections of the organ then become joined at the terminal sulcus. This posterior compartment of the tongue is innervated by the glossopharyngeal nerve, the nerve of the third pharyngeal arch, whereas the chorda tympani from the cranial nerve VII supplies the taste buds on the anterior two thirds. The growing tongue extends out into the oral cavity; its anterior part is covered by a layer of ectodermal epithelium. In contrast, the root of the tongue is covered with endodermal epithelium.
So far, only the epithelial and mucosal tissues of the tongue have been considered, which develop from the four pharyngeal swellings as described above. The muscular compartment of the tongue descends from myoblasts that differentiate after migrating from the myotomes of the occipital cervical somites. Following these myoblasts is the hypoglossal nerve, which generates the nerve supply for the tongue musculature.
The tongue may vary in size from microglossia, an abnormally small tongue, which occurs very rarely, to macroglossia, an extraordinarily large tongue, which is more common.
Ankyloglossia (tongue-tie) affects the frenulum of the tongue; it develops short and thick and fixes the tongue to the floor of the mouth or at least restricts the movement of the tongue.
A cleft or bifid tongue has a cleft running vertically right across it. Complete clefting is extremely rare and occurs as a result of lack of developmental forces that push both halves of the tongue towards each other. Partial clefting presents as a deep groove in the middle of the tongue.
When the two lateral parts of the tongue fail to overgrow the tuberculum impar, a bald patch will appear in the centre of the tongue, known as median rhomboid glossitis.
Development of the Ear
Each otic vesicle differentiates into three parts: a dorsomedial, elongated endolymphatic extension, origin of the endolymphatic duct and, at its distal end, the endolymphatic sac; a central partition, which will expand to form the utricle and the three semicircular ducts, arising from utricular diverticula; and a ventral, conical saccular region, which forms the saccule and the cochlear duct, as well as the ductus reuniens joining the saccule and cochlear duct. The duct elongates in the fifth week and starts to coil, with the spiral organ of Corti differentiating in the seventh week. By this time, the organ of Corti is innervated by the cochlear ganglion, which will elongate and wind up together with the organ of Corti.
At the end of the ninth month, the auditory pathway is completed; myelinisation, however, has not taken place, and axo-dendritic synapses are not yet established.
Malformations of the Inner Ear
Malformations in otic vesicle development result in anomalies of the membranous labyrinth and its bony envelope as well. In descending order of intensity and time course of appearance during development, they are complete labyrinthine aplasia; cochlear aplasia; common cavity (single cystic cavity of coalesced cochlea and vestibulum); cochlear hypoplasia; incomplete partition Type I, II or III; and enlargement of the vestibular or cochlear aqueduct.
Tympanic Cavity, Normal Development
The first pharyngeal pouch elongates to form the tubotympanic recess, which will give rise to the tympanic cavity and the auditory tube. By the seventh week, the auditory ossicles begin to condense within the mesenchyme of the first and second pharyngeal arches, whereas the muscles of the middle ear begin to form in the ninth week. The cartilages of the malleus and incus develop within the first pharyngeal arch, and its mesoderm gives rise to the tensor tympani muscle, which will be innervated by the nerve of the first pharyngeal arch, the mandibular nerve (CN V/3). The cartilage of the stapes is formed within the second pharyngeal arch, as is the stapedius muscle, which is therefore innervated by the facial nerve (CN VII), the nerve of the second pharyngeal arch.
The first pharyngeal cleft develops into the external acoustic meatus, and the membrane separating the first pharyngeal cleft from the first pharyngeal pouch becomes the tympanic membrane, which consists of three layers: an outer covering of ectoderm, a mesodermal layer (the fibrous stratum) and an inner lining of endoderm.
In the ninth month, the ossicles assume their functional relationships, with the malleus attaching to the eardrum and the stapes attaching to the oval window. Sound vibrations can now be transmitted from the eardrum to the cochlea via the ossicles and oval window and then transduced into neural impulses via the organ of Corti.
Malformations of the Middle Ear
The close relationship of the external ear canal and the tympanic cavity gave rise to the classification of a common malformation termed atresia auris congenita:
First-degree malformations are characterised by moderate deformations of the external ear canal, a normal or slightly hypoplastic tympanic cavity, deformed ossicles and normal pneumatisation of the mastoid.
The second-degree malformation exhibits intermediate deformities including an absence of the external ear canal or its blind ending, a narrow tympanic cavity, deformations and fixations of the ossicles and reduced mastoid pneumatisation.
Third-degree malformations include the absence of an external ear canal, hypoplastic tympanic cavity, severely deformed ossicles and a failure in mastoid pneumatisation.
External Ear, Normal Development (Fig. 1.24)
Anomalies of the External Ear
Malformations of the external ear are caused by faulty development of one or several auricular hillocks. They result in deformities of three grades of severity. Dysplasia grade I represents only a slight deformation; most elements of a normal pinna are present. Dysplasia grade II comprises moderate deformations; only some structures of a normal ear are identifiable. Dysplasia grade III is characterised by severe deformations; nothing of a normal pinna is recognisable.
Malformations may be further classified according to the size of the auricle (macrotia, microtia, anotia), the shape of the ear (cup-shaped, lop ear, ear dysplasia, elfin (pointed) ear, lobe malformations), the position of the ears (melotia, low set ears, synotia) and other malformations such as auricular fistulas or appendages.
The Palate
The components are as follows:
The levator veli palatini, extending from the cartilage of the auditory tube and petrous part of temporal bone to the palatine aponeurosis. It elevates the soft palate, drawing it superiorly and posteriorly and also opens the auditory tube to regulate air pressure in the middle ear. It is innervated by a pharyngeal branch of the vagus via the pharyngeal plexus.
The tensor veli palatini, extending from the scaphoid fossa of the medial pterygoid plate, the spine of the sphenoid bone and the cartilage of the auditory tube to the palatine aponeurosis. It tenses the soft palate by using the hamulus as a pulley. It also acts on the membranous portion of the auditory tube in the same sense as the levator muscle. Innervation is through the medial pterygoid nerve (a branch of the mandibular nerve).
The musculus uvulae, which emanates at the posterior nasal spine and palatine aponeurosis and inserts into the mucosa of uvula. When the muscle contracts, it shortens the uvula and pulls it upwards. The pharyngeal branch of vagus innervates the muscle via the pharyngeal plexus.
The palatoglossus muscle between the palatine aponeurosis and the side of tongue. The mucous membrane covering the muscle forms the palatoglossal arch. The muscle elevates the posterior part of the tongue and draws the soft palate downwards onto the tongue.
The palatopharyngeus muscle, extending from the hard palate and palatine aponeurosis to the lateral wall of pharynx. Its mucous membrane forms the palatopharyngeal arch. The muscle tenses the soft palate and pulls the walls of the pharynx upwards, forwards and medially during swallowing. Both muscles are supplied by the cranial part of accessory nerve (CN XI) joining with the pharyngeal branch of vagus via the pharyngeal plexus.
The palate has an abundant blood supply from branches of the maxillary artery.
The Pharynx
The pharynx is a fibromuscular tube that spans vertically from the base of the skull to the oesophagus. Being situated posterior to the nasal and oral cavities and posterior to the larynx, it is therefore divisible into the nasopharynx, oropharynx and laryngopharynx, which ends at the inferior border of the cricoid cartilage, where it becomes continuous with the oesophagus.
The anterior part of the nasopharynx communicates through the choanae with the nasal cavities. Its lateral walls contain the pharyngeal ostia of the auditory tube, bounded behind by the torus tubarius, a prominence of the mucous membrane caused by the medial end of the cartilage of the tube. On the posterior wall of the nasopharynx, an assembly of lymphatic tissue is located, known as the pharyngeal tonsil.
The oropharynx, or mesopharynx, lies behind the oral cavity, extending from the uvula to the level of the hyoid bone. It opens anteriorly, through the isthmus faucium, into the mouth. The anterior wall consists of the base of the tongue; the superior wall consists of the inferior surface of the soft palate and the uvula. Its entrance, the isthmus faucium, is formed by the palatoglossal and palatopharyngeal arches on each side of the oral cavity, between which the palatine tonsil is positioned.
The laryngopharynx extends from the superior border of the epiglottis to the inferior border of the cricoid cartilage, where it becomes continuous with the oesophagus. Its anterior wall is the rear of the epiglottis and the posterior aspects of the arytenoid and cricoid cartilages. The piriform recess is part of the cavity of the laryngopharynx, situated on each side of the inlet of the larynx.
The superior constrictor emanates from the pterygomandibular raphe, the pterygoid hamulus and the buccinator ridge of the mandible. The right and left muscles run posteriorly and superiorly. Their superior attachment is to the pharyngeal tubercle on the base of the skull, and the largest part of the muscle meets its companion muscle from the opposite side to form a midline pharyngeal raphe.
The middle constrictor originates from the hyoid bone and the stylohyoid ligament and meets its partner at the pharyngeal raphe.
The inferior constrictor originates from the oblique line of the thyroid cartilage and from the cricoid cartilage. It meets its partner to contribute to the midline posterior raphe.
Finally, the stylopharyngeus muscle, beginning at the styloid process, runs into a gap between the upper and the middle constrictor and ends at the thyroid cartilage. All of these muscles are of the striated type.
Altogether, the tubulo-muscular wall of the pharynx consists of four layers: a mucous membrane, the pharyngeal aponeurosis, the muscle layer and the buccopharyngeal fascia.
The motor and most of the sensory nerve supply to the pharynx is by way of the pharyngeal plexus, which is situated mainly on the middle constrictor and is formed by the pharyngeal branches of the vagus and glossopharyngeal nerves and also by sympathetic nerve fibres.
Blood supply of the pharynx is ensured by pharyngeal branches of the ascending pharyngeal artery, ascending palatine artery, descending palatine artery and pharyngeal branches of inferior thyroid artery. Veins collect the blood into the pharyngeal plexus.
The Larynx
The larynx is located in the anterior neck, ventral to cervical vertebrae 3–6. It connects the pharynx with the trachea, regulates the flow of air to and from the lungs for respiration and vocalisation, and guards the air passages against food and liquids entering them. Its ventral prominence is called the Adam’s apple. The larynx extends from the tip of the epiglottis to the inferior border of the cricoid cartilage. Its interior can be divided into three parts, the supraglottis, the transglottis and the subglottis (see below).
First is the thyroid cartilage, of hyaline nature. Its superior margin and its superior horn are attached to the hyoid bone by the thyrohyoid membrane, centrally and laterally enhanced as the medial or lateral thyrohyoid ligament. Its inferior horn connects to the cricoid cartilage and takes part in the cricothyroid articulation.
The hyaline cricoid cartilage is situated below the thyroid cartilage. It is the only one that encircles the entire larynx. It is attached to the thyroid cartilage via the median cricothyroid ligament and to the first ring of the trachea via the cricotracheal ligament.
Two mostly hyaline arytenoid cartilages of pyramidal shape are positioned dorsally on the superior margin of the cricoid cartilage. They are connected to the vocal ligaments by their vocal process, and their muscular process serves for muscular attachment. Each of them has an elastic corniculate cartilage on its top. The latter connect to the cricoid cartilage via the posterior cricoarytenoid ligament.
Behind the thyroid cartilage protrudes the epiglottis, a spoon-shaped elastic cartilage, which is connected to the thyroid cartilage by the thyroepiglottic ligament. It contacts the arytenoid cartilages via the quadrangular membrane, into which two elastic cuneiform cartilages are embedded.
The most prominent and most important ligaments of the larynx are the vocal ligaments, converging from the vocal processes of the arytenoids to the posterior surface of the thyroid. They serve as a margin for the conus elasticus, extending downwards to the cricoid cartilage.
The vocal folds extend from the angle of the thyroid cartilage to the vocal processes of the arytenoid cartilages. They are important for phonation because they control the stream of air through the rima glottidis, the variable cleft between them. Movement of the arytenoids alters the shape and size of the wedge-shaped rima glottidis to allow respiration or phonation.
Below the vocal folds, the subglottal space extends to the lower border of the cricoid cartilage.
The entire larynx is innervated by the vagus nerve. The nerve gives off a superior branch that leaves the main trunk high in the neck. Approximately at the level of the hyoid bone, this superior laryngeal nerve divides into an external and an internal branch. The only function of the external branch is the motor innervation of the cricothyroid muscle.
The internal branch passes through a foramen in the thyrohyoid membrane together with the superior laryngeal artery and vein. It provides general sensation, including pain, touch and temperature for the tissue superior to the vocal folds.
The lower part of the larynx is supplied by the recurrent laryngeal nerve. It contains motor fibres to innervate all the intrinsic muscles of the larynx—except for the cricothyroid muscle—as well as both sensory and secretory fibres to the glottis, subglottis and trachea. The right recurrent laryngeal nerve leaves the vagus nerve, which parallels the internal jugular vein, near the point where the brachiocephalic trunk divides. The left recurrent laryngeal nerve emanates from the vagus nerve near the aortic arch. Both branches cross dorsally below the adjacent vessel and ascend laterally next to the trachea. They often terminate in forming an anastomosis with the ipsilateral internal branch of the superior laryngeal nerve.
The larynx has its arterial supply from the superior laryngeal artery, a branch of the superior thyroid artery, which accompanies the internal laryngeal nerve, and by the inferior laryngeal artery from the inferior thyroid artery, which runs parallel to the recurrent laryngeal nerve.
The Tongue
The relaxed tongue takes up most of the space inside the oral cavity. It basically comprises muscles surrounded by a mucous membrane. The posterior one third of the tongue, the root, is attached to the floor of the oral cavity. The mobile anterior two thirds of the tongue is called the body, and the tip is the apex.
The surface, or dorsum, contains numerous projections of the mucous membrane called papillae. They contain taste buds, which can sense five types of sensations: sweet, salty, sour, bitter and umami, which is a savoury meaty flavour. In addition, serous glands of the mucosa secrete some of the fluid of the saliva.
The inferior surface of the tongue is covered by a thin transparent membrane. A large fold of mucosa, called the frenulum, runs down the midline. The ducts of the submandibular salivary glands open at the base of the frenulum.
Internal and external muscles of the tongue
Muscle | Origin | Insertion | Function
Superior longitudinal muscle | Submucosal fibrous layer and septum | Margins of the tongue and mucous membrane | Curls the tongue upwards and shortens it
Inferior longitudinal muscle | Root of the tongue and hyoid bone | – | Curls the tongue downwards and shortens it
Transverse muscle | Septum of the tongue | Lateral margins of the tongue | Narrows and protrudes the tongue
Vertical muscle | Submucosal fibrous layer of the dorsum of the tongue | Inferior surfaces of the borders of the tongue | Flattens and broadens the tongue
Genioglossus | – | Entire dorsum of the tongue and hyoid bone | Protrudes the tongue and assists with other movement
Hyoglossus | – | Inferior and lateral parts of the tongue | Depresses and shortens the tongue
Styloglossus | Styloid process of temporal bone | Posterior parts of the tongue | Retracts the tongue and curls its sides
Palatoglossus | – | Posterolateral parts of the tongue | Elevates the posterior part of the tongue and depresses the soft palate
All muscles of the tongue, except for the palatoglossus, are innervated by the hypoglossal nerve (CN XII). The palatoglossus is innervated by the pharyngeal plexus (CN X).
The tongue receives its blood supply from the lingual artery, a branch of the external carotid artery. It is drained by the lingual veins, which empty into the internal jugular vein.
Swallowing is a complex series of sequential neuromuscular events that are integrated into a smooth and continuous process, which is divided into three stages: oral, pharyngeal and oesophageal.
The oral phase of swallowing can be further subdivided into the oral preparatory and the oral transport phase. In the oral preparatory phase, the lips, tongue, mandible, palate and cheeks act together with salivary flow to bring the food into a consistency and position appropriate for the subsequent phases of swallowing. Once the food bolus is prepared, the oral transport phase occurs, as the musculature of the lips and cheeks contracts, followed by tongue contraction against the hard palate. The soft palate elevates as a consequence of contraction of the tensor veli palatini, levator veli palatini and palatopharyngeus muscles. A reflux of food into the nasal cavity is thereby prevented.
The anterior two thirds of the tongue are critical in the oral phase of deglutition. The posterior one third of the tongue, the tongue base, plays an important role in propelling a food bolus posteriorly towards the pharynx.
The nerves involved so far are the trigeminal nerve (CN V) to control general sensation to the face and motor supply to the muscles of mastication, the facial nerve (CN VII) to supply taste to the anterior two thirds of the tongue and motor function to the lips, the glossopharyngeal nerve (CN IX) to provide general sensation to the posterior third of the tongue and the hypoglossal nerve (CN XII) to enable movements of the tongue.
Once the food bolus touches the palatoglossal folds, the pharyngeal phase of swallowing reflexively begins.
When the swallowing reflex is initiated, the following reactions take place. Velopharyngeal closure prevents reflux of material into the posterior choana. It is effected by contraction of the levator veli palatini muscles, which elevate the soft palate against the posterior nasopharyngeal wall. Medial contraction of the lateral pharyngeal wall musculature and a slight anterior movement of the posterior pharyngeal wall create Passavant’s ridge, against which the velum is approximated during the initiation of the pharyngeal phase of swallowing. The pharyngeal constrictor muscles contract in a superior-to-inferior direction. The epiglottis inverts to cover the larynx and prevent aspiration of contents into the airway. This retroversion of the epiglottis directs the food bolus laterally towards the pyriform sinuses. The vocal folds adduct to prevent aspiration.
With contraction of the superior pharyngeal constrictor muscle, laryngeal elevation occurs. The larynx elevates following the anterior movement of the hyoid bone and tongue base owing to contraction of the mylohyoid, geniohyoid, stylohyoid and anterior digastric muscles. This anterior movement of the larynx combined with the contraction of the middle and inferior constrictor muscles forces the food bolus inferiorly, initiating the final portion of the pharyngeal phase, which is the entry of the food bolus into the cervical oesophagus.
The mylohyoid nerve, a branch of CN V3, supplies the mylohyoid and the anterior digastric muscles. The stylohyoid muscle is innervated by branches of the facial nerve (CN VII), and the geniohyoid muscle receives fibres from the first cervical nerve, which joins the hypoglossal nerve. The pharyngeal constrictors receive their nerve supply through the glossopharyngeal (CN IX) and the vagus (CN X) nerves.
The Ear
The pinna collects sound waves and directs them to the external ear canal. Its shape also partially shields the ear from sound waves that approach from the rear, thereby enabling a person to tell whether a sound is coming from the front or the back.
The external auditory canal (meatus acusticus externus) begins at the bottom of the concha and ends at the tympanic membrane. It is approximately 2.5–3 cm long and slightly S-curved. It is supported by cartilage in its outer third and by bone for the rest of its length. It exhibits two narrowings, one near the inner end of the cartilaginous portion and another, the isthmus, within the osseous part. The whole tube is lined by skin and contains glands whose secretions mix with dead skin cells to produce cerumen (earwax).
Two muscles influence the movements of the middle ear ossicles. The tensor tympani (innervated via a branch of the mandibular nerve), whose tendon inserts on the medial part of the malleus, pulls the malleus medially, tensing the tympanic membrane, damping its vibration and thereby reducing the amplitude of sounds. The stapedius muscle (innervated by a branch of the facial nerve) inserts into the posterior neck of the stapes and reflexively lessens its vibrations by pulling its head backwards.
The Organs of Equilibrium: Vestibulum and Semicircular Canals
The Cochlea (Fig. 1.43)
The basilar membrane and its medial continuation, the osseous spiral lamina, carry the organ of Corti, which changes pressure waves into nervous impulses.
The so-called cochlear transduction process of sound, from air waves to nervous impulses, is the result of several steps.
Vibrations of the eardrum are transmitted through the three ossicles of the middle ear, which induce pressure waves within the scala vestibuli. These pressure waves generate a travelling wave on the basilar membrane of the scala media. The basilar membrane vibrations cause shear movements between the tectorial membrane and the stereocilia of both types of hair cells, resulting in deflection of the stereocilia and activation of ion channels. Thereby the mechanical stimulus is transduced, and receptor potentials are generated. In inner hair cells, these receptor potentials induce neurotransmitter release and action potential generation in the synapses of auditory nerve fibres.
The receptor potentials of the outer hair cells, however, initiate a contraction along the longitudinal axis of the cells, which influences the motion of the basilar membrane (Fettiplace and Hackney 2006; Ashmore 2008). The cells act as a ‘cochlear amplifier’, a mechanism that increases both the amplitude and the frequency selectivity of basilar membrane vibration for low-level sounds. These activities of the outer hair cells can be measured as otoacoustic emissions (OAE). Efferent nerve fibres contact the outer hair cells by crossing the tunnel of Corti medially as radial fibres. They are thought to adjust the resting membrane potential of these cells, thereby regulating the amount of feedback provided to the basilar membrane.
Auditory Pathway and Vestibular Tracts
Each of the nerve fibres bifurcates: one branch projects rostrally to the dorsal cochlear nucleus, and the other projects caudally to the ventral cochlear nucleus. The cochlear nuclei contain second-order neurons, which generally project to higher centres by an ipsilateral or, after decussating, by a contralateral pathway.
The ventral cochlear nucleus projects to the superior olivary complex, whereas fibres of the dorsal cochlear nucleus bypass the superior olivary complex and directly enter the lateral lemniscus to reach the inferior colliculus.
The superior olivary complex consists of the medial nucleus of the superior olive, the lateral nucleus of the superior olive and the medial nucleus of the trapezoid body. Both nuclei of the superior olive receive fibres from the ipsilateral and contralateral ventral cochlear nucleus. The medial nucleus analyses time differences of neuronal signals; the lateral nucleus evaluates differences in intensities. On their way to the nuclei of the superior olive, the contralateral fibres pass through the trapezoid body as a passive relay. In humans, as in other mammals, its nuclei are nevertheless thought to play a role in distinguishing inter-aural intensity differences.
The superior olivary complex sends outputs to the cranial nerves V and VII for reflex contractions of the tensor tympani and stapedius muscles to dampen loud sounds.
The fibres connecting the olivary complex with the inferior colliculus form a lateral tract in the brainstem, called the lateral lemniscus. Within this tract a mass of grey matter is embedded, called the nuclei lemnisci lateralis dorsalis and ventralis. They serve as synaptic relay stations for some of the fibres of the lateral lemniscus.
The end point of the lateral lemniscus is the inferior colliculus. It serves as an auditory relay and reflex centre, where information derived directly from the dorsal cochlear nucleus and from the olivary complex may be compared. Moreover, it receives inputs from the somatosensory system. The inferior colliculi of both sides contact each other by commissural fibres.
The inferior colliculus projects to the medial geniculate nucleus or medial geniculate body. This nucleus is part of the thalamus and serves as a thalamic relay on the way to the auditory cortex. On the basis of neuronal morphology, a number of subdivisions can be distinguished within it. They have different afferent and efferent connections, and together they are thought to be involved in the direction and maintenance of attention. The fibres leaving the medial geniculate body join the internal capsule as the radiatio acustica and terminate in the primary auditory cortex, Brodmann’s areas 41 and 42 within the superior temporal gyrus.
Wave I is a response of the auditory nerve action potential in the distal portion of the cochlear nerve (cranial nerve VIII).
Wave II is generated by the proximal part of the cochlear nerve as it enters the brainstem.
Wave III arises from second-order neuron activities in or near the cochlear nucleus.
Wave IV is said to arise from the superior olivary complex, but additional contributions may come from the cochlear nucleus and nucleus of lateral lemniscus.
Wave V is believed to originate from the vicinity of the inferior colliculus.
Waves VI and VII are suggested to have their origins in the medial geniculate body of the thalamus.
Beginning in the auditory cortex, several descending pathways exist. The centrifugal fibres run close to, but usually not within, the tracts containing the auditory afferent pathways. They project to the medial geniculate bodies, the inferior colliculi and other midbrain nuclei; from the inferior colliculi, they descend to the superior olivary complex. From here the olivo-cochlear bundles travel down to the inner and outer hair cells of the cochlea. They carry information from the superior olivary complex, the nuclei of the lateral lemniscus, the reticular formation and the inferior colliculus. These bundles have two compartments: a lateral one, containing unmyelinated fibres, which end on the afferent fibres of the inner hair cells, and a medial one with myelinated fibres, which medially crosses the tunnel of Corti and terminates directly on the bodies of the outer hair cells.
The pathway of the acoustic startle reflex is part of the auditory pathway from the ear up to the nucleus of the inferior colliculus. Mainly from there, but also from the dorsal nucleus of the lateral lemniscus and from parts of the superior olivary complex, it follows connections to nuclei of cranial nerves and motor centres in the reticular formation. The motoneurons of the ventral root are activated, and body movement is initiated, by the reticulospinal tract in the anterolateral part of the spinal cord. The inferior colliculi also project to the superior colliculi, important relay stations in the visual pathway, which coordinate optical reflexes.
Learned reflexes (e.g. turning upon hearing one’s name) are conditioned reflexes. They include the auditory pathway and other complex parts of the central nervous system.
The Vestibular Tracts
Ascending fibres of the vestibular nuclei join the medial longitudinal fascicle and enter the interstitial nucleus of Cajal (Nieuwenhuys et al. 2008), which serves as a coordination centre for eye and head movements. Over this route, the vestibular nuclei also project to the ipsilateral and contralateral nuclei of the ocular muscles: the nuclei of the oculomotor, trochlear and abducens nerves. These nerves supply the six external muscles of the eye, which act in three perpendicular planes roughly aligned with the planes of the semicircular canals. Excitatory subunits connect a single semicircular canal to the eye muscles, initiating a compensatory eye movement in the plane of that semicircular canal, whereas their antagonists are inhibited.
The vestibulo-collic reflex produces head movements in the planes of the stimulated canals. More than 30 neck muscles are innervated from the upper cervical cord and receive excitatory or inhibitory inputs, or both, from all six semicircular canals. The motoneurons are contacted by the vestibulospinal system, including lateral, medial and crossed tracts.
Descending fibres form the vestibulospinal tracts and join the reticulospinal tract, by which they reach the motor neurons of skeletal muscles, predominantly activating the extensor and inhibiting the flexor muscles.
One main target of the vestibular tracts is the cerebellum. By mossy fibres, they reach a special region, the vestibular cerebellum, mainly consisting of the flocculonodular lobe, combining the nodulus and the flocculus. The efferent fibres of this region act synergistically on the oculomotor and spinal motor systems to maintain balance in upright movements (Trepel 2004).
Neuroanatomical Basics of Language
The cerebrum is divided into four main lobes. The frontal lobe is separated from the parietal lobe by the central sulcus and houses capabilities such as planning, motivation, working memory and motor functions, including speech production. The parietal lobe is the main sensory part of the brain. The gyri immediately adjoining the central sulcus are the precentral gyrus, the origin of the pyramidal tract, on the frontal (motor) side and the postcentral gyrus, the primary sensory centre, on the parietal side. The parieto-occipital sulcus separates the parietal lobe from the occipital lobe, which contains, among others, the visual areas. The lateral sulcus (Sylvian fissure) separates the temporal lobe from the frontal and parietal lobes. The auditory centre is located in the temporal lobe, where the acoustic radiation, emanating from the medial geniculate body, terminates.
The whole cerebrum has been mapped by Korbinian Brodmann (1868–1918) according to the cytoarchitectonic patterns of its various parts. These areas are widely used to describe functional compartments of the brain.
Vocalising language requires motor outputs from the lower regions of the precentral primary motor cortex (motor areas for the face, mouth, tongue and larynx). The planning and coordination of these motor outputs originate in Broca’s area, which lies in Brodmann’s areas 44 and 45. It adjoins the lower part of the precentral cortex, receives inputs from Wernicke’s area via the arcuate fasciculus and projects to the above-mentioned primary motor areas of both hemispheres (to the contralateral side via the corpus callosum), as well as to the basal ganglia, which are thought to play a role in giving timing cues for the correct motor activity during speech (Fig. 1.47).
1.3 Physics, Acoustics, Psychoacoustics
As phoniatrics is strongly linked to the phenomena of sound and sound perception, a thorough understanding of the nature of sound is essential for interpretation of diagnostic results in phoniatrics as well as for the application of therapeutic approaches to hearing and speech pathologies.
1.3.1 Sound Waves
In contrast to light waves or other electromagnetic waves present in our everyday environment, sound waves are mechanical vibrations of small particles (typically molecules) of a medium (e.g. air, water). Sound waves are created by a vibrating object, called the sound source. The vibration of a sound source results in a periodic or aperiodic compression and rarefaction of the particles of the medium, causing local fluctuations of the density that propagate through the medium. Hence, a sound wave can be thought of as a periodic or aperiodic displacement of interacting particles that travels through the medium from one location to another, resulting in a transport of energy. If the particles vibrate along the axis of sound propagation, the wave is called longitudinal; if they vibrate perpendicular to the axis of propagation, the wave is called transverse. The medium is the material through which the wave propagates, which can be a gas, a liquid or a solid substance. Without a medium, sound cannot be produced.
Sound waves in air and water are always longitudinal, and they travel through the medium at a speed which is specific for the medium. In air the speed of sound at sea level is 340 m/s; in water it is more than four times higher, i.e. 1484 m/s, and in solids it is about 15 times higher. Most often the perception of sound in humans is based on the propagation of sound waves in air.
1.3.2 Physical Properties of Sound
Sound is primarily characterised by wavelength, amplitude and frequency. The wavelength (λ, lambda) is defined as the distance between two successive wave crests or troughs of a sound wave, the amplitude (A) as the maximum deviation from the resting pressure. The frequency (f), which determines the pitch of a tone, is defined as the number of pressure fluctuations per unit time. Frequency, wavelength and sound velocity (c) are related to each other by the following equation (Eq. 1.1):
$$c = f \cdot \lambda \tag{1.1}$$
The relevant measurement for characterising the strength of sound waves is sound pressure (p). It is defined as the local pressure deviation from the ambient atmospheric pressure caused by a sound wave. Its unit is the pascal (Pa). Pressure deviations caused by sound waves are very small relative to atmospheric pressure. The human ear is able to perceive them in the range between about 20 μPa (20 × 10⁻⁶ Pa) and 20 Pa, where 20 μPa approximately corresponds to the sound pressure at the hearing threshold of a 1000 Hz tone. Owing to the huge range of our ear’s auditory sensitivity, measurements and calculations in Pa are laborious. For this reason, a handier measure has been introduced: the sound pressure level (SPL) with the unit decibel (dB SPL). The sound pressure level is calculated by a logarithmic transformation of the measured sound pressure relative to the reference pressure p₀ = 20 μPa according to the formula (Eq. 1.2):
$$\mathrm{SPL} = 20 \cdot \log_{10}\left(\frac{p}{p_0}\right)\ \text{dB}, \qquad p_0 = 20\ \mu\text{Pa} \tag{1.2}$$
According to this formula, the hearing threshold for a 1000 Hz tone lies at 0 dB SPL (as measured by a sound level meter). Every increase in sound pressure level by 20 dB indicates an increase in sound pressure by the factor 10. Owing to the logarithmic nature of the dB scale, its values cannot simply be added or subtracted but must be retransformed into Pa before arithmetic operations can be made. By doing so, one finds that doubling the sound pressure of a sound source leads to an increase in SPL of 6 dB.
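The logarithmic relationship between sound pressure and sound pressure level can be checked numerically. The following is a minimal sketch, assuming the reference pressure of 20 μPa that corresponds to 0 dB SPL at the hearing threshold; the function names are illustrative and not taken from any standard library.

```python
import math

P_REF = 20e-6  # reference sound pressure in Pa (20 µPa, hearing threshold at 1000 Hz)


def spl_from_pressure(p_pa):
    """Return the sound pressure level in dB SPL for a sound pressure in Pa."""
    return 20.0 * math.log10(p_pa / P_REF)


def pressure_from_spl(level_db):
    """Return the sound pressure in Pa for a level given in dB SPL."""
    return P_REF * 10.0 ** (level_db / 20.0)


# Lower and upper end of the pressure range quoted in the text.
threshold_db = spl_from_pressure(20e-6)  # about 0 dB SPL
upper_db = spl_from_pressure(20.0)       # about 120 dB SPL

# Doubling the sound pressure raises the level by about 6 dB.
doubling_gain_db = spl_from_pressure(0.04) - spl_from_pressure(0.02)

# A tenfold increase in sound pressure raises the level by 20 dB.
tenfold_gain_db = spl_from_pressure(0.2) - spl_from_pressure(0.02)

print(round(threshold_db, 2), round(upper_db, 2))
print(round(doubling_gain_db, 2), round(tenfold_gain_db, 2))
```

The range from 20 μPa to 20 Pa thus spans roughly 120 dB, which illustrates why the logarithmic scale is the more practical one.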
1.3.3 Propagation of Sound
In gases and liquids, sound propagates as longitudinal waves, whereas in solids transverse propagation of sound waves is also found. It should be noted that propagation of sound does not mean the movement of particles but only transport of energy. During propagation, the waves can be reflected, diffracted or attenuated by the medium. The modalities of sound propagation depend on the properties of the medium through which the sound travels. Additional factors that influence sound propagation are obstacles along the wave’s pathway and changes of the type of medium.
The fact that sounds can be heard around corners and barriers involves the mechanisms of diffraction and reflection. When diffraction occurs, the sound can be thought of being ‘bent around’ an obstacle. This effect occurs mainly with low frequencies (for which the wavelength is of the order of the size of the obstacle) and is the reason we hear low frequencies better than high frequencies behind obstacles. With high frequencies (with wavelengths much smaller than the size of the obstacle) the so-called shadow effect occurs, meaning that they appear attenuated behind the obstacle. The shadow effect caused by our head helps our auditory system to recognise the direction from which high-frequency sounds come.
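Whether diffraction or the shadow effect dominates depends on how the wavelength compares with the size of the obstacle. The sketch below computes wavelengths from the relation between sound velocity, frequency and wavelength (λ = c / f), using the speed of sound in air of 340 m/s given earlier. The head diameter of 0.18 m is an illustrative assumption, not a value from the text.

```python
SPEED_OF_SOUND_AIR = 340.0  # m/s, value for air at sea level used in this chapter
HEAD_DIAMETER_M = 0.18      # assumed approximate adult head diameter (illustrative)


def wavelength_m(frequency_hz, speed_m_s=SPEED_OF_SOUND_AIR):
    """Return the wavelength in metres for a given frequency in Hz."""
    return speed_m_s / frequency_hz


for f in (100, 1000, 10000):
    lam = wavelength_m(f)
    # Wavelengths larger than the obstacle bend around it (diffraction);
    # much smaller wavelengths are attenuated behind it (shadow effect).
    regime = "diffraction" if lam > HEAD_DIAMETER_M else "head shadow"
    print(f"{f:>6} Hz: wavelength {lam:.3f} m -> {regime}")
```

Under these assumptions a 100 Hz tone has a wavelength of 3.4 m, far larger than the head, whereas a 10,000 Hz tone has a wavelength of 3.4 cm and is shadowed by it.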
For recognising the direction whence low-frequency sounds originate, our auditory system evaluates the time difference between a sound’s arrival at the ipsilateral and the contralateral ear (inter-aural time difference).
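Another property of a medium that is important for sound propagation is the characteristic acoustic impedance (Z). Z determines how much sound pressure (p) is generated by a vibration of molecules of a particular acoustic medium at a given frequency. It is defined as the ratio of sound pressure to particle velocity (v), which for a plane wave equals the product of the density of the medium (ρ) and the sound velocity (c):
$$Z = \frac{p}{v} = \rho \cdot c$$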
To overcome the huge difference in impedances, impedance-matching is accomplished by the ossicles of the middle ear. This matching, interpreted as ‘mechanical amplification with no input of additional energy’, ensures that the sound is effectively transmitted from the ear canal to the fluids of the inner ear.
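The size of the mismatch can be illustrated with a rough calculation. The sketch below assumes typical textbook values for the characteristic impedance of air (about 413 Pa·s/m at 20 °C) and of water (about 1.48 × 10⁶ Pa·s/m), with water standing in for the inner ear fluids, and it uses the standard formula for the fraction of sound intensity transmitted at normal incidence across a boundary between two media. None of these numbers are taken from the text.

```python
import math

Z_AIR = 413.0     # Pa·s/m, assumed characteristic impedance of air at about 20 °C
Z_WATER = 1.48e6  # Pa·s/m, assumed value for water (stand-in for cochlear fluid)


def intensity_transmission(z1, z2):
    """Fraction of sound intensity transmitted from medium 1 into medium 2
    at normal incidence on a plane boundary."""
    return 4.0 * z1 * z2 / (z1 + z2) ** 2


transmitted = intensity_transmission(Z_AIR, Z_WATER)
loss_db = -10.0 * math.log10(transmitted)

print(f"transmitted fraction: {transmitted:.5f}")
print(f"transmission loss:    {loss_db:.1f} dB")
```

With these assumed values only about 0.1% of the incident intensity would cross an air-to-water boundary directly, a loss of roughly 30 dB, which gives an idea of what the middle ear has to compensate.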
1.3.4 Perception of Sound and Psychoacoustics
The way sound is perceived in humans is described by psychoacoustics (Fastl and Zwicker 2007; Moore 1997). More specifically, psychoacoustics is the scientific field concerned with the psychological and physiological responses associated with sound, including speech and musical stimuli.
The range of frequencies perceived by the human ear reaches from 16 to 20,000 Hz in normal-hearing young individuals. The upper frequency limit of hearing declines with age, an effect called presbyacusis.
As the human ear is not equally sensitive to all frequencies, two tones of different frequency are not perceived as equally loud when presented at an equal sound pressure level. For instance, a 1000 Hz tone presented at 20 dB SPL is perceived as being as loud as a 63 Hz tone presented at 60 dB SPL. Subjectively rated equal loudness levels of different tones are reported in specific curves called isophones or equal loudness contours. The unit of the loudness level of isophones is the phon, where the phon values at 1000 Hz equal the values in dB SPL. Whereas isophones exhibit a very steep dependence on frequency at low sound pressure levels, they show a much flatter dependence at high sound pressure levels.
Another way to judge the intensity of the perceived loudness is by using a ‘categorical’ scale that encompasses the whole range of categories by which we determine the loudness of everyday sounds. The loudness categories are defined in terms such as ‘inaudible’, ‘very soft’, ‘soft’, ‘medium’, ‘loud’ and ‘too loud’ as a judgement of loudness to stimuli presented at different sound pressure levels. Such scaling is particularly useful for the diagnosis of recruitment (pathological reduction of the auditory dynamic range) and for the fitting of hearing aids and hearing implants.
To account for the different sensitivity of the human ear to different frequencies, especially at low sound levels, an additional scale has been developed, the so-called dB (A) scale. The dB (A) scale weights the sound levels by a filtering function (A-filter), which approximates the sensitivity of the human ear along the isophone curves at 20–40 phon (see Fig. 1.50). When this filter function is applied to a measured sound level, the low frequencies and the high frequencies contribute less to the total level of the weighted sound, thereby mimicking the loudness perception of the human ear. The dB (A) scale is used particularly for quantifying levels of background noise for acoustic testing and for determining levels of noise exposure in natural or working environments.
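The effect of the A-filter can be made concrete with a short calculation. The sketch below uses the analytic A-weighting curve as commonly specified for sound level meters (IEC 61672); this formula and its constants are not given in the text and are included here as an assumption for illustration only.

```python
import math


def a_weighting_db(frequency_hz):
    """Return the A-weighting correction in dB for a pure tone of the given
    frequency, using the analytic curve commonly quoted from IEC 61672."""
    f2 = frequency_hz ** 2
    numerator = (12194.0 ** 2) * f2 ** 2
    denominator = (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    # The +2.00 dB offset normalises the curve to about 0 dB at 1000 Hz.
    return 20.0 * math.log10(numerator / denominator) + 2.00


for f in (63, 100, 1000, 4000, 16000):
    print(f"{f:>6} Hz: {a_weighting_db(f):+6.1f} dB")
```

A tone at 100 Hz is attenuated by about 19 dB relative to a 1000 Hz tone, and very high frequencies are attenuated as well, in line with the reduced contribution of low and high frequencies described above.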
1.3.6 Masking of Sound
The perception of sound signals, e.g. tones or speech, is also influenced by the presence of competing noise, an effect called masking. Masking plays an important role in everyday listening and in audiometry. Generally, the presence of a masking noise (called a masker) raises the hearing threshold of a tone. The masking effect is largest when the frequencies of the signal and of the masker are similar, and it decreases with increasing distance between their frequencies. However, the decrease is not symmetrical with respect to masker frequency: masking is more effective when a low-frequency sound masks a high-frequency signal than vice versa. In this case an effect called upward spread of masking occurs. To attain the opposite effect, the masker (i.e. the high-frequency sound) must be of an intense sound level. Owing to the nature of the human cochlea, upward spread of masking is more prominent and is important for understanding speech in noisy environments. It has to be particularly considered for the fitting of hearing aids and hearing implants.
Depending on the masker’s frequency spectrum and on its temporal properties (e.g. gaps between sounds), the perception of signals can be influenced in a variety of ways. Masking is also classified according to the temporal relationship between masker and signal as simultaneous masking, forward masking or backward masking.
1.3.7 Binaural Hearing
Compared with monaural hearing, binaural hearing allows the auditory system to make better use of the information contained in sounds. Main advantages of binaural hearing are (1) the ability to localise sound sources and (2) the ability to achieve improved speech intelligibility in noise.
Differences in arrival time of sounds between the right and the left ear (inter-aural time difference, ITD).
Differences in the levels of high-frequency sounds between the right and the left ear (inter-aural level difference, ILD).
Asymmetrical spectral reflections from various parts of the body, including torso, shoulders and pinnae.
For detecting the spatial origin of low-frequency sound (below about 800 Hz), the auditory system evaluates inter-aural time differences (ITD), e.g. a sound coming from the right side arrives earlier at the right ear than at the left ear. In contrast, for high frequencies (above about 1600 Hz), the inter-aural level differences (ILD) are predominantly evaluated by the brain. The ILDs are mainly due to the head shadow effect, as a sound from the right produces a higher sound pressure level at the right ear than at the left ear.
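The order of magnitude of the inter-aural time difference can be estimated with a simple geometric model. The sketch below uses Woodworth’s spherical-head approximation, ITD = (a / c)(θ + sin θ), which is not part of the text; the head radius of 0.0875 m and the speed of sound of 343 m/s are assumed illustrative values.

```python
import math

HEAD_RADIUS_M = 0.0875  # assumed head radius (illustrative)
SPEED_OF_SOUND = 343.0  # m/s, assumed speed of sound in air at about 20 °C


def itd_seconds(azimuth_deg, radius_m=HEAD_RADIUS_M, c=SPEED_OF_SOUND):
    """Inter-aural time difference for a distant source at the given azimuth
    (0 = straight ahead, 90 = directly to one side), using Woodworth's
    spherical-head approximation."""
    theta = math.radians(azimuth_deg)
    return (radius_m / c) * (theta + math.sin(theta))


for angle in (0, 15, 45, 90):
    print(f"{angle:>3} deg: ITD = {itd_seconds(angle) * 1e6:6.1f} microseconds")
```

Under these assumptions the time difference grows from zero for a frontal source to roughly 0.65 ms for a source directly to one side, which indicates the very small time scales the auditory system has to resolve.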
Localisation in the median plane is possible because the human outer ear (pinna and ear canal) forms acoustic filters that are sensitive to the direction of the incoming sound. Different resonances of these filters impose direction-specific patterns on the frequency responses of the ears, which are evaluated by the brain. The combination of these patterns with other direction-selective reflections at the head, shoulders and torso forms the so-called head-related transfer functions (HRTF), which are specific for each individual. In closed rooms, not only the direct sound from a sound source reaches the ears but also sound that has been reflected from the walls. When localising a sound source under such conditions, the auditory system analyses only the original direct sound (which arrives first at the ear), not the reflected sound (which arrives later). Because the analysis of the reflected sound is suppressed, this feature is called the ‘law of the first wave front’.
Speech Intelligibility in Noise
Hearing in noise is one of the more challenging tasks in today’s acoustic environment, as background noise is nearly always present in daily listening situations.
The presence of noise may substantially mask acoustic signals, in particular speech. The most familiar example is the cocktail party situation, in which one talker has to be followed among many competing voices. Binaural hearing and proper functioning of the inner ear are crucial for understanding speech in such noisy environments. The brain’s comparison of the right and left auditory inputs results in a significant internal noise reduction, called binaural release from masking.
Speech intelligibility in noise is most often quantified by the signal-to-noise ratio, i.e. the ratio between the sound level of the signal (speech) and that of the noise, at the point where the listener achieves 50% speech intelligibility. The signal-to-noise ratio is an important measure for assessing the benefit of hearing aids or cochlear implants. Besides better localisation, improved hearing in noise is the main argument for binaural fitting of hearing prostheses.
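The measure described above can be sketched in a few lines. Because sound levels are expressed in decibels, the ratio of signal to noise becomes a difference of levels. The example then estimates the signal-to-noise ratio at 50% intelligibility by linear interpolation between measured points. The function names and all data values are hypothetical and serve only as illustration; clinical speech-in-noise tests typically use adaptive procedures instead.

```python
def snr_db(speech_level_db, noise_level_db):
    """Signal-to-noise ratio in dB: a ratio of intensities is a difference of levels."""
    return speech_level_db - noise_level_db


def snr_at_target(snrs_db, scores_percent, target=50.0):
    """Interpolate linearly to the SNR at which the score equals `target` percent."""
    points = sorted(zip(snrs_db, scores_percent))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= target <= y1 and y1 != y0:
            return x0 + (target - y0) * (x1 - x0) / (y1 - y0)
    raise ValueError("target score is not bracketed by the measured data")


if __name__ == "__main__":
    # Hypothetical scores (percent correct) measured at five fixed SNRs.
    snrs = [-12, -9, -6, -3, 0]
    scores = [5, 20, 60, 90, 98]
    print(f"Speech at 65 dB in noise at 60 dB: SNR = {snr_db(65, 60)} dB")
    print(f"SNR at 50% intelligibility: {snr_at_target(snrs, scores):.2f} dB")
```

A lower (more negative) value means that the listener tolerates more noise. Comparing the value obtained with one ear and with both ears would give one estimate of the binaural benefit.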
1.4 Concise Overview of the Physiology of Hearing
1.4.1 Anatomy and Physiology of the Hearing Organ
The hearing organ can be divided into four parts:
The external ear, comprising the auricle and the external auditory canal. This part of the ear collects airborne sounds and conducts them to the entrance of the middle ear, the tympanic membrane.
The middle ear, an air-filled cavity that comprises the tympanic membrane and the ossicles coupled to it. The middle ear system transforms the sound waves into mechanical vibrations that are led to the entrance of the inner ear, the oval window.
The inner ear or cochlea, a fluid-filled, spiral-shaped organ that contains the sensory cells. The mechanical vibrations of the middle ear system are propagated as vibrations through the cochlear fluids. These vibrations activate the sensory cells in the cochlea. In fact, the vestibular system and the cochlea form one organ, often referred to as the audiovestibular system. The vestibular system is addressed in Sects. 1.2 and 16.15.
The auditory neural system, which transports the action potentials generated by the activated sensory cells to the brainstem and the (sub)cortical auditory areas.
Looking in more detail (Fig. 1.37 from Sect. 1.2) and starting with the outer ear, the auricle or pinna is characterised by folds that play an important part in processing the high frequencies used for localising sound sources, in particular in the vertical plane. Furthermore, the auricle is important for separating frontal and rear sources because its shape causes sounds from the front and from the rear to be diffracted differently, resulting in slight but detectable differences in the frequency spectrum.
The tympanic membrane at the end of the external canal has a relatively low impedance, comparable to the acoustic impedance of air. As a result, airborne sound waves vibrate the tympanic membrane efficiently. The ossicular chain transforms its low-force but relatively high-amplitude vibrations into more forceful vibrations with low amplitude. The system thus works as an ‘impedance’ match that effectively transfers the acoustic energy of the airborne sound wave into mechanical energy of the fluid wave in the cochlea, a passive mechanical transformation that requires no additional energy. The ossicular system is connected to the tympanic membrane via the malleus at one end and to the cochlea at the other. The last ossicle, the stapes, moves piston-wise in the oval window of the cochlea (Slis and Snik 1997a).
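The impedance-matching action of the middle ear can be illustrated with a rough calculation. The sketch below combines two commonly cited contributions: the area ratio between the tympanic membrane and the stapes footplate, and the lever ratio of the ossicular chain. The numerical values are typical textbook figures assumed here for illustration, not values given in this chapter, and the function name is hypothetical. The real transfer function also depends strongly on frequency.

```python
import math

TM_EFFECTIVE_AREA_MM2 = 55.0  # assumed effective area of the tympanic membrane
FOOTPLATE_AREA_MM2 = 3.2      # assumed area of the stapes footplate
OSSICULAR_LEVER_RATIO = 1.3   # assumed lever ratio of malleus to incus


def middle_ear_pressure_gain(tm_area=TM_EFFECTIVE_AREA_MM2,
                             footplate_area=FOOTPLATE_AREA_MM2,
                             lever_ratio=OSSICULAR_LEVER_RATIO):
    """Idealised pressure gain: the same force on a smaller area, times the lever ratio."""
    return (tm_area / footplate_area) * lever_ratio


if __name__ == "__main__":
    gain = middle_ear_pressure_gain()
    gain_db = 20 * math.log10(gain)  # pressure ratio expressed in decibels
    print(f"pressure gain of about {gain:.1f} times, or about {gain_db:.0f} dB")
```

The gain is a gain in pressure, not in power: the transformation is passive, so the increase in pressure at the oval window is paid for by a smaller displacement.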
At the base of the cochlea, the basilar membrane is relatively thick and narrow. At the apex, it is relatively thin and broad. Between these two extremes, the width and thickness of the membrane change approximately linearly. As a result, the stiffness of the membrane is much higher at the base than at the apex. For high frequencies, the basilar membrane is most mobile near the base, while for low frequencies, the highest mobility is found near the apex. The mobility of the basilar membrane is therefore frequency-dependent along its length from base to apex: when the membrane is stimulated, the amplitude of the travelling wave peaks at a certain distance from the base, and this distance is determined by the frequency of the sound. This is referred to as the tonotopic organisation of the basilar membrane.
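The relation between frequency and place can be sketched with Greenwood's frequency-position function, a published empirical fit for the human cochlea that this chapter does not itself cite. The constants below are the commonly quoted human values, and the cochlear length of 35 mm is an assumed round figure. The function names are hypothetical.

```python
import math

GREENWOOD_A_HZ = 165.4     # scaling constant of the human fit
GREENWOOD_SLOPE = 2.1      # slope, with position as a proportion of length from the apex
GREENWOOD_K = 0.88         # low-frequency correction constant
COCHLEA_LENGTH_MM = 35.0   # assumed length of the basilar membrane


def characteristic_frequency(distance_from_base_mm, length_mm=COCHLEA_LENGTH_MM):
    """Frequency (Hz) at which the travelling wave peaks at the given place."""
    x = 1.0 - distance_from_base_mm / length_mm  # proportion measured from the apex
    return GREENWOOD_A_HZ * (10 ** (GREENWOOD_SLOPE * x) - GREENWOOD_K)


def place_from_base_mm(frequency_hz, length_mm=COCHLEA_LENGTH_MM):
    """Inverse mapping: distance from the base (mm) of the peak for a frequency."""
    x = math.log10(frequency_hz / GREENWOOD_A_HZ + GREENWOOD_K) / GREENWOOD_SLOPE
    return length_mm * (1.0 - x)


if __name__ == "__main__":
    for frequency in (250, 1000, 4000, 8000):
        distance = place_from_base_mm(frequency)
        print(f"{frequency:>5} Hz peaks about {distance:.1f} mm from the base")
```

With these assumptions, high frequencies map close to the base and low frequencies close to the apex, in line with the stiffness gradient described above.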
The organ of Corti is situated on top of the basilar membrane (see Fig. 1.44 taken from Sect. 1.2), close to the modiolus. Along the full length of the basilar membrane, this organ contains four parallel rows of hair cells: one row of inner hair cells (closest to the modiolus) and three rows of outer hair cells.
Inner and outer hair cells are situated on either side of a rather solid ridge, the tunnel of Corti. Each hair cell is connected to neurons: the inner hair cells mainly to afferent neurons and the outer hair cells mainly to efferent neurons. The differences in anatomy and neural connections suggest that inner and outer hair cells have different functions. The tectorial membrane lies above the hair cells and is connected (only) to the cochlear wall near the modiolus. The tips of the cilia of the hair cells are connected to the tectorial membrane, such that they bend when the basilar membrane moves relative to the stiff tectorial membrane. The connection of the cilia to the tectorial membrane is tighter for the outer hair cells than for the inner hair cells (Green 1976).
The endolymph in the scala media and the perilymph in the other two scalae have different ion concentrations. This causes a potential difference of about 80 mV across the membranes that separate the scala media from the other two scalae, namely the basilar membrane and the membrane of Reissner. This potential difference is kept constant by ion pumps in the stria vascularis, which covers the outer wall of the scala media. Positively charged ions are pumped from the perilymph to the endolymph. When the cilia of a hair cell bend, an influx of positive ions depolarises the hair cell, which subsequently leads to an action potential in the afferent neuron connected to that hair cell.
Inner hair cells are the main sensors. However, only loud sounds stimulate the cilia of the inner hair cells forcefully enough to cause depolarisation. Outer hair cells have another function: apart from sensory properties, they also have motor properties. These cells can amplify the relatively small movements of the basilar membrane caused by soft sounds (the ‘active mechanism’ or ‘cochlear amplifier’, which requires the input of additional energy via the efferent neurons), so that the cilia of the inner hair cells move strongly enough for depolarisation. These motor actions of the outer hair cells are controlled by neural feedback loops (Moore 2008) (see Sect. 22.214.171.124) and are the main locus of the non-linearity of the cochlear response.
1.4.2 Perception of Sounds
1.4.2.1 Fundamental Concepts
Our perception of sound is based on logarithmic scales. For loudness, this has led to the introduction of the decibel scale, which is a logarithmic measure of physical sound pressure. For pitch, we use the octave scale, which is a logarithmic measure of frequency.
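Both logarithmic scales can be made concrete with a short sketch. It assumes the conventional reference pressure of 20 µPa for sound pressure levels in air, which is not stated in the text above; the function names are hypothetical.

```python
import math

REFERENCE_PRESSURE_PA = 20e-6  # assumed: conventional reference for dB SPL in air


def sound_pressure_level_db(pressure_pa, reference_pa=REFERENCE_PRESSURE_PA):
    """Sound pressure level in dB SPL: 20 * log10(p / p_ref)."""
    return 20 * math.log10(pressure_pa / reference_pa)


def interval_in_octaves(low_frequency_hz, high_frequency_hz):
    """Number of octaves between two frequencies: log2(f_high / f_low)."""
    return math.log2(high_frequency_hz / low_frequency_hz)


if __name__ == "__main__":
    for pressure in (20e-6, 2e-3, 0.2, 2.0):
        level = sound_pressure_level_db(pressure)
        print(f"{pressure:g} Pa corresponds to about {level:.0f} dB SPL")
    print(f"125 Hz to 8000 Hz spans {interval_in_octaves(125, 8000):.0f} octaves")
```

Each tenfold increase in sound pressure adds 20 dB, and each doubling of frequency adds one octave, so equal steps on either scale correspond to equal ratios of the physical quantity.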