CHAPTER 129 Physiology of the Auditory System
Sound and Its Measurement
Simple harmonic motion provides the framework for understanding acoustic energy (Fig. 129-1). Simple harmonic motion is a periodic motion that undulates around a null point with equal amplitudes, similar to a sine function. The frequency of a simple harmonic motion is the number of cycles per second and is measured in Hertz (Hz). The period of a cycle is the inverse of its frequency (1/f), and represents the duration of a single cycle. The amplitude is the maximum amount of displacement from the null point in one direction. The sound produced by simple harmonic motion is called a pure tone. In the everyday environment, most sound sources do not produce sounds that follow simple harmonic motion. Any vibration that does not follow simple harmonic motion is said to be complex. If there is a repetitive periodic pattern to the complex vibration, it produces tones. If the complex vibration has no repetitive pattern, it results in noise.
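The relationship between frequency, period, and amplitude described above can be illustrated with a minimal Python sketch. The function names and the 1000-Hz example tone are assumptions chosen for illustration only.

```python
import math


def period_seconds(frequency_hz: float) -> float:
    """Duration of a single cycle: the inverse of the frequency (1/f)."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return 1.0 / frequency_hz


def pure_tone_displacement(amplitude: float, frequency_hz: float, t: float) -> float:
    """Displacement from the null point at time t (seconds) for a pure tone.

    Models simple harmonic motion as a sine function that starts at the
    null point at t = 0.
    """
    return amplitude * math.sin(2.0 * math.pi * frequency_hz * t)


# Hypothetical example: a 1000-Hz pure tone with unit amplitude.
tone_frequency_hz = 1000.0
tone_period_s = period_seconds(tone_frequency_hz)
# One quarter of a cycle after the null point, displacement reaches the amplitude.
peak_displacement = pure_tone_displacement(1.0, tone_frequency_hz, tone_period_s / 4)
```

In this sketch the displacement swings equally above and below zero, which corresponds to the equal amplitudes around the null point in the text.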
Because pressure varies as the square root of intensity, it is necessary to square the pressures in the decibel formula when sound pressure is being used. The formula for determining decibels for sound pressure is $\mathrm{dB} = 10 \log_{10}(P^2/P_r^2) = 10 \log_{10}(P/P_r)^2 = 20 \log_{10}(P/P_r)$, where $P$ is the sound pressure of interest, and $P_r$ is the reference sound pressure. If the sound of interest has 10 times more pressure than the reference sound pressure, the sound of interest is 20 dB louder than the reference. If the sound of interest has 100 times more pressure, it is 40 dB louder than the reference. The most commonly used reference sound pressure is 20 µPa, which is referred to as sound pressure level (SPL). Another reference sound pressure that is occasionally used is hearing level (HL), which is the threshold sound pressure at a specific frequency (as measured in normal subjects); this threshold sound pressure varies across the frequency range.
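The decibel arithmetic above can be checked with a short Python sketch. The function and constant names are assumptions for illustration; the 20-µPa reference is the SPL reference given in the text.

```python
import math

# Reference sound pressure for SPL, from the text: 20 µPa, expressed in pascals.
SPL_REFERENCE_PA = 20e-6


def pressure_ratio_to_db(pressure: float, reference_pressure: float) -> float:
    """Decibels for a sound pressure relative to a reference pressure.

    Uses dB = 20 * log10(P / Pr), the pressure form of the decibel formula.
    """
    if pressure <= 0 or reference_pressure <= 0:
        raise ValueError("pressures must be positive")
    return 20.0 * math.log10(pressure / reference_pressure)


def db_spl(pressure_pa: float) -> float:
    """Sound pressure level in dB SPL for a pressure given in pascals."""
    return pressure_ratio_to_db(pressure_pa, SPL_REFERENCE_PA)


ten_fold_db = pressure_ratio_to_db(10.0, 1.0)       # 10 times the reference pressure
hundred_fold_db = pressure_ratio_to_db(100.0, 1.0)  # 100 times the reference pressure
```

A tenfold pressure ratio gives 20 dB and a hundredfold ratio gives 40 dB, matching the two examples in the paragraph. A pressure equal to the reference gives 0 dB.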
External Ear
Sound localization is achieved by two major mechanisms: interaural time difference and interaural amplitude difference. Because the left and the right ears are located on opposite sides of the head, the time a sound stimulus takes to arrive at each ear is governed by the distance from the sound source to that ear: the greater the distance, the longer the stimulus takes to arrive. The difference in the arrival time of the sound stimulus between the two ears can be used as a cue for sound localization. The difference in amplitude perceived by the two ears can also be used as a cue for sound localization.1,2 This difference in amplitude is increased further by the “head shadow” effect: sound coming from one side is attenuated by the head as it travels to the contralateral ear. The head shadow effect in binaural hearing helps to improve the signal-to-noise ratio in adverse listening environments; one ear can be closer to the source of sound or speech, whereas the contralateral ear is exposed to the background noise. It has been shown that the interaural time difference is important for low-frequency sound localization, whereas the interaural amplitude difference is important for higher frequency sound localization.3
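The dependence of arrival time on source-to-ear distance can be sketched in Python as follows. This is a simplified illustration, not a model from the text: the speed of sound (343 m/s), the ear positions (9 cm either side of the midline), and the straight-line propagation path that ignores the head itself are all assumptions.

```python
import math

SPEED_OF_SOUND_M_S = 343.0    # assumed value for air at room temperature
EAR_HALF_SEPARATION_M = 0.09  # hypothetical distance of each ear from the midline


def interaural_time_difference(source_x_m: float, source_y_m: float) -> float:
    """Right-ear arrival time minus left-ear arrival time, in seconds.

    The ears sit at (-EAR_HALF_SEPARATION_M, 0) and (+EAR_HALF_SEPARATION_M, 0).
    A positive result means the sound reaches the left ear first.
    Straight-line paths are assumed; diffraction around the head is ignored.
    """
    left_distance = math.hypot(source_x_m + EAR_HALF_SEPARATION_M, source_y_m)
    right_distance = math.hypot(source_x_m - EAR_HALF_SEPARATION_M, source_y_m)
    return (right_distance - left_distance) / SPEED_OF_SOUND_M_S


itd_front = interaural_time_difference(0.0, 2.0)   # source straight ahead
itd_left = interaural_time_difference(-2.0, 0.0)   # source directly to the left
```

Under these assumptions a source straight ahead produces no time difference, whereas a source directly to one side produces a difference of roughly half a millisecond, with the nearer ear leading.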
Middle Ear Mechanics
The middle ear is composed of the tympanic membrane, the ossicles (malleus, incus, and stapes), and the stapedius and tensor tympani muscles. As a sound stimulus enters the external auditory canal, it causes the tympanic membrane to vibrate. The malleus, which is coupled to the tympanic membrane, vibrates in response to the motion of the tympanic membrane. This causes the entire ossicular chain to vibrate, resulting in sound transmission to the inner ear via the stapes footplate. This pathway of sound transmission is referred to as ossicular coupling.4 The ossicular chain has two synovial joints that are mobile: the incudomalleal and the incudostapedial joints.5 The ossicular chain vibrates along an axis that projects through the head of the malleus and the body of the incus in an anterior-to-posterior direction (Fig. 129-2). The stapes, the smallest bone in the body, transmits the output of the middle ear into the inner ear through the oval window.
Because the inner ear is fluid-filled, if the sound stimulus strikes the inner ear fluid directly, most of the acoustic energy is reflected, as the impedance of fluid is much greater than the impedance of air. The pathway of sound transmission to the inner ear in the absence of the ossicular system is referred to as acoustic coupling.4 It has been shown that the difference between ossicular coupling and acoustic coupling is about 60 dB, which is the maximal amount of hearing loss expected in patients with ossicular discontinuity.6 The middle ear plays an important role in the process of “impedance matching” between the air-filled middle ear and the fluid-filled inner ear, allowing for efficient sound transmission. The most important factor in the middle ear’s impedance matching capability comes from the “area ratio” between the tympanic membrane and the stapes footplate (see Fig. 129-2). The human tympanic membrane has a surface area approximately 20 times larger than the stapes footplate (69 vs. 3.4 mm²).7 If all the force applied to the tympanic membrane were to be transferred to the stapes footplate, the force per unit area would be 20 times larger (26 dB) on the footplate than on the tympanic membrane.
A second mechanism for impedance matching is called the lever ratio, which refers to the difference in length of the manubrium of the malleus and the long process of the incus. Because the manubrium is slightly longer than the long process of the incus, a small force applied to the long arm of the lever (manubrium) results in a larger force on the short arm of the lever (incus long process). In humans, the lever ratio is about 1.31 : 1 (2.3 dB).8 The combined effects of the area ratio and the lever ratio give the middle ear output a 28-dB gain theoretically. In reality, the middle ear sound pressure gain is only about 20 dB9; this is mostly due to the fact that the tympanic membrane does not move as a rigid diaphragm. At higher frequencies, it vibrates in a complex manner, with multiple areas that vibrate differently.10 The effective area of the tympanic membrane involved with impedance matching is smaller than its total area. Nevertheless, the 20-dB middle ear sound pressure gain assists sound transmission from the air-filled middle ear into the fluid-filled inner ear.
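The theoretical middle ear gain can be reproduced from the area ratio and the lever ratio with a short Python sketch. The variable names are assumptions for illustration; the numeric values (69 mm², 3.4 mm², and a 1.31 : 1 lever ratio) are the ones quoted in the text.

```python
import math

TYMPANIC_MEMBRANE_AREA_MM2 = 69.0
STAPES_FOOTPLATE_AREA_MM2 = 3.4
LEVER_RATIO = 1.31  # manubrium length relative to the long process of the incus


def pressure_gain_db(ratio: float) -> float:
    """Pressure gain in decibels for a given pressure ratio (20 * log10)."""
    return 20.0 * math.log10(ratio)


area_ratio = TYMPANIC_MEMBRANE_AREA_MM2 / STAPES_FOOTPLATE_AREA_MM2
area_gain_db = pressure_gain_db(area_ratio)    # about 26 dB
lever_gain_db = pressure_gain_db(LEVER_RATIO)  # about 2.3 dB

# Ratios multiply, so their decibel values add.
theoretical_gain_db = area_gain_db + lever_gain_db
```

With the unrounded area ratio (about 20.3) the sum comes to roughly 28.5 dB, consistent with the approximately 28-dB theoretical figure obtained from the rounded values of 26 dB and 2.3 dB.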
Inner Ear Physiology
The inner ear is enclosed in a bony cavity called the otic capsule. It has two mobile windows: the oval and the round windows. The two important functions of the inner ear are hearing and balance. The portion of the inner ear that deals with hearing is the cochlea, and the portion of the inner ear that deals with balance is collectively known as the vestibular organs (semicircular canals, utricle, and saccule). The cochlea is shaped like a snail, and has a spiral configuration with two and a half turns (Fig. 129-3A). The center portion of the spiral is called the modiolus. The portion of the cochlea that is closest to the oval window is referred to as the base, whereas the portion of the cochlea that is farthest away from the oval window is referred to as the apex. The cochlea is a fluid-filled space with three compartments: scala tympani, scala media, and scala vestibuli (Fig. 129-3B). The scala tympani and the scala media are separated by the basilar membrane, and the scala media and the scala vestibuli are separated by Reissner’s membrane. The scala tympani and the scala vestibuli join together at the apex of the cochlea to form the helicotrema.
The scala vestibuli and the scala tympani are filled with perilymph, which has a similar composition to the extracellular fluid (high in sodium, low in potassium) (Fig. 129-4A). The scala media is filled with endolymph, which has a similar composition to the intracellular fluid (low in sodium, high in potassium).11 The unique electrolyte composition of the scala media sets up a large electrochemical gradient, called the endocochlear potential, which is +60 to +100 mV relative to the perilymph (Fig. 129-5).12 The maintenance of such a large electrochemical gradient is performed by the stria vascularis, which resides on the outer wall (away from the modiolus) of the scala media. The stria vascularis contains multiple active ion channels, and maintains the chemical composition of the endolymph and its positive electrical potential.13
von Bekesy12 first described the vibration of the cochlear partition in cadaveric human cochleas. He showed that as the cochlear partition is deflected by the compressional wave created by the stapes footplate vibration, it sets up a traveling wave on the basilar membrane, which travels from the base of the cochlea to its apex (Fig. 129-6). von Bekesy12 also found that the basilar membrane varied in its stiffness along its length, with higher stiffness near the base and lower stiffness near the apex. This property of the basilar membrane allows it to respond to various frequencies differently (i.e., the amplitude of the traveling wave peaks [resonates] at a specific place along the basilar membrane), with the higher frequencies at the base and the lower frequencies toward the apex. The basilar membrane is able to act as a series of filters, responding to specific sound frequencies at specific locations along its length. In other words, the basilar membrane is tonotopically tuned to different frequencies along its length. von Bekesy’s seminal work on cochlear mechanics earned him the Nobel Prize in Physiology or Medicine in 1961.
Although the cochlea is usually thought of as having two mobile windows (the oval and round windows), the possibility of a third mobile window has been proposed.14 More recently, a select group of patients with an air-bone gap on audiologic testing (which is typically associated with middle ear pathology) who do not have any middle ear pathology on intraoperative exploration has been described.15–17 It has been shown that the air-bone gap in this group of patients can be explained by a pathologic “third window” in the inner ear.18 Perhaps the most well-studied example of this phenomenon is seen in superior semicircular canal dehiscence (Fig. 129-7; video clip available on website).
Patients with superior semicircular canal dehiscence often complain of autophony, aural fullness, sound-induced or pressure-induced vertigo, and hearing loss.19–22 It is thought that the dehiscence in the superior semicircular canal acts as a third mobile window of the inner ear (in addition to the oval and round windows), shunting acoustic energy away from the cochlea, resulting in a decreased sensitivity to air-conducted sound and an air-bone gap seen on audiologic testing.20,23–25 This third window is also theorized to decrease cochlear input impedance at the oval window, increasing the pressure gradient across the cochlear partition, and resulting in hypersensitivity to bone-conducted sound.23 Repair, or plugging, of superior semicircular canal dehiscence in some cases results in improvement of the preoperative air-bone gap (see Fig. 129-7). The third window hypothesis has also been used to explain the air-bone gap associated with other temporal bone anomalies, such as large vestibular aqueduct syndrome and other inner ear malformations.26–28
As the cochlear partition is deflected in response to the compressional wave initiated by the stapes, it causes a shearing force between the stereocilia of the hair cells and the tectorial membrane. This shearing force causes a deflection of the hair cell stereocilia. The hair cell stereocilia are arranged in rows, and the rows are arranged in an orderly fashion by height (see Fig. 129-4B). The tip of each stereocilium is connected from one row to another by an elastic filament called the tip link.29 It is thought that as the stereocilia are deflected toward the tallest row, the tip links stretch. The stretch of the tip links causes the opening of stretch-sensitive cationic channels located on the stereocilia (Fig. 129-8).
Because there is a large electrochemical gradient across the apical surface of the hair cells (with a large positive endocochlear potential on one side, and a large negative intracellular potential on the other side), the opening of these stretch-sensitive cationic channels on the stereocilia causes a large influx of cationic current, which leads to hair cell depolarization. As the stereocilia are deflected away from the tallest row, the tip links relax, decreasing the probability of ion channel opening; this leads to hyperpolarization of the hair cell.30–32 The relationship between the degree of stereocilia deflection and hair cell depolarization or hyperpolarization is neither symmetric nor linear. Stereocilia deflection in the depolarization direction produces a greater response than deflection in the hyperpolarization direction (Fig. 129-9).30 The deflection of the hair cell stereocilia and the resulting hair cell depolarization or hyperpolarization represent an important step in the signal transduction process of the hair cell (i.e., the conversion of a mechanical signal [inner ear fluid wave] into an electrochemical signal).
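The asymmetric, nonlinear relationship between deflection and response can be illustrated with a simple Python sketch. The two-state sigmoid (Boltzmann-type) curve used here is an assumption brought in for illustration, not a model given in the text, and the midpoint and slope values are hypothetical numbers chosen only so that few channels are open at rest.

```python
import math

# Hypothetical parameters, chosen for illustration only.
MIDPOINT_NM = 20.0  # deflection at which half of the channels are open
SLOPE_NM = 10.0     # controls how steeply open probability changes


def open_probability(deflection_nm: float) -> float:
    """Assumed channel open probability for a given stereocilia deflection.

    Positive deflection is toward the tallest row (depolarizing direction);
    negative deflection is away from it (hyperpolarizing direction).
    """
    return 1.0 / (1.0 + math.exp(-(deflection_nm - MIDPOINT_NM) / SLOPE_NM))


def response_change(deflection_nm: float) -> float:
    """Change in open probability relative to the undeflected resting state."""
    return open_probability(deflection_nm) - open_probability(0.0)


depolarizing_change = response_change(30.0)     # toward the tallest row
hyperpolarizing_change = response_change(-30.0)  # away from the tallest row
```

In this sketch an equal deflection in each direction gives a much larger change in the depolarizing direction, because only a small fraction of channels is open at rest and closing them further has limited effect.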
Because potassium is the major cation in the endolymph, it is believed that potassium current plays an important role in triggering the signal transduction process in hair cells. When inner hair cells are depolarized, voltage-gated calcium channels open.33