CHAPTER 11 Overview of Diagnostic Imaging of the Head and Neck




Diagnostic medical imaging has changed medical and surgical diagnosis in ways never imagined. Every area of clinical medicine has been affected in a profound way. Radiologists are able, through their consultations, to assist the otolaryngologist in a variety of ways, including providing primary diagnosis, confirming a clinical impression, evaluating regional anatomy and extent of disease, assessing response to treatment, and assisting in definitive treatment of patients.


Neuroradiologists are subspecialty trained radiologists who specialize in the imaging of the head and neck, skull base, temporal bone, brain, and spine. They are the primary imaging consultants for otolaryngologists.


This chapter provides an introduction and overview of head and neck imaging for the otolaryngologist. The various available imaging modalities are discussed. Imaging strategies for various regions and clinical questions are reviewed. The radiologist’s basic approach to image acquisition and interpretation is described so that the referring physician will gain a measure of understanding of this field. This is intended to maximize the usefulness of diagnostic imaging in the care of patients.


The scope of head and neck imaging is too broad a topic to be covered in one chapter. The authors provide the clinician with an outline and brief synopsis of the field. Definitive textbooks for each area of head and neck imaging are available.14



Available Imaging Modalities



Conventional Radiography


Since the discovery of the x-ray, conventional radiography (CR) has been used in imaging the head and neck region. In head and neck imaging, CR has been largely replaced by computed tomography (CT). The traditional projections obtained with CR that are applicable to head and neck imaging are described here.






Computed Tomography


CT was developed for clinical use in the mid-1970s by Hounsfield. CT scanners have evolved over time such that the most advanced scanners now scan in a “helical” fashion, in which the scanner uses a slip-ring technique. This allows the table to move as the scan is performed, resulting in complete volumes of tissue being imaged without skipping tissue between slices. The pitch is the ratio of the distance the table travels during one complete rotation of the CT tube to the width of the collimated x-ray beam. Multidetector CT allows scanning of multiple slices during one rotation of the CT tube, which significantly increases the speed of examination and in turn makes it possible to obtain much thinner slices in a short time. At present, CT scanners can obtain slices 0.5 mm thick. A neck examination from skull base to the mediastinum can be performed in less than 30 seconds. More importantly, enhanced computer technology permits real-time manipulation of these data, allowing multiplanar and various forms of three-dimensional reconstructions without image degradation. Images can be reconstructed in any desired plane regardless of the plane in which they are acquired.


CT uses a tightly collimated x-ray beam that is differentially absorbed by the various body tissues to generate highly detailed cross-sectional images. The degree of attenuation of the x-ray photons for each voxel (smallest imaging unit) is assigned a numeric readout. These units of attenuation are known as Hounsfield units (HU) and generally range from −1000 HU to +1000 HU. Water is assigned a value of 0 HU. Fat is approximately −80 to −100 HU. Calcium and bone are in the 100 to 400 HU range, and most fluids are in the 0 to 30 HU range.
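The Hounsfield scale can be illustrated with a short sketch. It assumes the conventional definition of the HU as the attenuation of a voxel relative to water, scaled by 1000; the attenuation coefficient used for water is an illustrative value only, and the function names are hypothetical. The tissue labels simply restate the approximate ranges quoted above.

```python
# Assumed, illustrative linear attenuation coefficient of water (1/cm).
# The real value depends on the effective beam energy of the scanner.
MU_WATER = 0.19


def hounsfield_units(mu, mu_water=MU_WATER):
    """Convert a linear attenuation coefficient to Hounsfield units.

    Conventional definition: HU = 1000 * (mu - mu_water) / mu_water,
    so water maps to 0 HU and a non-attenuating medium to -1000 HU.
    """
    return 1000.0 * (mu - mu_water) / mu_water


def describe_hu(hu):
    """Rough tissue label based on the approximate ranges in the text."""
    if hu <= -900:
        return "air or gas"
    if -100 <= hu <= -80:
        return "fat"
    if 0 <= hu <= 30:
        return "water or fluid"
    if hu >= 100:
        return "calcium or bone"
    return "other soft tissue or mixed voxel"


if __name__ == "__main__":
    for mu in (0.0, MU_WATER, 0.25):
        hu = hounsfield_units(mu)
        print(f"mu = {mu:.3f} /cm -> {hu:.0f} HU ({describe_hu(hu)})")
```

The labels are deliberately coarse; in practice the ranges overlap, and partial volume averaging within a voxel shifts the measured value.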


To create images from the attenuation values, CT uses complex mathematical reconstruction algorithms. Bone disease and bone trauma are best visualized with a bone detail algorithm whereas a soft tissue algorithm is used to evaluate soft tissue structures (Fig. 11-1).




Computed Tomography Image Display


Images from a given reconstruction algorithm can be displayed in various ways to highlight differences in attenuation of different structures. The window width refers to the range of attenuation values in HU that make up the gray scale for a given image. The window level refers to the center HU value for that given window width. A narrow window width of 80 HU and a level of 40 HU is frequently used for brain imaging because it centers the density at the common density of brain tissue, and displays only those densities 40 HU greater than and 40 HU less than the window level. Thus any density greater than 80 HU will be displayed as white, and any density less than 0 will be displayed as black on the gray scale. Any intermediate density will be spread out evenly along the gray scale. For imaging of the soft tissues of the head and neck, a window level of approximately 40 to 70 HU is usually chosen, at a midpoint approximately equal to the density of muscle. The window width frequently is in the 250 to 400 HU range, thus displaying a wider range of densities including calcification, intravenous contrast, muscle, and fat to best advantage. For imaging bony structures such as paranasal sinuses and temporal bone, window levels from 0 to 400 HU and a wide window width of 2000 to 4000 HU may be chosen. The reason for a wide bone window width is that a wide range of densities ranging from cortical bone (approximately +1000 HU) down to gas (−1000 HU) need to be displayed on the same image. However, structures of intermediate density between bone and gas occupy a narrow range on the gray scale at this window width and are poorly discriminated (appear washed out) on these settings. The terminology commonly used to describe the previously mentioned windows includes soft tissue windows (window width of 250 to 400 HU) and bone windows (2000 to 4000 HU). 
It is important to understand that these display windows are completely independent of the mathematical imaging algorithm chosen for creation of the image. In other words, an image created by a soft tissue algorithm can be displayed with soft tissue and bone window widths (see Figs. 11-1A and C). Conversely, the image may be computer reconstructed using a bone algorithm and displayed with either soft tissue or bone window width (see Fig. 11-1B and D). To optimize the imaging of the soft tissue lesion and the adjacent bone, a soft tissue and a bone algorithm may be used, generating images with the appropriate soft tissue and bone windows. (See also Figs. 11-12A and C).
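The effect of window level and width on the displayed gray scale can be sketched in a few lines. This is a minimal illustration, assuming a simple linear mapping onto an 8-bit gray scale (0 = black, 255 = white); the function name and the 8-bit range are assumptions made for the example, not a description of any particular console.

```python
def window_to_gray(hu, level, width, gray_max=255):
    """Map a CT number (HU) to a display gray value for a given window.

    Values at or below (level - width/2) are shown as black, values at or
    above (level + width/2) as white, and values in between are spread
    linearly along the gray scale.
    """
    lower = level - width / 2.0
    upper = level + width / 2.0
    if hu <= lower:
        return 0
    if hu >= upper:
        return gray_max
    return round((hu - lower) / (upper - lower) * gray_max)


if __name__ == "__main__":
    # Brain window from the text: width 80 HU, level 40 HU.
    for hu in (-10, 20, 40, 60, 100):
        print("brain window:", hu, "HU ->", window_to_gray(hu, 40, 80))

    # Fluid (about 10 HU) versus muscle (about 50 HU) under two windows.
    # The specific level and width values are picked from the quoted ranges.
    soft = [window_to_gray(hu, level=50, width=350) for hu in (10, 50)]
    bone = [window_to_gray(hu, level=300, width=3000) for hu in (10, 50)]
    print("soft tissue window gray values:", soft)
    print("bone window gray values:", bone)
```

With the wide bone window, the two soft tissue densities land only a few gray levels apart, which is why soft tissues appear washed out on bone windows even though the underlying image data are unchanged.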


Multiple options for displaying the image (adjusting the window level and width parameters on the imaging console) and recording it permanently on radiographic film or other media are available. Picture archiving and communication systems (PACS) are becoming the norm for image display and storage.



Patient Cooperation


Patient cooperation is necessary to obtain optimal image quality. The patient is instructed not to swallow and to stop breathing or to maintain quiet breathing during each slice acquisition to minimize motion artifact from the adjacent airway and pharyngeal structures. Occasionally, provocative maneuvers such as blowing through a small straw or using a cheek-puffing (modified Valsalva) maneuver to distend the hypopharynx, or phonating to assess vocal cord movement, may be necessary (Figs. 11-2 and 11-3).




Contrast enhancement often is used to opacify blood vessels and to identify regions of abnormal tissue as identified by abnormal enhancement patterns (Fig. 11-4). As it relates to head and neck imaging, contrast is particularly useful in CT scans of the neck and orbits. Contrast often is not needed in evaluation of the temporal bones, although it can be necessary on occasion. CT of the facial bones and paranasal sinuses does not require intravenous contrast for most common applications.




Radiation Exposure


As a brief review, the radiation exposure (dose) that a patient receives is known as the radiation absorbed dose. This radiation absorbed dose is a measure of the total radiation energy absorbed by the tissues, and it is expressed in an international system (SI) unit known as the Gray (Gy). One Gy is the amount of radiation needed to deposit the energy of 1 Joule (J) in 1 kg of tissue (1 Gy = 1 J/kg). Formerly, the unit used to express radiation absorbed dose was the rad (1 rad = amount of radiation needed to deposit the energy of 100 ergs in 1 g of tissue). The conversion of rads to Gy is 1 Gy = 100 rad.


Radiation dose equivalent is a more useful term as it considers the “quality factor” (Q) of the radiation involved (radiation dose equivalent = radiation absorbed dose × Q). The quality factor considers the varying biologic activity of various types of ionizing radiation. For x-rays, Q = 1. Thus when discussing diagnostic x-rays, the radiation dose equivalent equals the radiation absorbed dose. The SI unit for the radiation dose equivalent is the Sievert (Sv). The former unit was the Roentgen equivalent for man (rem). In summary, for diagnostic x-rays, 1 Gy corresponds to 1 Sv, and 1 Sv = 100 rem.
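The unit relationships above can be restated as a brief sketch. It uses only the conversions given in the text (1 Gy = 100 rad, 1 Sv = 100 rem, dose equivalent = absorbed dose × Q); the function names are hypothetical and chosen for illustration.

```python
RAD_PER_GY = 100.0   # 1 Gy = 100 rad
REM_PER_SV = 100.0   # 1 Sv = 100 rem


def rad_to_gray(dose_rad):
    """Convert an absorbed dose from rad to gray."""
    return dose_rad / RAD_PER_GY


def rem_to_sievert(dose_rem):
    """Convert a dose equivalent from rem to sievert."""
    return dose_rem / REM_PER_SV


def dose_equivalent_sv(absorbed_dose_gy, quality_factor=1.0):
    """Dose equivalent (Sv) = absorbed dose (Gy) x quality factor Q.

    Q defaults to 1, the value given in the text for x-rays.
    """
    return absorbed_dose_gy * quality_factor


if __name__ == "__main__":
    absorbed_gy = rad_to_gray(2.0)           # 2 rad expressed in Gy
    print(absorbed_gy, "Gy absorbed dose")
    print(dose_equivalent_sv(absorbed_gy), "Sv dose equivalent for x-rays")
```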


The radiation dose equivalent depends on the choice of tube voltage and current settings (kVp and mAs), slice thickness, pitch, and gantry cycle time. For a given kVp, the radiation dose equivalent will vary linearly with the mAs. The actual dose will vary slightly among machines. The radiation dose equivalent for a CT examination can be considerably reduced using a low-mAs technique.


The effective dose equivalent was developed as a means of representing the fraction of the total stochastic risk of fatal cancers and chromosomal abnormalities resulting from the irradiation of a particular body part. A system of weighting is used to consider the individual susceptibility of the body’s major tissues and organs. A full discussion of this is beyond the scope of this chapter. Suffice to say that, for a given examination, the effective dose to the patient is less than the dose (radiation dose equivalent) received by the area under examination. Table 11-1 lists common radiologic procedures and their effective dose equivalents.


Table 11-1 Estimated Effective Dose Equivalent of Common Examinations

Examination          Effective Dose Equivalent
Chest radiograph     20 mrem
CT, abdomen          1000 mrem
CT, chest            1000 mrem
CT, brain            120 mrem
CT, sinus            70 to 130 mrem

From Nationwide Evaluation of X-Ray Trends (NEXT) 2000 Survey of Computed Tomography. Frankfort, KY: Food and Drug Administration, Center for Devices and Radiological Health; August 2007. CRCPD Publication E-07-2.
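As an aid to reading Table 11-1, the following sketch converts the tabulated values to millisieverts (1 mSv = 100 mrem) and expresses each examination as a multiple of the chest radiograph value. The data are copied from the table; the data structure and function name are illustrative assumptions.

```python
# Values from Table 11-1 in mrem, stored as (low, high) ranges.
TABLE_11_1_MREM = {
    "Chest radiograph": (20, 20),
    "CT, abdomen": (1000, 1000),
    "CT, chest": (1000, 1000),
    "CT, brain": (120, 120),
    "CT, sinus": (70, 130),
}

MREM_PER_MSV = 100.0  # 1 mSv = 100 mrem


def summarize(table):
    """Return, per examination, the dose range in mSv and the range
    expressed as multiples of the chest radiograph value."""
    chest_mrem = table["Chest radiograph"][0]
    summary = {}
    for exam, (low, high) in table.items():
        summary[exam] = {
            "msv": (low / MREM_PER_MSV, high / MREM_PER_MSV),
            "chest_equivalents": (low / chest_mrem, high / chest_mrem),
        }
    return summary


if __name__ == "__main__":
    for exam, row in summarize(TABLE_11_1_MREM).items():
        print(exam, row["msv"], "mSv;", row["chest_equivalents"],
              "chest radiograph equivalents")
```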



Magnetic Resonance Imaging


Magnetic resonance imaging (MRI) is an imaging modality that uses the response of biologic tissues to an applied and changing magnetic field to generate images. It is not possible to describe the principles of MRI completely in an introductory chapter covering all of head and neck imaging. A brief summary of MRI follows.


Two types of magnets are used to perform clinical MRI: permanent and superconducting. Permanent magnets do not require continual input of energy to maintain the magnetic field. They are composed of large magnetic metallic elements set up to generate a uniform magnetic field between components. Superconducting magnets are electromagnets usually composed of niobium-titanium wire. They require input of energy to start them, but once they are up to strength, they are maintained in a superconductive state by means of an encasing system of liquid nitrogen and liquid helium shells.


The earth has a magnetic field strength of 0.5 Gauss (G). The tesla (T) is another unit of magnetic strength that is related to G by the equation 1 T = 10,000 G. Clinical MRI units usually operate at magnetic field strengths of between 0.3 and 3.0 T. Small-bore research scanners of strengths up to 9.0 T are in use.
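To put these field strengths in perspective, a short sketch converts tesla to gauss and compares clinical magnets with the earth's field. It uses only the figures quoted above (1 T = 10,000 G; earth's field 0.5 G); the function names are hypothetical.

```python
GAUSS_PER_TESLA = 10_000.0
EARTH_FIELD_GAUSS = 0.5


def tesla_to_gauss(field_tesla):
    """Convert a magnetic field strength from tesla to gauss."""
    return field_tesla * GAUSS_PER_TESLA


def multiples_of_earth_field(field_tesla):
    """Express a field strength as a multiple of the earth's field."""
    return tesla_to_gauss(field_tesla) / EARTH_FIELD_GAUSS


if __name__ == "__main__":
    for field in (1.5, 3.0):
        print(field, "T =", tesla_to_gauss(field), "G, about",
              multiples_of_earth_field(field), "times the earth's field")
```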


Many MR pulse sequences are available to generate images. The most common pulse sequences in MRI are the spin-echo and gradient echo techniques.


MRI is one of the most active areas of development and research within diagnostic radiology. MRI derives its signal from hydrogen protons, most abundant in tissue fat and water. When placed in a high magnetic field, the spinning protons are aligned in the direction of the magnetic field. Radio frequency pulses are transmitted into the subject to excite the spinning protons, changing their orientation with respect to the magnetic field. As the protons realign with the magnetic field, they lose energy and give off a signal, which is picked up by coils and reconstructed into an image. The quality of MRI depends on a high signal-to-noise ratio, which is used to improve image contrast and spatial resolution. In general, the higher the field strength of the magnet, the higher the signal-to-noise ratio.


A surface coil is a receiving antenna for the radio frequency signal that is emitted from the imaging subject after the initial radio frequency stimulation. The standard head coil is usually adequate for studying head and neck disease above the angle of the mandible. A head coil allows imaging of the adjacent brain and orbits, an advantage when head and neck lesions extend intracranially. Neck coils cover a larger area from the skull base to the clavicles and come in various configurations, for example, volume neck coil and anterior neck coil. Surface coils significantly improve the quality of head and neck imaging by collecting the signal more effectively, thereby increasing the signal-to-noise ratio, but they can collect signal only from a smaller body part.


Slice thickness on MRI is most commonly 5 mm, with 3-mm sections used for smaller regions of interest. However, a thinner slice has a smaller signal-to-noise ratio. Occasionally, 1- to 2-mm sections may be needed for small structures (e.g., facial nerve), requiring a volume acquisition technique. The number of slices is limited in MRI (as opposed to CT) by the specific sequence used. Covering the entire neck from skull base to superior mediastinum often requires that two separate acquisitions be obtained.



Magnetic Resonance Imaging Artifacts


Motion artifact, chemical shift artifact, susceptibility artifacts from metallic implants (e.g., amalgam, orthodontic implants), and eyelid mascara degrade MRI (Fig. 11-5). Motion artifact becomes more prominent with increased field strength, increased length of individual pulse sequences, and increased total length of the imaging study. A typical imaging sequence may last from 2 to 8 minutes. To limit motion artifact, sequences shorter than 4 minutes are preferred, and the patient should be instructed not to swallow and to breathe shallowly and quietly.



Chemical shift artifact arises from the differences in resonance frequencies of water and fat protons. The result is an exaggerated interface (spatial misregistration) in areas where fat abuts structures containing predominantly water protons such as the posterior globe or a mass. Chemical shift artifact may produce the appearance of a pseudocapsule around a lesion or cause obscuration of a small-diameter structure such as the optic nerve. Chemical shift artifact may be identified as a bright band on one side of the structure and a black band on the opposite side. This is usually most noticeable on T1-weighted images (T1WIs).


Metallic artifact from dental work varies in severity depending on amount and composition of the metal in the mouth, as well as the pulse sequence and field strength of the MRI scanner. Most dental amalgam causes mild distortion to the local magnetic field, resulting in a mild dropout of signal around the involved teeth. Extensive dental work, metallic implants, and braces may cause more severe distortion of the image, precluding visualization of the maxilla, mandible, and floor of the mouth. Mascara containing metallic compounds can also cause localized signal loss in the anterior orbit and globe.



Magnetic Resonance Imaging Pulse Sequences


Numerous pulse sequences are available on clinical MRI units. The details of the physics of MRI may be found in most radiology and MRI textbooks. Commonly used imaging protocols include T1-weighted, spin (proton) density, T2-weighted, gadolinium-enhanced T1-weighted, fat-suppressed, and gradient echo imaging. Magnetic resonance angiography (MRA) is infrequently obtained (Figs. 11-6 and 11-7). The abbreviations used to identify sequence parameters are repetition time (TR), echo time (TE), and inversion time (TI) and are measured in milliseconds. The following description of pulse sequences is intended to assist the clinician in identifying and understanding the commonly performed sequences and in determining their respective use in the head and neck.





T1-Weighted Images


T1-weighted (short TR/short TE) sequences (see Figs. 11-6A and 11-7A) use a short TR (500 to 700 msec) and a short TE (15 to 40 msec). T1-weighted imaging is the fundamental head and neck sequence because it provides excellent soft tissue contrast with a superior display of anatomy, a high signal-to-noise ratio, and a moderate imaging time (4 to 5 min), minimizing motion artifacts. Fat is high signal intensity (bright or white) on T1WIs and provides natural contrast in the head and neck. Air, rapid blood flow, bone, and fluid-filled structures (e.g., vitreous and cerebrospinal fluid [CSF]) are low signal intensity (dark or black) on T1WIs. Muscle is low to intermediate in signal intensity on T1WIs. The inherent high contrast of fat relative to adjacent structures allows excellent delineation of the muscles, globe, blood vessels, and mass lesions that border on fat. The cortical bone is black, and the enclosed bone marrow is bright from fat within the marrow. The aerated paranasal sinuses are black, whereas retained mucus or mass lesions are of low to intermediate signal intensity. Most head and neck mass lesions show a signal comparable to that of muscle on T1WIs. (To quickly identify a T1WI: fat is white, CSF and vitreous are black, and nasal mucosa is low signal.)



T2-Weighted Images


T2-weighted images (see Fig. 11-6C) use a long TR (2000 to 4000 msec) and a long TE (50 to 90 msec) and are sometimes referred to as long TR/long TE images. Note that spin density and T2WI are acquired simultaneously from a single sequence that produces two sets of images with the same TR but different TEs. For example, spin density = 2000/30 and T2WI = 2000/80. T2WIs are most useful for highlighting pathologic lesions. T2WIs show the vitreous and CSF as high signal intensity (bright) relative to the low to intermediate signal intensity of head and neck fat and muscle. Fat loses signal intensity with increased T2 weighting. Most radiologists use a fast spin-echo (FSE) T2WI for head and neck imaging, which provides a much faster acquisition with improved signal-to-noise. Fat remains bright, however, on FSE images. Most head and neck masses are higher signal intensity on a T2WI compared with their low-to-intermediate signal intensity on T1WI. The combination of the T1WI and T2WI is often useful for characterizing fluid-containing structures, solid components, and hemorrhage. Bone, rapid vascular flow, calcium, hemosiderin, and air-containing sinuses are black. Inflammatory sinus disease and normal airway mucosa appear very bright. (To quickly identify a T2WI: CSF, vitreous, and nasal mucosa are bright. Muscle is low to intermediate in signal.)
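The TR/TE conventions for spin-echo images described above can be summarized in a small sketch. The cutoff values (TR of 1000 msec, TE of 45 msec) are assumptions chosen to fall between the ranges quoted in the text; they are not standard thresholds, and the sketch does not apply to gradient echo or inversion recovery sequences.

```python
def classify_spin_echo(tr_ms, te_ms, tr_cutoff_ms=1000.0, te_cutoff_ms=45.0):
    """Label a conventional spin-echo image by its TR/TE combination.

    short TR / short TE -> T1-weighted
    long TR  / short TE -> spin (proton) density
    long TR  / long TE  -> T2-weighted
    The cutoffs are illustrative assumptions, not fixed standards.
    """
    short_tr = tr_ms < tr_cutoff_ms
    short_te = te_ms < te_cutoff_ms
    if short_tr and short_te:
        return "T1-weighted"
    if not short_tr and short_te:
        return "spin (proton) density"
    if not short_tr and not short_te:
        return "T2-weighted"
    return "unconventional (short TR/long TE)"


if __name__ == "__main__":
    # Examples written as TR/TE, as in the text (e.g., 2000/30 and 2000/80).
    for tr, te in ((600, 20), (2000, 30), (2000, 80)):
        print(f"{tr}/{te}: {classify_spin_echo(tr, te)}")
```

In practice the TR and TE printed on the image are best used together with the tissue appearances listed in the quick-identification notes, because fast spin-echo and other variants do not follow these cutoffs strictly.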



Gadolinium Enhancement


Gadolinium-based contrast material is used in conjunction with T1WI sequences (gadolinium shortens the T1) and, with the dose used, it has little effect on T2WI. The advantages of contrast enhancement are increased lesion conspicuity and improved delineation of the margins of a mass relative to the lower signal of muscle, bone, vessel, or globe.5 However, gadolinium enhancement (without concomitant fat suppression) has had limited usefulness within the head and neck, as well as in the orbit, because of the large amount of fat present within these regions (see Fig. 11-6D). After gadolinium injection, the signal increases within a lesion, often obscuring the lesion within the adjacent high signal intensity fat.6 Therefore, for head and neck imaging, gadolinium is optimally used with specific fat suppression techniques that turn fat dark or black. Gadolinium enhances normal structures including nasal and pharyngeal mucosa, lymphoid tissue in Waldeyer’s ring, extraocular muscles, and slow-flowing blood in veins, all of which may appear surprisingly bright, especially if combined with fat suppression techniques. (To quickly identify a gadolinium-enhanced T1WI: nasal mucosa is white, fat is white, and CSF and vitreous are black. Also look for Gd-DTPA printed directly on the image or on adhesive study labels.)



Fat Suppression Methods


Several sequences have been developed that suppress fat signal intensity. T2WIs, short TI inversion recovery (STIR), spectral presaturation inversion recovery (SPIR), and chemical shift selective presaturation (fat saturation) are some of the more common clinically available methods of fat suppression. One advantage of fat suppression is reduction or elimination of chemical shift artifacts by removing fat signal from the image while preserving water signal. Additionally, some fat suppression techniques take advantage of gadolinium enhancement by eliminating the surrounding high intensity signal from fat while retaining the high intensity enhancement produced by gadolinium. Most pathologic lesions have increased water content, and gadolinium exerts its paramagnetic effects while in blood vessels and in the increased extracellular fluid of the lesion, but gadolinium does not enhance fat. The fat signal can be manipulated in the following ways:


1. STIR (see Fig. 11-7E) provides reliable fat suppression over large body parts.7 The inversion time (e.g., TI = 140 msec) is individually “tuned” for each patient to place fat at the null point of signal intensity and thus eliminates fat signal by turning it completely black. STIR images show the mucosa, vitreous, and CSF as very high signal intensity. Most mass lesions in the head and neck have similar high signal intensity on STIR and T2WI. The disadvantages of STIR are image degradation secondary to a decreased signal-to-noise ratio, and increased vulnerability to motion artifacts including the vessel pulsations. Additional disadvantages of STIR, such as increased scan time and fewer slices, are circumvented by the recently available fast sequences. (To quickly identify a STIR image: fat is almost completely black; CSF, vitreous, and mucosa are very bright. A TI is listed with the TR and TE times on the image.)

2. Frequency selective presaturation sequences (see Fig. 11-7B) typically used with a spin-echo technique selectively suppress fat signal. (Note that for the remainder of this chapter, the terms fat suppression and fat saturation are used interchangeably and refer to frequency [chemical shift] selective presaturation techniques). T1-weighted fat saturation sequences take full advantage of gadolinium enhancement. A gadolinium-enhancing lesion within the head and neck retains its high signal intensity and is not obscured, because fat is suppressed to become low to intermediate signal intensity. Enhancing masses within the head and neck and orbit are particularly well imaged with this technique.8 Frequency selective fat suppression is also complementary for FSE T2WIs (see Figs. 11-7C and D). Fat-saturated T2WIs provide excellent fat suppression, optimizing the high signal from normal structures and lesions that are high in water content contrasted against a black background of fat. The disadvantages of fat saturation sequences are that non–gadolinium-enhancing lesions may be less well discriminated, that these sequences are more susceptible to artifacts, and that nonuniform fat suppression can occur. Also, fewer slices are acquired than with T1WI, unless the TR time is lengthened, which prolongs imaging time. (To quickly identify a gadolinium-enhanced T1WI with fat saturation: mucosa and small veins are white, fat is low to intermediate intensity, and CSF and vitreous are black.)
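The idea of placing fat at the null point in STIR (item 1 above) can be illustrated numerically. This sketch assumes the simplest inversion recovery model, in which longitudinal magnetization recovers as M0 × (1 − 2·exp(−TI/T1)) and the TR is long enough for full recovery between inversions; under that assumption the signal crosses zero at TI = T1 × ln 2. The T1 values used for fat and fluid are illustrative assumptions, not measured values.

```python
import math


def longitudinal_magnetization(ti_ms, t1_ms, m0=1.0):
    """Mz at time TI after a 180-degree inversion pulse.

    Simplified model that assumes full recovery between inversions
    (TR much longer than T1).
    """
    return m0 * (1.0 - 2.0 * math.exp(-ti_ms / t1_ms))


def null_inversion_time(t1_ms):
    """Inversion time at which tissue with the given T1 gives no signal."""
    return t1_ms * math.log(2.0)


if __name__ == "__main__":
    fat_t1_ms = 200.0     # assumed illustrative T1 of fat
    fluid_t1_ms = 3000.0  # assumed illustrative T1 of a watery fluid
    ti = null_inversion_time(fat_t1_ms)
    print(f"TI that nulls fat: {ti:.1f} msec")
    print("fat signal at that TI:",
          round(abs(longitudinal_magnetization(ti, fat_t1_ms)), 6))
    print("fluid signal magnitude at that TI:",
          round(abs(longitudinal_magnetization(ti, fluid_t1_ms)), 3))
```

With the assumed fat T1, the computed null point falls close to the example TI of 140 msec given above, while tissue with a much longer T1 retains a large signal magnitude at the same TI.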


Gradient Echo Techniques


Numerous gradient echo sequences are available that have a variety of applications. Gradient echo scans have a very short TR (30 to 70 msec), a very short TE (5 to 15 msec), and a flip angle of less than 90 degrees. They have a variety of proprietary acronyms, including GRASS, MPGR, SPGR, FLASH, and FISP. Gradient echo sequences take advantage of the phenomenon of flow-related enhancement. That is, any rapidly flowing blood appears extremely bright. These sequences are useful for localizing normal vessels, detecting obstruction of flow in compressed or thrombosed vessels, and showing vascular lesions that have tubular, linear, or tortuous bright signal representing regions of rapid blood flow (Fig. 11-8). Gradient echo sequences may be obtained faster than conventional spin-echo techniques, although their increased susceptibility to motion artifact decreases the benefits of a short scan time. Gradient echo techniques also permit volume, that is, three-dimensional versus two-dimensional acquisition of images, allowing increased spatial resolution and computer workstation reconstruction of any imaging plane at various slice thicknesses. The disadvantage of gradient echo sequences is the increased magnetic susceptibility artifact from bone or air, thus limiting their role near the skull base or paranasal sinuses. (To quickly identify a gradient echo image: arteries and often veins are white; fat, CSF, vitreous, and mucosa may have variable signal intensities depending on the technique used.)







Nuclear Medicine



Positron Emission Tomography


As opposed to the imaging modalities already discussed in this chapter, which provide detailed anatomic information, positron emission tomography (PET) imaging provides physiologic and biochemical data. A positron-emitting radiopharmaceutical is intravenously injected and its distribution in the body is measured. Positron-emitting radiopharmaceuticals can be developed from naturally occurring substances such as 15O water, 11C carbon monoxide, or 13N ammonia, or radioactive analogs of other biologic substances such as 18F-2-fluoro-2-deoxy-D-glucose (FDG). After being emitted from the atom, the positron travels in the tissue for a short distance until it encounters an electron and forms a positronium, which immediately annihilates (converts its mass to energy), forming two 511-keV photons. These annihilation photons travel away from each other at approximately 180 degrees and are picked up by the detectors placed around the patient. Simultaneous detection of these photons relates them to the same annihilation event and allows spatial localization. Annihilation coincidence detection can be accomplished by very expensive dedicated PET scanners, yielding superior spatial resolution and sensitivity. Less costly gamma camera-based hybrid systems allow utilization of PET imaging outside of academic centers.


Attenuation of the photons in tissues they travel through decreases the apparent activity picked up by the detectors. Attenuation correction methods provide improved anatomic detail and better lesion localization but they result in noisier images. The effect of attenuation correction on visual image quality is controversial and, in many centers, the images are generated both with and without attenuation correction. For semiquantitative and quantitative evaluation, however, attenuation correction is necessary.


Depending on the radiopharmaceutical chosen, PET imaging can provide information regarding blood flow, ischemia, deoxyribonucleic acid metabolism, glucose metabolism, protein synthesis, amino acid metabolism, and receptor status. Radiopharmaceutical development requires sophisticated knowledge and equipment which, combined with the very short half-life of most of these substances, limits clinical utility. The relatively long half-life of FDG (110 minutes) accounts for its widespread use. FDG can be delivered to PET imaging facilities through commercial vendors obviating the need for an on-site cyclotron.


Glucose metabolism in growing neoplastic cells is enhanced and accounts for the increased uptake on FDG-PET studies. Molecular studies have revealed that several genetic alterations responsible for tumor development also have direct effects on glycolysis. It has also been shown that increased tumoral FDG uptake is strongly related to the number of viable tumor cells, but not clearly associated with their proliferative rate. The glucose analog 2-deoxy-D-glucose is transported into the cell and metabolized in the glycolytic cycle. After phosphorylation with hexokinase to DG-6-phosphate, the compound is metabolically trapped in the cell. Because of this trapping mechanism, FDG concentration steadily increases in metabolically active cells, yielding a high contrast between tumor and normal tissue. Bear in mind that increased glucose metabolism is not unique to malignant cells and can be seen in benign tumors, inflammatory or infectious lesions, and even normal tissues. Also, some malignant cells may not have increased glucose metabolism for a variety of reasons.


A typical PET scan is started 30 to 60 minutes after the intravenous administration of approximately 10 mCi of 18F-FDG. A 6- to 12-hour period of fasting is required before injection. Patients are encouraged to drink water before the FDG injection to minimize collection in the urinary system. The patients are told not to speak or chew before PET scanning. Because normal FDG uptake in muscle may mimic tumor, muscle relaxants such as benzodiazepines are used in some centers. Scanning is performed in the supine position at multiple table positions to cover the entire body. Scan time is 30 to 60 minutes.
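The practical consequence of the 110-minute half-life of FDG can be sketched with the standard exponential decay relationship, A(t) = A0 × (1/2)^(t / half-life). The injected activity and timing below are taken from the protocol described above; the function name is hypothetical, and biologic clearance of the tracer is ignored.

```python
FDG_HALF_LIFE_MIN = 110.0  # physical half-life of FDG quoted in the text


def remaining_activity(initial_mci, elapsed_min,
                       half_life_min=FDG_HALF_LIFE_MIN):
    """Activity remaining after physical decay only.

    Excretion and redistribution of the tracer are not modeled.
    """
    return initial_mci * 0.5 ** (elapsed_min / half_life_min)


if __name__ == "__main__":
    injected_mci = 10.0
    for minutes in (0, 30, 60, 110, 120):
        activity = remaining_activity(injected_mci, minutes)
        print(f"{minutes:3d} min after injection: {activity:.2f} mCi")
```

This is the same calculation that makes commercial delivery of FDG feasible: a dose shipped for one half-life still retains half of its original activity on arrival.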


Qualitative evaluation of FDG-PET images is sufficient for most clinical purposes, but quantitative measurement of FDG concentration is possible. Several approaches of different complexity can be applied for this purpose. Some of these require complex computation, data acquisition, and arterial blood sampling during scanning. The most commonly used method, standardized uptake values (SUV), is simple and confined to the measurement of radioactivity concentrations at a single time point. The activity concentration is normalized to the body weight or body surface area. SUV may allow differentiation of malignant tissue from benign causes of increased uptake and can be used to measure the response to treatment. A downside of SUV calculation in therapy monitoring is that it only allows comparison of two measurements obtained at the same time point after tracer injection.
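The body weight form of the SUV can be written as a short sketch. It assumes the commonly used definition, in which the measured tissue activity concentration is divided by the injected activity per gram of body weight, and it assumes that the injected activity has already been decay corrected to the time of the scan and that 1 mL of tissue weighs about 1 g. The function name and units are illustrative choices.

```python
def suv_body_weight(tissue_kbq_per_ml, injected_mbq, body_weight_kg):
    """Standardized uptake value normalized to body weight.

    SUV = tissue concentration / (injected activity / body weight),
    with the injected activity assumed to be decay corrected to scan time.
    """
    injected_kbq = injected_mbq * 1000.0
    body_weight_g = body_weight_kg * 1000.0
    return tissue_kbq_per_ml / (injected_kbq / body_weight_g)


if __name__ == "__main__":
    # If the tracer were spread evenly through the body, SUV would be 1.
    print(suv_body_weight(5.0, injected_mbq=350.0, body_weight_kg=70.0))
    # A focus holding five times that concentration.
    print(suv_body_weight(25.0, injected_mbq=350.0, body_weight_kg=70.0))
```

Because the value depends on when the measurement is made after injection, serial studies are comparable only if the uptake interval is kept the same, as noted above.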


A major disadvantage of PET is lack of anatomic information, resulting in poor lesion localization. A number of software applications are used to “fuse” PET images with CT or MR images, which are obtained at different time points. Fusion of anatomic and functional images significantly improves lesion localization, but it is still subject to many technical difficulties and errors. Combined PET/CT units permit acquisition of both CT and PET images using a single piece of equipment in the same session without the need to move the patient. Errors in lesion localization are minimized, although they do occur in certain body regions where physiologic or involuntary motion is unavoidable.


Another major limitation of PET is poor spatial resolution. Currently, the best spatial resolution of dedicated PET scanners is about 5 to 6 mm. It is substantially inferior for more commonly used hybrid scanners. PET is an evolving technology and improvements in spatial resolution will surely be accomplished. Because of fundamental limitations inherent to the method, however, the best achievable spatial resolution is 1 to 2 mm. Therefore PET is incapable of showing microscopic disease.



Radionuclide Imaging


Scintigraphy has several applications in the head and neck. In salivary gland imaging, technetium-99m (99mTc)-pertechnetate imaging may be useful for assessing salivary gland function in autoimmune and inflammatory disease of the salivary glands. If the salivary glands are obstructed, the degree of obstruction, as well as the follow-up of obstruction after treatment, can be assessed. In evaluating neoplasms of the salivary glands, the findings of the 99mTc-pertechnetate scan are almost pathognomonic of Warthin’s tumor and oncocytoma. Spatial resolution is limited to approximately 1.5 cm, so accurate localization of the mass within the gland is difficult. Single photon emission computed tomography (SPECT) may be useful in some cases.


Techniques of thyroid imaging and thyroid therapy are described in several textbooks.9,10 Many centers use I-123 to obtain a thyroid uptake determination, and 99mTc-pertechnetate is used to obtain whole gland images. It is these images that determine whether thyroid nodules are “hot” or “cold.” I-131 is used for therapy of hyperthyroidism and in follow-up to detect and treat residual, recurrent, and metastatic thyroid cancers.


Medullary carcinoma of the thyroid is difficult to visualize, but 99mTc-DMSA has been used. In-111 pentetreotide has been used with some success.


Identification of parathyroid adenomas has been done for several years with a subtraction technique using 99mTc-pertechnetate and Tl-201 (Fig. 11-9). The basis of this test is that thallium is taken up by both thyroid tissue and parathyroid tissue, whereas of the two only thyroid tissue takes up 99mTc-pertechnetate. Therefore the subtraction of the 99mTc-pertechnetate image from the thallium-201 image should leave only parathyroid tissue. The sensitivity of this technique is believed to be very high for lesions larger than 1 g. Sensitivity decreases for smaller lesions, and the subtraction technique can be hampered by patient motion. 99mTc-sestamibi is now the favored agent in many institutions. A double-phase imaging protocol is employed with improved identification of parathyroid adenomas.
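The subtraction step can be illustrated with a toy example. This is only a sketch under stated assumptions: the images are small lists of counts that are already registered to each other, the scaling of the pertechnetate image to the thallium counts in a thyroid-only region is a normalization choice introduced here, and all names and numbers are hypothetical.

```python
def subtract_images(thallium, pertechnetate, thyroid_region):
    """Return thallium minus scaled pertechnetate, clipped at zero.

    thallium, pertechnetate: 2-D lists of counts with identical shape.
    thyroid_region: (row, col) pixels assumed to contain only thyroid tissue.
    """
    tl_sum = sum(thallium[r][c] for r, c in thyroid_region)
    tc_sum = sum(pertechnetate[r][c] for r, c in thyroid_region)
    if tc_sum == 0:
        raise ValueError("no pertechnetate counts in the thyroid region")
    scale = tl_sum / tc_sum  # match thyroid counts between the two images
    return [
        [max(0.0, tl - scale * tc) for tl, tc in zip(tl_row, tc_row)]
        for tl_row, tc_row in zip(thallium, pertechnetate)
    ]


# Toy 3 x 3 images: pixel (1, 2) holds thyroid plus a hypothetical adenoma.
thallium = [[0, 40, 0], [0, 40, 90], [0, 0, 0]]
pertechnetate = [[0, 80, 0], [0, 80, 80], [0, 0, 0]]
residual = subtract_images(thallium, pertechnetate, [(0, 1), (1, 1)])
```

In this toy case only the pixel containing the extra thallium uptake survives the subtraction. Any misregistration between the two images, as from patient motion, would leave spurious residual counts.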



CSF leaks can be detected with In-111 DTPA placed into the subarachnoid space. This technique is described and illustrated in Chapter 44.



Three-Dimensional Reconstruction Techniques


Image data from either CT or MRI can be processed to create three-dimensional reconstructions. The state-of-the-art picture archiving and communication systems (PACS) that are now available in most institutions obviate the need for a separate workstation to perform these reconstructions.


CT data are loaded as a stack of contiguous two-dimensional slices that define the scanned volume. Reconstructions are created either by choosing a specific range of densities for display or by manually tracing the outline of the desired structure. Improvements offered by multislice-multirow CT scanners and enhanced computational capacity of imaging workstations have led to a paradigm shift in radiology; volume imaging has replaced axial imaging. CT data from a large body part can be gathered in a very short time as a whole and the obtained “volumetric” data set can be displayed in various planes and as three-dimensional reconstructions.
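Selecting a range of densities from the slice stack amounts to a simple threshold operation. The sketch below is illustrative only: the volume is a tiny nested list of Hounsfield unit values, and the 300 to 3000 HU window used to pick out bone is an assumed example range, not a recommended setting.

```python
def threshold_volume(slices, lower_hu, upper_hu):
    """Binary mask: True where lower_hu <= voxel value <= upper_hu."""
    return [
        [[lower_hu <= value <= upper_hu for value in row] for row in image]
        for image in slices
    ]


def count_selected(mask):
    """Number of voxels that fall inside the chosen density range."""
    return sum(flag for image in mask for row in image for flag in row)


# Toy volume: 2 slices of 2 x 3 voxels (air, soft tissue, and bone values).
volume = [
    [[-1000, 40, 700], [-1000, 60, 1200]],
    [[-1000, 35, 45], [900, 50, -1000]],
]
bone_mask = threshold_volume(volume, 300, 3000)  # assumed bone window
bone_voxels = count_selected(bone_mask)
```

A surface-rendering step would then build the three-dimensional model from the voxels flagged in the mask. Manual tracing replaces the threshold when the structure of interest cannot be separated by density alone.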


Magnetic resonance data for image analysis are best acquired using a “volume acquisition” method, in which data are acquired as a complete three-dimensional block rather than as individual slices. Because volume acquisition takes longer, gradient echo techniques are usually required to reduce the imaging time. Once acquired, the data are displayed in any desired plane and, by selecting a range of signal intensities or by tracing specific structures with a cursor, three-dimensional surface models are created.


The utility of three-dimensional reconstruction is best appreciated with craniofacial reconstructions.11,12 Directly visualizing the three-dimensional relationships of the facial structures aids surgical planning. Instructors find three-dimensional models of the face and orbital structures useful for teaching medical students, residents, and anatomy students. Virtual endoscopy is a computer-generated simulation of the endoscopic perspective. Virtual endoscopic images of the trachea, larynx, pharynx, nasal cavity, paranasal sinuses, and ear have demonstrated clinical utility (Fig. 11-10).



At present, the spatial resolution of CT is superior to MRI in the head and neck for displaying bony relationships. However, MRI provides a superior display of transcranial soft tissue structures, such as the entire visual pathway, and has better tissue contrast resolution than CT. Thus CT and MRI will likely have complementary roles in three-dimensional image display.



Applications of CT, MRI, and Ultrasound in the Head and Neck


Each anatomic region requires a different imaging approach to optimize the detection and characterization of the structure or lesion of interest. The following is a description of the indications for using CT, MRI, or ultrasound in specific head and neck regions, plus a general imaging approach relevant to each anatomic region in terms of imaging planes, slice thickness, contrast agents, and pulse sequences. Whenever possible, CT and MRI are performed before biopsy or resection of lesions because the resulting edema may obscure the true margins of a mass.



Application of Computed Tomography in Head and Neck Region


Multichannel CT scanners revolutionized head and neck imaging. The entire neck can be scanned in less than a minute at a slice thickness of less than 1 mm. These data can then be reconstructed in any plane with a desired slice thickness. This has obviated the need for site-specific imaging protocols. A typical neck CT using a multidetector scanner employs 1-mm slice thickness and a pitch of approximately 1, with contiguous axial scanning performed from the sella turcica down to the thoracic inlet. Then, typically 3-mm-thick axial, sagittal, and coronal images are reconstructed for viewing. The use of intravenous contrast is critical for interpretation of the study. Without contrast, determination of extent of disease and vascular invasion or compression, and discrimination of vessels from nodes and small muscle bundles, can be extremely difficult (see Figs. 11-3 and 11-4). Evaluation of the normal mucosa-submucosa interface and mucosal tumors cannot be accomplished without contrast enhancement. Optimally, contrast should be present in both arteries and veins during image acquisition. Also, enough contrast should be allowed to diffuse from vessels to the tissue interstitium for tumors to enhance. This is particularly important for high-end multidetector CT scanners, which tend to finish image acquisition before optimal tumor enhancement is achieved, unless a delay between injection of contrast and scanning is employed. To maintain good opacification of vessels after this delay, a biphasic contrast injection scheme is used. The delay time and the rate at which contrast material is injected vary depending on the specifications of the scanner. Contrast is best administered with a mechanical pump infusion, although a drip-infusion technique may also be effective. Frequently, image reconstruction using a soft tissue algorithm suffices. If a suspicion of bone erosion or destruction by tumor or inflammation exists, sections of the skull base and mandible need to be reconstructed using a bone algorithm.





Salivary Glands


Dental amalgam can cause significant streak artifacts that obscure the parotid or submandibular gland parenchyma. If the dental work is identified on the lateral scout view (scanogram), dental artifacts can usually be avoided if an oblique semiaxial projection is chosen with the scanner gantry angled in a negative direction (between a coronal and an axial plane), thus avoiding the teeth. This plane has the advantage of visualizing both parotid and submandibular glands in the same slice and is parallel to the posterior belly of the digastric muscle.13 Contrast administration is required for both neoplastic and inflammatory conditions of the salivary glands. Enhancing intraglandular vessels may mimic or obscure small stones; thus, a precontrast scan is also advised in cases of suspected sialolithiasis. The CT attenuation of a normal parotid gland is variable depending on the proportion of fat and glandular tissue present, which varies with age. Submandibular glands have a more predictable attenuation that is similar to that of muscle. Any difference in attenuation values between the right and the left submandibular glands should raise suspicion for an obstructing lesion such as a floor of mouth cancer.




Larynx and Infrahyoid Neck


Laryngeal and infrahyoid neck CT is most commonly requested to evaluate squamous cell carcinoma of the larynx or hypopharynx, associated cervical lymph node metastasis, trauma, and inflammation. The fine detail of the larynx and vocal cords requires thinner reconstructions than the routine 3-mm sections. Sections through the vocal cords are optimally reconstructed parallel to the plane of the true vocal cords. Because assessment of vocal cord mobility is important in staging glottic carcinoma, various provocative techniques may facilitate laryngeal imaging in those cases where the vocal cords are obscured on physical examination. Quiet breathing places the cords in a partially abducted position. By having the patient blow through a straw or do a modified Valsalva maneuver (puffing out the cheeks) the hypopharynx and supraglottic larynx can be distended, allowing better separation of the aryepiglottic folds from the hypopharynx, while simultaneously abducting the cords (see Fig. 11-3). The vocal cords can be assessed during phonation (“eeee”), which causes the cords to adduct and move to a paramedian position (see Fig. 11-3). Breath holding will also adduct the vocal cords, close the glottis, and significantly reduce motion artifacts. By scanning the larynx twice, once to adduct and a second time (sections limited to the glottis) to abduct the vocal cords, the radiologist can assess vocal cord motion and identify fixation. Evaluation of laryngeal trauma may not require intravenous contrast. Bone windows are helpful for assessing cartilage fractures or tumor erosion.




Paranasal Sinuses


Paranasal sinus CT can be approached in several ways depending on the anticipated disease process. A screening axial sinus NCCT (Fig. 11-12A) is clearly superior to conventional radiographs and provides information on specific sinus involvement by inflammatory processes as well as better delineation of bony sclerosis or destruction. One method is to use 5-mm-thick sections obtained at 10-mm intervals (5-mm gap), which can cover the entire paranasal sinuses with six to eight slices. The technical parameters can be optimized to achieve a low radiation exposure that is similar to radiographs.
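The coverage of such a gapped screening protocol follows from simple arithmetic, sketched below. The function names are hypothetical, and the calculation assumes evenly spaced slices with coverage measured from the top of the first section to the bottom of the last.

```python
def coverage_mm(n_slices, thickness_mm=5.0, interval_mm=10.0):
    """Craniocaudal extent spanned by n evenly spaced sections."""
    if n_slices < 1:
        raise ValueError("at least one slice is required")
    return (n_slices - 1) * interval_mm + thickness_mm


def sampled_fraction(thickness_mm=5.0, interval_mm=10.0):
    """Fraction of the scanned extent that is actually imaged."""
    return min(1.0, thickness_mm / interval_mm)


span_six = coverage_mm(6)      # six 5-mm sections at 10-mm intervals
span_eight = coverage_mm(8)    # eight 5-mm sections at 10-mm intervals
fraction = sampled_fraction()  # half of the tissue lies in the gaps
```

With these parameters, six to eight sections span 55 to 75 mm while imaging only half of the tissue in that range, which is consistent with the low radiation exposure of the screening study.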



When endoscopic sinus surgery is anticipated, NCCT imaging of the sinuses is mandatory for preoperative evaluation of the extent of sinus disease, to detect anatomic variants, and for planning the surgical approach (see Fig. 11-12B). This study is done in the axial plane while the patient is supine, with thin sections reconstructed in sagittal and coronal planes. Frequently, only the bone algorithm with its edge enhancement properties is needed for evaluating the detailed anatomy of the ostiomeatal complex. Contrast administration is usually not necessary for routine sinusitis, although when severe nasal polyposis is suspected, contrast may be useful to demonstrate the characteristic “cascading” appearance of the enhancing polyps or to characterize an associated mucocele. A soft tissue algorithm with soft tissue windows may be useful when using CECT for intracranial complications from sinus inflammatory processes. A nasal decongestant may be used to decrease normal but asymmetric nasal mucosal congestion (the normal nasal mucosal cycle) and thereby help distinguish it from a mucosal-based mass.


The assessment of sinus tumors requires the most detailed imaging. Enough of the intracranial compartment and the parapharyngeal, masticator, and buccal spaces should be included in the field of view to allow assessment of tumor extension. At least lymph node levels I and II should be visualized. For an optimal study, both soft tissue and bone algorithms are used. This allows differentiation of the soft tissue component as well as evaluation of subtle bony destruction (see Figs. 11-12A and C). The coronal plane is best for evaluating the cribriform plate. CECT is used to maximize the enhancement characteristics of the tumor and differentiate it from adjacent soft tissue structures.





Application of Magnetic Resonance Imaging in Head and Neck Region



Suprahyoid Neck


MRI is ideally suited for imaging the suprahyoid neck (including nasopharynx, oropharynx, oral cavity, and tongue). The standard head coil which has a high signal efficiency will permit visualization of the suprahyoid neck structures caudally down to approximately the level of the inferior margin of the mandible and floor of mouth. For imaging the oral cavity, floor of mouth, submandibular space, and cervical lymph node chains, a head coil will not suffice. Either an anterior or volume neck coil is needed to visualize the entire neck from the skull base to the thoracic inlet (from dura to pleura). Several pulse sequences and imaging planes using 5-mm thick sections are required to adequately assess the deep and superficial structures of the neck. (Implicit in this discussion of MRI technique for all areas of the head and neck is the fact that a sagittal T1WI is obtained as the initial sequence in all of the authors’ studies and is used primarily as a scout view for the proper positioning in other imaging planes, as well as for anatomic information.) A precontrast axial T1WI, and often a coronal T1WI, are required to optimally assess fat planes in the neck. Fat provides an excellent white background from which muscle and fascial planes, bone, sinus, and vascular structures can easily be discriminated. The coronal plane is particularly useful for visualizing the relationships of the suprahyoid neck structures to the skull base and also for delineating the anatomy of the tongue and floor of mouth. A T2WI or STIR, usually obtained in the axial plane, is required to detect structures with a long T2 (e.g., tumors, cysts, lymphadenopathy, edema) that appear brighter than the background muscle and fat. Postgadolinium T1WIs with fat suppression in the axial and coronal plane are frequently helpful to discriminate the enhancing margins of a lesion or to detect perineural spread of tumor. The T2WI may also be combined with fat suppression to optimize the information obtained.







Paranasal Sinuses


Sinus MRI is primarily indicated for evaluating sinus tumors (and occasionally inflammatory disease such as a mucocele) and may be accomplished with a standard head coil, using 3- to 5-mm slices. The principal value of MRI over CT for sinus tumors is the ability of MRI to distinguish between tumor and obstructed sinus secretions and to predict the true extent of the tumor. A precontrast sagittal, axial, or coronal T1WI will provide a good demonstration of the sinuses, nasal cavity, cribriform plate, masticator and parapharyngeal spaces, and orbits. Sinus secretions with low protein content show low T1 and high T2 signal. With increasing protein content the T1 signal increases and T2 signal decreases. Very high concentrations of protein result in low signal on both pulse sequences and may occasionally mimic normally pneumatized sinus. No enhancement of the sinus secretion is seen on postcontrast images except for the mucosa of the sinus whereas tumors often enhance and show intermediate T2 signal allowing differentiation of postobstructive secretions from tumor. Fat saturation T1WI is recommended to better define the sinus tumor margins when the tumor extends directly or by perineural spread beyond the sinus into the anterior cranial fossa, orbit, parapharyngeal space, or pterygopalatine fossa. The sagittal and coronal planes are very helpful for evaluating cribriform plate extension. The coronal and axial planes are best for orbital, cavernous sinus, pterygopalatine fossa, and parapharyngeal space spread.



Temporal Bone


MRI is the imaging standard for the detection of internal auditory canal (IAC), facial nerve canal, and jugular foramen lesions. MRI is useful, in combination with CT, for assessing expansile or destructive lesions of the temporal bone and external auditory canal. A standard head coil is adequate for most temporal bone lesions, but a smaller 5- to 10-cm surface coil may provide improved resolution for certain regions at the expense of coverage. The small size of the temporal bone structures and their respective lesions requires high spatial resolution images, which may be accomplished by using thinner slices of 0.5 to 3.0 mm (preferably without an interslice gap), smaller surface coils (higher signal-to-noise ratio), volume acquisition, or T1WI (higher signal-to-noise ratio). Precontrast T1WI in the sagittal and axial planes is useful for defining anatomy and for detection of high-signal lesions such as fat, methemoglobin, and viscous or proteinaceous cysts. Postgadolinium T1WIs (without or with fat saturation) in the axial and coronal planes are essential for detecting small enhancing lesions and determining the extent of larger lesions. In fact, for routine evaluation of a suspected acoustic schwannoma, only a postgadolinium axial and coronal T1WI study may be required. T2WIs are frequently unnecessary for IAC tumors but may be helpful when brainstem ischemic or demyelinating disease, meningioma, blood products, proteinaceous secretions, or a large destructive tumor is suspected or is being further evaluated after a preliminary temporal bone CT. A facial nerve lesion in the mastoid segment of the facial nerve canal is best evaluated for proximal and distal extension using a surface coil with sagittal and coronal pregadolinium and postgadolinium T1WIs.


