Chapter 31 Genetic Mechanisms of Retinal Disease
The purpose of this chapter is to provide an overview of concepts underlying our current understanding of the genetic basis of inherited retinal diseases (iRDs). iRDs are perhaps the best understood of human hereditary disorders. In part this is because diseases that affect vision are easily recognized and the retina is an accessible and well-characterized tissue. In many ways, though, we are still at an early stage of understanding the causes and consequences of these diseases. In fact, the causes of iRDs are highly varied: many different types of retinal disease are known, many different genes are involved, and there may be dozens of disease-causing mutations reported within a single gene. For example, currently at least 220 genes are known which can cause one or another form of retinal disease,1 and over 5000 mutations have been reported, in total, in these genes.2 In spite of the underlying complexity, it is now possible to identify the disease-causing gene and mutation, or mutations, in a substantial fraction of affected individuals and families.3,4
A useful concept in medical genetics is the distinction between single-gene diseases and multifactorial diseases. Inherited diseases such as retinitis pigmentosa (RP) are considered to be single-gene because there is a specific, underlying cause in each affected individual, that is, an inherited difference in DNA sequence that has a direct cause-and-effect relationship to the disease. There may be one DNA difference for dominant diseases, or two for recessive diseases, but only one gene is involved. These are also referred to as monogenic or Mendelian diseases. In contrast, for diseases such as age-related macular degeneration (AMD), genetic differences play a role in lifetime risk and/or clinical expression, but the differences are merely contributory and do not have a clear cause-and-effect relationship to the disease. These are “multifactorial” diseases because multiple factors, genetic, environmental, and stochastic, play a role in determining who is affected and who is not.
Therefore the cause of disease in an individual with an inherited condition such as RP is “simple,” in the sense that only one gene is affected (and usually affected in an obvious way), whereas there may be multiple contributory factors in an individual with AMD and the differences may be subtle. We already know of exceptions to this rule – for example, there are digenic forms of RP with two affected genes5 – but the exceptions are rare.
This chapter focuses on genetic differences that are single-gene in nature and have a direct cause-and-effect relationship with disease, that is, inherited diseases of the retina. Genetic factors contributing to AMD are discussed in Chapter 64 (AMD: Etiology, genetics, and pathogenesis).
Basic concepts in human genetics
Figure 31.1 shows pedigrees illustrating autosomal dominant, autosomal recessive, and X-linked recessive inheritance (see Nussbaum et al.6 for details).
Fig. 31.1 Pedigrees illustrating autosomal dominant, autosomal recessive, and X-linked recessive inheritance.
iRDs follow textbook patterns of Mendelian inheritance: autosomal dominant, autosomal recessive or X-linked. However, real families are often more complicated, especially for late-onset, progressive forms of retinal disease. This section reviews the conventional modes of inheritance and possible complexities.
Autosomal dominant inheritance
Autosomal dominant inheritance occurs when a single copy of a mutation on an autosomal chromosome is sufficient to cause disease. That is, an affected individual is heterozygous for the mutation. Diseases caused by dominant mutations pass from generation to generation, i.e., most families have affected individuals in multiple generations. Males are as likely to be affected as females and approximately 50% of children of an affected individual will be affected. Forms of retinal disease that are often autosomal dominant include maculopathies such as Best disease.
Two phenomena that can confuse the picture of autosomal dominant disease are variable expression and incomplete penetrance.
Variable clinical expression means that individuals with the same mutation may vary in onset, progression, or severity of disease or, in some cases, may have distinctly different clinical findings. Autosomal dominant RP is notoriously variable in expression. For example, mutations in one autosomal gene, PRPH2 (also known as RDS), can cause dominant RP, dominant macular degeneration, or dominant panretinal maculopathy, even among members of the same family.7–11
Variable expression is a problem in determining mode of inheritance because some individuals may not show symptoms until late in life, and individuals with different symptoms may be diagnosed with different diseases.
Incomplete penetrance, or nonpenetrance, means that some individuals with a disease-causing mutation will not be affected. For instance, 20% of individuals with a dominant-acting mutation in PRPF31 will have normal vision by age 60 even though relatives with the same mutation may have RP by age 20.12–15 One indicator of nonpenetrance in a multigenerational family is a “skipped generation,” that is, an unaffected individual with an affected parent and an affected child. This is often seen in families with PRPF31 mutations.
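The "skipped generation" indicator can be expressed as a simple search over a pedigree. The following is a minimal sketch, assuming a hypothetical pedigree encoding (a dictionary mapping each child to a pair of parents, plus a set of affected individuals); the identifiers such as "II-1" are illustrative only and are not taken from any real family.

```python
def skipped_generation(parents, affected):
    """Return unaffected individuals who have an affected parent
    and an affected child (a possible sign of nonpenetrance).

    parents  -- dict mapping child id -> tuple of parent ids (hypothetical encoding)
    affected -- set of ids of clinically affected individuals
    """
    candidates = set()
    for child, child_parents in parents.items():
        if child not in affected:
            continue
        for parent in child_parents:
            if parent in affected:
                continue
            # The unaffected parent must in turn have an affected parent.
            grandparents = parents.get(parent, ())
            if any(gp in affected for gp in grandparents):
                candidates.add(parent)
    return candidates


# Three-generation example: I-1 and III-1 are affected, II-1 is not.
pedigree = {"II-1": ("I-1", "I-2"), "III-1": ("II-1", "II-2")}
print(skipped_generation(pedigree, {"I-1", "III-1"}))
```

An individual flagged in this way is only a candidate for nonpenetrance; late onset or an incomplete clinical examination would produce the same pattern.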
Although variable expression and incomplete penetrance are seen as distinct phenomena, they are actually part of a continuum, with nonpenetrance just the extreme. The difference between late onset and no onset may simply be the age of the patient when examined. Whatever the terminology, the underlying finding is that dominant retinal disease mutations may have highly variable consequences, confounding diagnosis.
Autosomal recessive inheritance
Autosomal recessive inheritance occurs when both copies of an autosomal gene must be affected to cause disease. An affected individual can be either homozygous for a single mutation or heterozygous for two distinct mutations. An individual with two distinct recessive mutations is also called a compound heterozygote. Note that a pair of recessive mutations must be on opposite chromosomes. If two variants are in the same gene on the same chromosome, they are in cis to each other; if they are on opposite chromosomes they are in trans. Recessive mutations must be in trans.
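The cis/trans distinction can be illustrated with a short sketch. It assumes that the pathogenic variants in one recessive gene have already been phased, that is, assigned to the maternal or paternal copy of the chromosome; the variant names "m1" and "m2" are placeholders, and the function is illustrative rather than a clinical classification rule.

```python
def recessive_genotype(maternal, paternal):
    """Classify a genotype at one recessive gene from phased variants.

    maternal, paternal -- sets of pathogenic variant names found on the
    maternally and paternally inherited copies of the gene.
    """
    if maternal and paternal:
        # Both copies carry a mutation: the mutations are in trans.
        if maternal == paternal:
            return "homozygous"
        return "compound heterozygote (in trans)"
    if maternal or paternal:
        # Only one copy is affected, however many variants it carries.
        if len(maternal | paternal) > 1:
            return "carrier (variants in cis)"
        return "carrier"
    return "no pathogenic variants"


print(recessive_genotype({"m1"}, {"m2"}))
print(recessive_genotype({"m1", "m2"}, set()))
```

The second call shows why phase matters: two variants on the same chromosome leave the other copy of the gene intact, so the individual is a carrier rather than affected.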
Examples of autosomal recessive retinopathies include Leber congenital amaurosis and Usher syndrome.
Unless one of the two mutations in a recessive case is a new mutation, the parents must be carriers of the mutation or mutations, that is, they must be heterozygous. Carriers are usually not affected. Approximately one-fourth of children of carrier parents are affected and one-half of children are carriers. Many recessive cases are isolated or simplex cases, i.e., one affected family member only. Families with multiple affected sibs are “multiplex.”
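The one-fourth and one-half figures follow from enumerating the four equally likely combinations of parental alleles. The sketch below does this for any single autosomal gene; the allele symbols ("A" for the normal allele, "a" for the recessive mutation) are notational assumptions, and the calculation ignores new mutations and nonpenetrance.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_genotypes(parent1, parent2):
    """Expected fraction of each offspring genotype for one autosomal gene.

    Each parent is a two-character string of alleles, e.g. "Aa";
    each allele is assumed to be transmitted with probability 1/2.
    """
    counts = Counter(
        "".join(sorted(pair)) for pair in product(parent1, parent2)
    )
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Two unaffected carriers of a recessive mutation "a".
carrier_cross = offspring_genotypes("Aa", "Aa")
print(carrier_cross)  # aa (affected) 1/4, Aa (carrier) 1/2, AA 1/4
```

The same function reproduces the dominant case: `offspring_genotypes("Dd", "dd")`, with "D" a dominant mutation, gives one-half heterozygous (affected) children.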
Finally, in consanguineous families with marriage between relatives, an identical recessive mutation may be passed to multiple family members. Affected individuals may occur in more than one generation and in more than one branch of these families. Two identical mutations that derive from a recent ancestor are identical by descent (IBD). Marriage between relatives is more common in some cultures than others, hence IBD inheritance of retinal diseases is more frequent in those societies.
Because carriers are not self-evident, the mode of inheritance is often hard to assign in recessive families.
X-linked or sex-linked inheritance
X-linked or sex-linked inheritance occurs when a single mutation on the X chromosome causes disease. Males, who are hemizygous for the X chromosome, are always affected, often severely. For many inherited diseases, female carriers of an X-linked mutation are not affected. Since females have two Xs, this implies that most X-linked mutations will be recessive in females. For a truly recessive X-linked mutation, one-half of the sons of a carrier female are affected, one-half of her daughters are unaffected carriers, and none of the sons of an affected male are affected, although all of his daughters are carriers. This produces a notable pattern of inheritance, with the salient feature that male-to-male transmission of an X-linked mutation is not possible.
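The X-linked segregation pattern can be checked by enumerating the gametes of each parent. This is a minimal sketch for a truly recessive X-linked mutation; the symbols "X" (normal X), "x" (X carrying the mutation), and "Y" are notational assumptions, and the model deliberately ignores X-inactivation and new mutations.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def x_linked_offspring(mother, father):
    """Expected fractions of children by sex and status.

    mother -- tuple of her two X alleles, e.g. ("X", "x") for a carrier
    father -- tuple of his X allele and "Y", e.g. ("X", "Y")
    """
    counts = Counter()
    for from_mother, from_father in product(mother, father):
        if from_father == "Y":
            # Sons are hemizygous: their single X comes from the mother.
            status = "affected" if from_mother == "x" else "unaffected"
            counts[("son", status)] += 1
        else:
            n_mutant = (from_mother, from_father).count("x")
            status = {0: "non-carrier", 1: "carrier", 2: "affected"}[n_mutant]
            counts[("daughter", status)] += 1
    total = sum(counts.values())
    return {child: Fraction(n, total) for child, n in counts.items()}


carrier_mother = x_linked_offspring(("X", "x"), ("X", "Y"))
affected_father = x_linked_offspring(("X", "X"), ("x", "Y"))
print(carrier_mother)
print(affected_father)
```

The fractions are taken over all children, so an entry of 1/4 for affected sons of a carrier mother corresponds to one-half of her sons. No affected sons appear among the children of an affected father, which is the absence of male-to-male transmission.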
The disease status of female carriers is more complex, though. Although females have two Xs, one of the Xs, selected at random in each cell, is inactivated in most tissues. This is X-inactivation or lyonization, named for Mary Lyon, who first described the phenomenon.16,17 Lyonization increases the likelihood that a female carrier will be affected since some cells will express only the mutant protein. In fact, many female carriers of X-linked RP mutations show clinical symptoms. Females are less severely affected than males with the same mutation, but female carriers of X-linked RP mutations may have significant loss of vision by midlife or earlier.18–22
One consequence of clinical disease in carrier females is that families with X-linked RP may appear to have autosomal dominant RP if several females are affected.23 This is an example of complexities that arise in determining the mode of inheritance of iRDs.
Isolated cases
Isolated cases deserve an entry of their own because the mode of inheritance is often obscure. A practical definition of an isolated case is an affected individual with no affected first-degree relatives (parents, sibs, and children) and no reports of more distant affected relatives. One immediate concern is that there may be other affected family members but the person describing the family is unaware or uninformed. Clinical examination of first-degree relatives is often revealing.
Assuming a case is genuinely isolated, there are several possibilities. The most likely prospect is that this is an autosomal recessive case and the parents are carriers. Or, perhaps one parent is a carrier but the other mutation is de novo (new). Alternatively, this may be a new dominant-acting or X-linked mutation. Another possibility is autosomal dominant or X-linked inheritance with nonpenetrance in prior generations. Ultimately, for most isolated cases the mode of inheritance will have to be determined at a molecular level by genetic testing.
Digenic and polygenic inheritance
Nearly all iRDs are monogenic, with only one gene affected per person. This is based on empirical observation, but it may be misleading since more complex forms of inheritance are hard to prove. Two counterexamples are known for iRDs. First, one form of RP is caused by a combination of one mutation in the PRPH2 (RDS) gene and another mutation in the ROM1 gene.5,22 These two mutations are benign alone but pathogenic in combination. This is digenic inheritance. Second, Bardet–Biedl syndrome (BBS), a form of RP combined with congenital abnormalities, is in most instances a recessive disease with mutations in any one of at least 15 known BBS genes.1,24 Some cases of BBS, though, require a third mutation in a second BBS gene for disease expression.25,26 This is called trigenic or triallelic inheritance. Whether these examples of polygenic inheritance of iRDs are just rare anomalies or hint at greater complexity of retinal diseases is unclear.
Chromosomes are dark-staining bodies seen in the nucleus of dividing eukaryotic cells. In diploid organisms, such as humans, the earliest diploid cell before division results from fusion of a haploid cell from the male parent and a haploid cell from the female parent. That is, the first human cell has 23 pairs of chromosomes, or 46 chromosomes in total (2n, where the haploid number n is 23), and derives from fusion of a haploid sperm and haploid ovum. This is the primary germline cell and contains the germline genetic information in the nucleus. All subsequent cells, known as somatic cells, contain a nearly perfect copy of the original chromosomes and genetic information. Exceptions in humans are sperm- and ovum-producing cells (also known as germline cells) which produce haploid cells, and certain blood cells that do not contain a nucleus.
Eukaryotic chromosomes have been referred to as “information-carrying organelles” because they are highly structured, highly compressed complexes of proteins, RNAs, DNA, and other factors, with the primary function of transmitting genetic information from one generation to the next, or from a parent cell to a daughter cell. However, at the heart of each chromosome is a single, double-stranded DNA molecule. DNA length is measured in basepairs (bp): each single strand of DNA is composed of nucleotide bases, and each base interacts (pairs) with an alternate base in double-stranded DNA, so bp are the natural units. DNA is also measured in kilobases (kb), megabases (Mb), and gigabases (Gb). The DNA molecule within a chromosome may be hundreds of Mb in length. This is, by far, the largest single biomolecule known. One reason for the chromosomal superstructure may simply be to keep this giant molecule intact. However, chromosomes also participate directly in DNA duplication and expression.
DNA, RNA, and proteins
Figure 31.2 shows the steps in DNA duplication, RNA transcription, and protein synthesis.27
Fig. 31.2 Steps in DNA duplication, RNA transcription, and protein synthesis.
(Reproduced from Nussbaum RL, McInnes RR, Willard HF. Thompson and Thompson’s genetics in medicine, 7th ed. Philadelphia, PA: Saunders Elsevier; 2007, p. 31, with permission from Elsevier.)
DNA is deoxyribonucleic acid, a linear molecule composed of four monomers: adenine (A), thymine (T), guanine (G), and cytosine (C). Two antiparallel DNA strands pair through hydrogen bonds to form a double-stranded molecule which carries genetic information.
RNA is ribonucleic acid, a linear molecule, like DNA, composed of adenine, uracil (U), guanine, and cytosine. RNA is single-stranded in most circumstances but it can form complex folded shapes by pair bonding within the linear strand. Messenger RNA (mRNA) transfers genetic information within cells, but other RNA molecules play diverse roles in several biological processes.
Proteins, composed of various combinations of 20 amino acids, are linear molecules which can fold into many shapes, and which play essential and highly diverse roles in all biological processes.
The flow of genetic information from DNA to RNA to protein is called the central dogma, following the landmark explanation of DNA structure and function by Watson and Crick in 1953, and the subsequent unraveling of the genetic code over the next decade.28,29 DNA is composed of a sugar-phosphate backbone with nucleotide bases, A, T, G, or C, in linear array along the backbone. The backbone is conventionally drawn from the 5’ end, which carries a phosphate, to the 3’ end, which carries a hydroxyl. The opposite strand forms by pairing of cognate bases, A to T and G to C, on the parent strand. The opposite strand naturally aligns in a helical, antiparallel fashion, running 3’ to 5’. This arrangement essentially explains inheritance in all living things.
In DNA duplication, the two antiparallel strands unwind, and a nearly exact antiparallel copy is synthesized on each single strand. The principal enzyme involved is DNA polymerase, but additional enzymes are involved in unwinding, patching, and repairing the DNA. DNA duplication occurs in the nucleus of cells only.
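Base pairing and antiparallel orientation together determine the sequence of the newly synthesized strand. As a minimal sketch, assuming sequences are written as plain strings in the 5’ to 3’ direction and contain only A, C, G, and T, the copy made on a template strand is its reverse complement:

```python
# Pairing rules: A with T, G with C.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand):
    """Return the antiparallel partner of a DNA strand.

    Both the input and the result are written 5' to 3', so the
    complemented sequence is reversed before it is returned.
    """
    return strand.translate(COMPLEMENT)[::-1]


template = "ATGC"
copy = reverse_complement(template)
print(copy)  # GCAT
```

Applying the function twice returns the original sequence, which mirrors the fact that each strand of the double helix fully specifies the other.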
In DNA-RNA transcription, the DNA strands unwind, and a single-stranded RNA molecule is synthesized as a short antiparallel copy of one of the DNA strands, pairing each DNA nucleotide with the corresponding RNA nucleotide. The primary enzyme involved is RNA polymerase, and the first steps occur in the nucleus. Thereafter the RNA molecule is processed through many steps, and eventually exported from the nucleus to the protein-forming machinery. The final molecule is mRNA since it carries the DNA message to the cytoplasm.
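Transcription applies the same pairing logic, with uracil taking the place of thymine in the RNA. The following sketch assumes the template DNA strand is given as a string written 5’ to 3’ and returns the primary RNA, also 5’ to 3’; it models base pairing only, not capping, splicing, or any other processing step.

```python
# DNA template base -> RNA base that pairs with it (A pairs with U in RNA).
DNA_TO_RNA = str.maketrans("ACGT", "UGCA")


def transcribe(template_strand):
    """Return the RNA synthesized on a DNA template strand.

    The RNA is an antiparallel copy, so the paired sequence is
    reversed to give the conventional 5' to 3' orientation.
    """
    return template_strand.translate(DNA_TO_RNA)[::-1]


rna = transcribe("GCAT")
print(rna)  # AUGC
```

The result has the same sequence as the opposite (non-template) DNA strand, here ATGC, with U substituted for T.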
In protein translation, mRNA is read by the protein-forming machinery and the corresponding protein is built by adding one amino acid to the next in succession, from the amino (NH2-) end of the protein to the carboxy terminus (-COOH). Each amino acid is coded for by three RNA bases, that is, a nucleotide triplet or codon. After synthesis, most proteins are further modified through posttranslational modification, then the linear protein folds into its active shape, often with the assistance of proteins known as chaperones.
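Reading an mRNA codon by codon can be sketched in a few lines. The example below builds the standard genetic code from a compact 64-letter string of one-letter amino acid codes (with "*" for stop codons) and translates from the first AUG; starting at the first AUG is a simplifying assumption, and the function name and input sequence are illustrative only.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate an mRNA from its first AUG to the first in-frame stop."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends the protein
            break
        protein.append(amino_acid)  # grows from the amino to the carboxy end
    return "".join(protein)


print(translate("GGAUGGCCUAAGG"))  # MA
```

Here the codons AUG and GCC give methionine (M) and alanine (A), and UAA terminates translation.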
Figure 31.3 shows gene structure based on the relationship between the protein sequence, mRNA intermediate, and original DNA gene sequence.27
Fig. 31.3 (A, B) Gene structure based on the relationship between the protein sequence, mRNA intermediate, and original DNA gene sequence.
(B, reproduced from Nussbaum RL, McInnes RR, Willard HF. Thompson and Thompson’s genetics in medicine, 7th ed. Saunders Elsevier; 2007, p. 29, with permission from Elsevier.)
The modern concept of a gene is clouded by arguments as to where a gene starts and stops, and whether segments of DNA that do not code for proteins but still influence traits are “genes.” This discussion is limited to defining a gene in terms of proteins, while acknowledging the broader complexities.
Gene expression is principally the steps from DNA transcription to protein translation. Gene expression starts with separation of double-stranded DNA, exposing a single-stranded sequence on which DNA-to-RNA transcription can occur. This is accompanied by binding of a complex set of proteins, “expression factors,” which facilitate binding and activity of RNA polymerase.
The primary RNA strand begins at the start of transcription and ends far beyond the length sufficient to code for a protein. The first RNA-processing steps add a methyl cap to the first RNA nucleotide, trim the 3’ end, and add a polyadenosine tail (poly-A tail). Next the RNA moves to a complex assembly of proteins and small, functional RNAs, known as a spliceosome. The spliceosome then removes anywhere from one to many internal segments of the RNA transcript and reassembles the remainder. This is, largely, the finished mRNA, which is then exported from the nucleus to the protein synthesis machinery in the cytoplasm.
RNA splicing has profound consequences for gene structure and protein variation. Splicing occurs in nearly all eukaryotes and almost all human genes are spliced. The spliced-out segments are called introns and the remaining, reassembled segments are exons. The splice sites are defined by short, canonical sequences, highly conserved across species, known respectively as splice donor and splice acceptor sites.
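The relationship between a primary transcript, its introns, and its exons can be sketched as simple string slicing. The example assumes intron positions are already known and supplied as sorted, non-overlapping (start, end) intervals; the toy transcript is invented for illustration, with an intron that begins with GU and ends with AG to echo the canonical donor and acceptor dinucleotides.

```python
def splice(pre_mrna, introns):
    """Remove introns from a primary transcript and return the exons.

    introns -- sorted, non-overlapping (start, end) positions, 0-based,
               with end exclusive (an assumed coordinate convention)
    """
    exons = []
    position = 0
    for start, end in introns:
        exons.append(pre_mrna[position:start])  # exon before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # final exon
    return exons


# exon 1 + intron (GU...AG) + exon 2
pre_mrna = "AUGGCC" + "GUAAGUCCAG" + "UUUUAA"
exons = splice(pre_mrna, [(6, 16)])
print(exons)           # ['AUGGCC', 'UUUUAA']
print("".join(exons))  # AUGGCCUUUUAA
```

In the cell the spliceosome recognizes the splice sites from sequence; the explicit coordinates here stand in for that recognition step.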
The evolutionary significance of splicing is still disputed, but its functional consequence is clear: it vastly increases the number of distinct proteins. This is because when splicing occurs, alternate combinations of introns may be removed. Alternate splicing is the norm in human genes, not the exception, and usually results in alternate mRNAs and alternate protein isoforms – all from a “single” gene. There are many examples of alternately spliced retinal genes producing multiple protein isoforms.30,31
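The multiplication of products from a single gene can be made concrete by enumerating exon combinations. This sketch assumes one simple model of alternate splicing, in which certain exons are optional and may be skipped independently; real genes use additional mechanisms, and the exon labels are placeholders.

```python
from itertools import product


def isoforms(exons, optional):
    """List every mRNA obtainable by independently skipping optional exons.

    exons    -- exon sequences (or labels) in transcript order
    optional -- set of indices into `exons` that may be skipped
    """
    optional_indices = sorted(optional)
    results = []
    for keep_flags in product([True, False], repeat=len(optional_indices)):
        skipped = {
            index
            for index, keep in zip(optional_indices, keep_flags)
            if not keep
        }
        results.append(
            "".join(e for i, e in enumerate(exons) if i not in skipped)
        )
    return results


# Four exons, of which the second and third are optional.
variants = isoforms(["E1", "E2", "E3", "E4"], {1, 2})
print(variants)  # ['E1E2E3E4', 'E1E2E4', 'E1E3E4', 'E1E4']
```

Under this model a gene with k optional exons can yield up to 2^k distinct mRNAs, which gives a sense of how quickly the number of isoforms grows.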
Following splicing and export from the nucleus, mRNA is translated into protein by ribosomes in the cytoplasm, many of which are bound to the endoplasmic reticulum. The start of translation is usually not at the beginning of the mRNA, and the end is not at the end. The segment upstream of the start of translation is the 5’ untranslated region (5’-UTR). Similarly, the segment downstream of the end of translation is the 3’ untranslated region (3’-UTR). The 5’ and 3’-UTRs may sit within the first and last exons, or may stretch across exons.
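The three regions of an mRNA can be located with a short scan. The sketch below assumes that translation starts at the first AUG and ends at the first in-frame stop codon (UAA, UAG, or UGA); this is a simplification, and the function name and example sequence are illustrative only.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def split_mrna(mrna):
    """Split an mRNA into (5' UTR, coding sequence, 3' UTR).

    The coding sequence runs from the first AUG through the first
    in-frame stop codon, inclusive.
    """
    start = mrna.find("AUG")
    if start == -1:
        raise ValueError("no start codon found")
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            end = i + 3
            return mrna[:start], mrna[start:end], mrna[end:]
    raise ValueError("no in-frame stop codon found")


five_utr, coding, three_utr = split_mrna("GGAUGGCCUAAGG")
print(five_utr, coding, three_utr)  # GG AUGGCCUAA GG
```

Because the split is made on the finished mRNA, it says nothing about exon boundaries; either UTR may lie within a single exon or span several.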
In addition to alternate splicing and alternate protein isoforms, there are alternate starts of transcription, alternate starts of translation, alternate ends of translation, and alternate poly-A sites.