Genetic Epidemiology: Successes and Challenges of Genome-Wide Association Studies Using the Example of Age-Related Macular Degeneration




With increasing evidence of the importance of genetic factors and gene-environment interactions in the etiology of common eye disorders, identifying genetic variation associated with disease risk may provide insight into the mechanisms of disease pathogenesis and reveal novel targets for preventive and therapeutic interventions. Family-based studies have identified genomic regions harboring highly penetrant genes related to rare familial forms of eye diseases. However, late-onset complex phenotypes, such as age-related macular degeneration (AMD), appear to be polygenic, involving multiple genes with varying levels of effect.


It has been estimated that about 90% of sequence variants in humans are differences in single bases of DNA, called single nucleotide polymorphisms (SNPs). According to dbSNP (available at www.ncbi.nlm.nih.gov/projects/SNP), more than 14 million uniquely mapped SNPs have been identified and assembled into a genome-wide database. The HapMap effort (available at hapmap.org) made it possible to reduce the number of SNPs required to examine the entire genome to roughly a million representative SNPs, also called tagging SNPs, making genome-wide approaches to associating genes with disease risk more efficient and less costly.


During the past several years, genome-wide association studies (GWAS) have been designed to assess associations between traits and a large number of DNA sequence variants distributed across the genome, and to detect novel disease-associated pathways using an unbiased, hypothesis-free approach. Based on the common variant-common disease hypothesis of disease pathogenesis, GWAS aim to identify common SNPs with an allele frequency of >5% that may only modestly increase disease risk. Currently, 606 GWAS have been reported and catalogued at www.genome.gov/gwastudies. More than 250 genetic loci in which common variants were reproducibly associated with a number of polygenic traits have been identified in the last 2 years alone. However, despite the significant successes achieved by GWAS, inconsistent results across studies are a generally recognized limitation.


Challenges for Genome-Wide Association Studies


Population Stratification


The most common factor that can bias the results of genetic association studies is population stratification: differences in the frequency of candidate genetic variants within the study cohort that arise from subpopulations of different ancestry. In the presence of population stratification, significant associations could be attributable entirely to the different prevalence of genetic variants in various ethnic and racial subgroups, whereas the real disease-causing locus might be missed. This issue can be addressed by matching cases and controls by ancestry or by including ancestry-informative markers in the analysis. For GWAS this issue has been well addressed by tools such as the EIGENSTRAT software (genepath.med.harvard.edu/~reich/EIGENSTRAT.htm) and multi-dimensional scaling analysis (pngu.mgh.harvard.edu/~purcell/plink).
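The confounding described above can be illustrated with a small numerical sketch. The counts below are hypothetical and chosen only so that the variant has no effect within either subpopulation; they do not come from any study. A Mantel-Haenszel stratified odds ratio is used here as the simplest stand-in for an ancestry adjustment; it is not the method implemented by EIGENSTRAT, which relies on principal components of the genotype data.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a: carrier cases, b: non-carrier cases,
    c: carrier controls, d: non-carrier controls.
    """
    return (a * d) / (b * c)


def mantel_haenszel_or(tables):
    """Mantel-Haenszel odds ratio pooled across strata of (a, b, c, d) tables."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


# Hypothetical counts (illustration only).
# Subpopulation A: variant common, disease common; no association within A.
# Subpopulation B: variant rare, disease rare; no association within B.
strata = [
    (100, 100, 400, 400),  # subpopulation A, within-stratum OR = 1
    (5, 45, 95, 855),      # subpopulation B, within-stratum OR = 1
]

# Ignoring ancestry: add the two tables cell by cell.
pooled = tuple(sum(cells) for cells in zip(*strata))

crude_or = odds_ratio(*pooled)             # inflated by stratification
adjusted_or = mantel_haenszel_or(strata)   # ancestry-adjusted estimate

print(f"crude OR    = {crude_or:.2f}")
print(f"adjusted OR = {adjusted_or:.2f}")
```

With these counts the crude odds ratio is well above 1 even though neither subpopulation shows any association, while the stratified estimate returns to 1.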


Multiple Hypothesis Testing


In traditional statistics, with just 1 test performed at the P-value level of .05, there is a 5% chance of incorrectly rejecting the null hypothesis. However, with 100 independent tests, the expected number of such rejections is 5 (5% of 100). The expected number of false positives rises to 50 000 when a million-SNP chip array is considered. In order to minimize the false-positive error rate associated with carrying out multiple statistical comparisons, a recommended genome-wide statistical significance level is P < 5 × 10⁻⁸. This level of significance can be achieved when the genetic effect is considerable (eg, large odds ratio [OR]) or when the sample size for the study is large.
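The arithmetic in this paragraph can be written out as a short sketch. The function names are illustrative only. The recommended genome-wide level is consistent with a Bonferroni-style correction of .05 over roughly one million independent tests; this sketch assumes that interpretation, which the text above does not state explicitly.

```python
def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of true null hypotheses rejected by chance alone."""
    return n_tests * alpha


def bonferroni_threshold(n_tests, family_alpha=0.05):
    """Per-test significance level that keeps the family-wise error rate
    at or below family_alpha (Bonferroni correction)."""
    return family_alpha / n_tests


for n_tests in (1, 100, 1_000_000):
    print(
        f"{n_tests:>9} tests: "
        f"expected false positives at P < .05 = {expected_false_positives(n_tests):g}, "
        f"Bonferroni per-test level = {bonferroni_threshold(n_tests):.1e}"
    )
```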


Sample Size/Power


It has been reported that most of the common variants found in recent GWAS are associated with ORs less than 1.5, with a mean OR of 1.36. This effect size translates to sample sizes of about 4000 cases and 4000 controls required to detect genetic associations with 80% statistical power if the minor allele frequency (MAF) is 10%, and almost 7400 cases and 7400 controls if the MAF is 5%. Therefore, small studies are often underpowered to detect small to modest effects and may result in misleading conclusions.
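A rough sense of how power depends on sample size, allele frequency, and effect size can be obtained from a normal-approximation sketch. The text does not specify the genetic model or significance level behind the quoted sample sizes, so everything below is an assumption: a simple allelic (two-proportion) test, the MAF taken as the risk-allele frequency in controls, two alleles contributed per person, and a two-sided level of 5 × 10⁻⁸. The function names are hypothetical.

```python
from statistics import NormalDist


def case_allele_freq(control_freq, odds_ratio):
    """Risk-allele frequency in cases implied by the control frequency
    and a per-allele odds ratio."""
    odds = odds_ratio * control_freq / (1.0 - control_freq)
    return odds / (1.0 + odds)


def allelic_test_power(n_cases, n_controls, control_freq, odds_ratio, alpha=5e-8):
    """Approximate power of a two-sided two-proportion z-test on allele counts.

    Assumptions (not taken from the source): allelic model, each individual
    contributes two independent alleles, unpooled standard error.
    """
    std_normal = NormalDist()
    p0 = control_freq
    p1 = case_allele_freq(p0, odds_ratio)
    # Standard error of the difference in allele frequencies.
    se = (p1 * (1 - p1) / (2 * n_cases) + p0 * (1 - p0) / (2 * n_controls)) ** 0.5
    z_effect = abs(p1 - p0) / se
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Probability that the test statistic falls in either rejection region.
    return std_normal.cdf(z_effect - z_crit) + std_normal.cdf(-z_effect - z_crit)


for n, maf in ((1000, 0.10), (4000, 0.10), (7400, 0.05)):
    power = allelic_test_power(n, n, control_freq=maf, odds_ratio=1.36)
    print(f"{n} cases / {n} controls, MAF {maf:.0%}: power = {power:.2f}")
```

Because the model here is a simplification, the output should be read as an order-of-magnitude check against the figures quoted above rather than a reproduction of them.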


Functional Relevance


Significant associations are often detected with genetic variants that are located in noncoding genetic regions as well as outside of known genes. Such results create a challenge in identifying a causal pathway. While some of the detected associations can be false positives, SNPs not resulting in amino acid change can still affect alternative splicing, gene expression, or protein folding. Also, SNPs found to be associated with the phenotype of interest may not be causal but may be surrogates for causal SNPs, as the result of correlation attributable to linkage disequilibrium.
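The correlation between a genotyped SNP and an unobserved causal SNP is commonly summarized by the linkage disequilibrium statistic r². A minimal sketch of that calculation from haplotype and allele frequencies follows; the frequencies in the usage lines are hypothetical values chosen for illustration.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """Linkage disequilibrium r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Hypothetical frequencies (illustration only).
print(ld_r_squared(0.30, 0.30, 0.30))  # alleles always co-occur: perfect surrogate
print(ld_r_squared(0.09, 0.30, 0.30))  # haplotype frequency equals p_a * p_b: no LD
print(ld_r_squared(0.20, 0.30, 0.40))  # partial correlation
```

An r² near 1 means the genotyped SNP is an almost perfect proxy for the other variant, which is why an associated SNP need not itself be causal.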


Rare Variants


SNPs in the coding or regulatory regions of genes are likely to cause functional differences. Nevertheless, many functional variants may not be found by GWAS because they are rare and therefore not well tagged by commercial genotyping panels. Rare variants that affect disease susceptibility can be identified by follow-up resequencing of the refined genetic regions.


Replication


Given the high rate of false-positive findings attributable to phenotypic heterogeneity, population stratification, multiple hypothesis testing, and other factors, replication of association findings in independent cohorts followed by functional studies is a gold standard in the GWAS field.




Genome-Wide Association Studies of Age-Related Macular Degeneration


One of the earliest GWAS examined AMD in a screen of 96 cases and 50 controls genotyped for 116 204 SNPs in the Age-Related Eye Disease Study. Despite the small cohort size, a common variant in the complement factor H gene (CFH) was found to be strongly associated with AMD. In individuals carrying 2 copies of the risk allele, the likelihood of AMD was increased 7.4-fold. Resequencing revealed a polymorphism in linkage disequilibrium with the risk allele representing an amino acid change in a region of CFH that binds heparin and C-reactive protein. Other groups reported similar findings, and later studies replicated them and reported new CFH variants. Additional AMD-susceptibility loci were found in the ARMS2/HTRA1 region as well as in other complement pathway genes.


Recent GWAS with more statistical power, examining much larger numbers of individuals as well as replication cohorts, have identified additional pathways associated with the disease. The hepatic lipase gene, LIPC, was discovered to affect AMD susceptibility. The allele that raises high-density lipoprotein (HDL) cholesterol reduced the risk of AMD. This association was corroborated by another GWAS. Furthermore, both GWAS implicated a number of other genes in the HDL pathway. Those genes did not all have the same direction of effect on HDL level and risk of AMD. Therefore, mechanisms other than a direct effect of serum HDL metabolism could be involved in the pathogenesis of AMD. Also, a susceptibility locus near the TIMP3 gene was implicated and corroborated in these recent GWAS. TIMP3 encodes an inhibitor of metalloproteinases involved in degradation of the extracellular matrix and was previously found to be associated with Sorsby’s fundus dystrophy, an early-onset maculopathy.


While the GWAS design has borne much fruit for AMD research, it has so far been less successful for many other diseases and traits, including other eye diseases such as glaucoma and diabetic retinopathy. It has been suggested that rare variants and structural variation may help find the “missing heritability,” the unexplained portion of phenotypic variance attributable to genetic factors. Targeted resequencing of pathways detected by GWAS signals, as well as whole-genome sequencing in people with extreme phenotypes from diverse ethnic groups, will help detect new functional variants in novel and previously identified loci and define their associations with disease. Large studies will help evaluate the role of gene-gene, gene-environment, and gene-treatment interactions and examine their contributions to disease risk and progression, as well as prevention and treatment.


Publication of this article was supported by the National Institutes of Health, Bethesda, Maryland (Grant RO1-EY11309); Massachusetts Lions Eye Research Fund, Inc, New Bedford, Massachusetts; and the Macular Degeneration Research Fund of the Ophthalmic Epidemiology and Genetics Service, New England Eye Center, Tufts Medical Center, Tufts University School of Medicine, Boston, Massachusetts. Johanna M. Seddon has received funding support from Pfizer and Genentech. Tufts Medical Center (J.M.S.) has filed a patent application for some materials related to this work. Both authors were involved in design and conduct of the study; collection and interpretation of the data; and preparation, review, or approval of the manuscript.

Jan 17, 2017 | Posted in OPHTHALMOLOGY