Dec 8, 2010
One of the promises of the Human Genome Project, a 13-year effort completed in 2003, was that we would be able to identify genes and mutations in the genome that are related to clinical conditions [1,2]. In doing so we hoped to accomplish several tasks: 1) unravel the etiology of particular diseases and traits, 2) understand the functional role of the human genome, and 3) develop therapeutics and cures for diseases. A major challenge in tackling all three tasks has been the high price of DNA sequencing technology. However, the rapidly dropping cost of DNA sequencing will allow us to sequence and collect data on a far greater number of genomes. These data, integrated with our understanding of the genetics of complex diseases, promise to help us accomplish the above tasks more quickly than we imagined.
Genome-Wide Association Studies (GWAS), an approach that tests for associations between disease and mutations common in the general population, have paved the way for the identification of disease-causing genes. Crohn’s disease is a model condition for demonstrating GWAS success. The discovery of disease-associated mutations in the gene encoding the interleukin 23 receptor (IL23R) represents one of the first successful attempts to discover genes involved in disease [4,5,6]. GWAS findings related to IL23R have led directly to novel therapeutic drug candidates developed by Allostera Pharma [7] and Merck. Both companies are evaluating small molecule compounds that target IL23R for the treatment of Crohn’s disease and psoriasis [8,9]. These drugs may well not be a complete cure for these diseases; however, this example highlights the potential of GWAS to identify new therapeutic targets.
Until now geneticists have only been able to compare mutations that are common in the general population. It would be preferable to sequence the whole genome of each individual in a diverse population, but as recently as three years ago the $100,000 price tag per genome made this information largely unaffordable. Today, sequencing costs have been reduced to around $5,000 per genome. Current and future advances in DNA sequencing technologies, along with DNA sequencing of large samples across many diverse populations, will give us the full picture of rare and common mutations in the human genome. The 1000 Genomes Pilot Project [3] is the effort of an international public-private consortium led by scientists from the Sanger Institute (Cambridge, UK) and the Broad Institute of MIT and Harvard (Cambridge, MA). Although it has sequenced only 100 samples, or 10% of the total goal, it has added 15 million mutations to the public database, approximately 300% more than the previously existing collection of mutations known worldwide. Efforts like this will bring us progressively closer to identifying genetic markers and, possibly, therapeutic targets for almost any disease imaginable.
In 2003, researchers from the Whitehead Institute Center for Genomics Research, now the Broad Institute of MIT and Harvard, announced the completion of the Human Genome Project. The total cost of the project was estimated to be about three billion dollars for 4.5X coverage, at approximately $30,000 per one million high-quality bases. (Coverage, or depth, is the average number of times a nucleotide is represented by a high-quality base in a collection of random raw sequence.) Today, the cost of 30X coverage of high-quality sequence of an individual human genome is quickly dropping into the range of thousands of dollars. This represents a 30,000-fold decrease in cost over the past decade for high-quality sequencing of the human genome. It is estimated that the current cost of sequencing one million bases is approximately one dollar. Next-generation sequencing technologies currently available on the market include the HiSeq sequencing platform from Illumina and the SOLiD 4hq from ABI, which are estimated to generate 200 to 300 gigabases per run, equivalent to approximately 50 to 100X coverage of the human genome in a single run of less than three days. At that output it would take less than half a year to completely sequence 1,000 genomes. In the 1960s, Gordon Moore, co-founder of Intel, formulated what is today known as Moore’s Law, stating that the number of transistors on an integrated circuit will double every two years, a rate that has had profound implications for the size, speed, and price of processors. With DNA sequencing the cost has decreased almost 100,000-fold. Nothing in the history of the world has decreased in cost so rapidly over such a short amount of time.
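The coverage and cost figures above can be checked with some simple arithmetic. The following is a minimal sketch, not part of any sequencing pipeline: the function names are hypothetical, and the genome size of roughly 3.1 billion bases is an assumption, since the text does not state one.

```python
# Back-of-the-envelope checks on the coverage and cost figures quoted above.
# ASSUMPTION: a haploid human genome of about 3.1 billion bases; the article
# does not state a genome size, so this value is illustrative.
GENOME_SIZE_BASES = 3.1e9


def coverage(total_bases, genome_size=GENOME_SIZE_BASES):
    """Average depth: total high-quality bases divided by genome length."""
    return total_bases / genome_size


def fold_decrease(old_cost, new_cost):
    """How many times cheaper the new cost is than the old one."""
    return old_cost / new_cost


def moores_law_fold(years, doubling_period_years=2.0):
    """Improvement factor if a quantity doubles every doubling_period_years."""
    return 2 ** (years / doubling_period_years)


# One run producing 200 to 300 gigabases, expressed as genome coverage.
low_cov = coverage(200e9)
high_cov = coverage(300e9)

# Cost per million bases: about $30,000 then versus about $1 now.
per_megabase_fold = fold_decrease(30_000, 1)

# For comparison, a doubling every two years sustained for a decade.
moore_fold = moores_law_fold(10)

print(f"Coverage per run: {low_cov:.0f}X to {high_cov:.0f}X")
print(f"Cost per million bases fell {per_megabase_fold:,.0f}-fold")
print(f"Moore's Law over 10 years: {moore_fold:.0f}-fold")
```

Under these assumptions a single run works out to roughly 65X to 97X coverage, within the quoted range of 50 to 100X. A doubling every two years sustained for ten years gives only a 32-fold improvement, against a 30,000-fold drop in the cost per million bases.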
By 2020 the complete sequencing cost for a human genome could be in the hundreds of dollars, if not lower. The commoditization of DNA sequencing technology will make it affordable for the health care sector to adopt sequencing and analysis as a standard-of-care screening practice, for the forensics sector to adopt it as a commonplace crime evidence technology, and for countless other everyday uses. In addition, low-cost DNA sequencing will allow emerging and developing economies like China to play a major role in medical genetics. To date, China has invested approximately two billion dollars in genomics and sequencing research. This past year the Beijing Genomics Institute used a one and a half billion dollar loan from the Chinese government to purchase approximately 128 HiSeq machines, moving the country to the forefront of DNA data generation in the world. Affordable DNA sequencing will give almost any population across the world the ability to map physical traits, clinical traits and historical origins of DNA. Complete genomic DNA information will be adopted in ways that are as yet unforeseen.
In tandem with the dropping price of next-generation sequencing, the research community expects to see a new paradigm emerge in medical genetics over the next several years. As researchers use affordable sequencing to probe regions previously identified via GWAS, they should be able to discover independent risk- or protection-conferring mutations for a broad range of human diseases. Additionally, we will be able to directly assess the functional relevance of those mutations and genes to disease. Finally, we will be able to integrate this information to discover therapeutic targets and, hopefully, cures for disease. More stories like that of IL23R will emerge not only for Crohn’s disease but also for devastating and complex diseases such as autism, schizophrenia and type II diabetes.
1. Lander, E.S., et al. Initial sequencing and analysis of the human genome. Nature (2001).
2. Venter, J.C., et al. The sequence of the human genome. Science (2001).
3. The 1000 Genomes Project Consortium. A map of human genome variation from population scale sequencing. Nature 467(7319):1061-73 (2010).
4. Altshuler, D., Daly, M.J., and Lander, E.S. Genetic mapping in human disease. Science 322(5903):881-888 (2008).
5. Duerr, R.H., Taylor, K.D., et al. A genome-wide association study identifies IL23R as an inflammatory bowel disease gene. Science 314(5804):1461-1463 (2006).
6. Hugot, J.P., Chamaillard, M., et al. Association of NOD2 leucine-rich repeat variants with susceptibility to Crohn’s disease. Nature 411(6837):599-603 (2001).
7. Allostera Pharma. http://inoviacapital.com/2010/08/allostera-pharma-inc/
8. Ghoreschi, K., et al. Generation of pathogenic T(H)17 cells in the absence of TGF-β signalling. Nature 467(7318):967-71 (2010).
9. Tonel, G., et al. Cutting edge: A critical functional role for IL-23 in psoriasis. J Immunol. 185(10):5688-91 (2010).