Where Einstein Meets Edison

Genomic Sequencing for Low-Cost Diagnostics

Mar 28, 2010

Just as he rolls out of bed and before brushing his teeth, MIT PhD student Lawrence David uses a cotton swab to save a sample of the bacteria in his mouth.  When he gets to his lab, he stores this sample alongside the other saliva and feces samples he has been collecting for the last eight months.  Later in the day, he will collect stool samples.  David and his advisor, MIT evolutionary microbiologist Eric Alm, are studying the human gut microbiome (the bacteria in one’s digestive tract) and using themselves as the subjects.

While the human body is composed of approximately ten trillion cells, typical humans have one hundred trillion bacteria within their gut [1,2]. David’s lab in MIT’s biological engineering department develops computational and experimental methods for studying microbial evolution, focusing on the human gut microbiome.  Post-doctoral researcher Arne Materna has recently invented a low-cost method for identifying bacteria and sequencing their genomes.  David and Alm intend to apply these new techniques to study the relationship between life events and changes in an individual’s gut microbiome.  The long-term goal is to learn how to manipulate an individual’s gut microbiome to prevent disease and promote wellness.  In the short term, the Alm lab is focused on understanding the temporal dynamics of microbial communities in the gut and mouth and learning how health and behavior affect these microbes.

In medical settings, bacteria are traditionally identified through a culture and sensitivity test.  Usually performed in hospitals, this test involves placing pathogens on cultures for identification and determination of antibiotic resistance.  Technicians follow a decision tree: they place the sample on different culture media and observe which ones support growth.  This method can be as cheap as $10 per test, but can take days, or even weeks, to deliver results.  Dramatically reducing the cost and time to identify bacteria and their antibiotic resistance characteristics will enable more accurate medical diagnoses and cheaper, better-targeted treatment of several common diseases.

This method for high-throughput gene sequencing of microbes developed by Materna and David has two primary applications:

First, a large number of bacteria can be identified through reading only 16S rRNA genes and matching the pattern to known bacterial sequences.  This method uses a sequencing technology known as Solexa sequencing, and provides a high-level picture of the sample.  While the identity of the bacteria in the sample can be determined, this method would not provide information on antibiotic resistance, as this requires detailed analysis of each cell’s genome.  Using this method, approximately 100,000 unique bacteria can be identified in a week’s time, at a cost of about $50 per sample.

Second, high-throughput sequencing can be used to sequence the entire genome of a single bacterial pathogen.  Although this technique is more time-consuming than the Solexa technique, it determines the antibiotic resistance of a single bacterium.  This method uses a sequencing technique known as 454 sequencing, which takes about a day to complete and costs about $150 to evaluate a single bacterial pathogen.
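The first application, matching 16S rRNA reads against known bacterial sequences, can be illustrated with a small sketch. This is not the Alm lab's pipeline: the reference sequences below are invented for illustration, the function names are hypothetical, and the shared k-mer score is a simple stand-in for the alignment against curated reference databases that real analyses rely on.

```python
from collections import Counter

# Toy reference "database": taxon label -> 16S fragment.
# ASSUMPTION: these sequences are made up for illustration, not real 16S genes.
REFERENCE_16S = {
    "taxon_A": "ACGTTGCAAGTCGAGCGGTAGCACAG",
    "taxon_B": "ACGTTGCAAGTCGAACGGTAACAGGA",
    "taxon_C": "TTGACGCTGAGGTGCGAAAGCGTGGG",
}


def kmers(seq, k=8):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def classify_read(read, reference=REFERENCE_16S, k=8, min_shared=1):
    """Assign a read to the reference taxon sharing the most k-mers.

    Returns None when no reference shares at least min_shared k-mers.
    Ties go to the taxon listed first in the reference.
    """
    read_kmers = kmers(read, k)
    best_taxon, best_score = None, 0
    for taxon, ref_seq in reference.items():
        score = len(read_kmers & kmers(ref_seq, k))
        if score > best_score:
            best_taxon, best_score = taxon, score
    return best_taxon if best_score >= min_shared else None


def profile_sample(reads, **kwargs):
    """Count how many reads were assigned to each taxon (None = unmatched)."""
    return Counter(classify_read(read, **kwargs) for read in reads)


if __name__ == "__main__":
    sample_reads = [
        "GTCGAGCGGTAGCACAG",  # matches taxon_A
        "GTCGAGCGGTAGCACAG",  # matches taxon_A
        "GCTGAGGTGCGAAAGC",   # matches taxon_C
        "AAAAAAAAAAAAAAAA",   # matches nothing in the toy reference
    ]
    print(profile_sample(sample_reads))
```

The resulting counts give the "high-level picture" of a sample described above: which known bacteria are present and in what proportion, but nothing about antibiotic resistance, which would require the rest of each genome.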

The cost per test is expected to decrease as computational techniques improve and volumes continue to increase.  The second method will be extremely useful when the cost to sequence an entire human genome dips below $1,000: because a bacterial genome is roughly a thousand times smaller than the human genome, this would imply a cost of about $1 to sequence a microbial genome.  The cost of genome sequencing has decreased dramatically over time. The current cost of sequencing a human genome is about $5,000, compared to $2.3 billion for the first human genome sequencing [3].
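The scaling argument behind the $1 figure can be checked with a quick calculation. The sketch below assumes that sequencing cost scales linearly with genome length at equal coverage and ignores fixed costs such as sample preparation. The genome sizes are approximate values chosen for illustration (about 3.2 billion base pairs for a human, about 4.6 million for E. coli as a representative bacterium) and do not come from the article.

```python
# ASSUMPTION: approximate genome sizes, in base pairs, chosen for illustration.
HUMAN_GENOME_BP = 3.2e9      # haploid human genome, roughly
MICROBIAL_GENOME_BP = 4.6e6  # E. coli, used here as a "typical" bacterium


def scaled_cost(human_genome_cost, target_bp=MICROBIAL_GENOME_BP,
                human_bp=HUMAN_GENOME_BP):
    """Scale a per-human-genome cost down to a smaller genome.

    Assumes cost is proportional to the number of bases sequenced.
    """
    return human_genome_cost * target_bp / human_bp


if __name__ == "__main__":
    for human_cost in (5000, 1000):
        print(f"${human_cost:,} per human genome -> "
              f"${scaled_cost(human_cost):.2f} per microbial genome")
```

Under these assumptions a $1,000 human genome corresponds to roughly $1.44 per bacterial genome, which is consistent with the order-of-magnitude estimate of $1.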

The advent of inexpensive gene sequencing has created several exciting applications, including:

  • Diagnosing tuberculosis infections: The current gold standard test for tuberculosis (TB) takes two weeks to identify the strain(s) of bacteria that have infected a patient, and another four weeks to determine resistance.  Because of this long timeframe, the standard treatment is to administer TB antibiotics for all of the most common strains.  Depending on the patient’s response to this treatment, antibiotics for less common strains may be subsequently administered.  This process is resource-intensive and accelerates the rate at which the bacteria evolve new antibiotic resistance.  Fast genomic sequencing would allow medical personnel to identify the antibiotic resistance of a sample and use only the appropriate antibiotics for treatment.  Additionally, it would reduce the time to quarantine, which would have important public health implications.
  • Identifying and preventing nosocomial infections: In the United States, an estimated ten percent of hospital patients (2 million patients each year) acquire a nosocomial infection [4]. Because the bacteria causing these infections can take weeks to identify, it is usually impossible to trace the source of the infection.  With fast genomic sequencing, immediate feedback would enable the hospital to identify how and where an infection occurred and modify its procedures to prevent future occurrences.  Although hospitals can presently treat these infections with antibiotics, Medicare will only reimburse the costs of these treatments if the infecting bacteria are identified.  Reimbursement losses are roughly $4,500 per unclassified infection.  Fast genomic sequencing will enable hospitals to identify the origins of nosocomial infections and recover the cost of treatment.
  • Optimizing antibiotic feeds for livestock: Livestock are commonly given sub-therapeutic antibiotics to increase their body mass and reduce their risk of infection.  Research has shown that livestock gain more weight when their gut flora are perturbed by antibiotics; the reasons for this are still unclear.  With an accurate method for assessing microbial communities in livestock, farmers could fine-tune the antibiotic regimens administered to livestock, or potentially even determine which bacteria are responsible for desired features in their animals.  The latter insight could eventually lead to probiotic [5] formulations for livestock rearing. Presently, farmers use prebiotic [6] feeds, and they measure the effectiveness of their practices using crude metrics such as animal weight and death rate.  Fast genomic sequencing would provide near-instantaneous information to help farmers refine their practices.
  • Diagnosing diseases of the gut: Inflammatory bowel disease (IBD) and irritable bowel syndrome (IBS) share several common symptoms but are treated completely differently. IBD is diagnosed in part with a colonoscopy, which is expensive, invasive, and time-consuming.  If microbial species consistent with IBD are identified, colonoscopy could be replaced by a sequencing-based analysis of patients’ fecal samples.

The cost of sequencing remains the largest barrier to the widespread commercial use of these techniques.  However, as computing algorithms and error correction methods continue to improve, the capabilities and reach of inexpensive and fast genomic sequencing will dramatically increase.

David and his advisor Eric Alm have been collecting their gut flora samples as raw data for their group’s genomic sequencing technique.  Furthermore, they have documented their daily eating, exercise, and sleep routines, as well as major events like foreign travel.  The eventual goal is to compare the state of their gut flora (based on analysis of the samples) against their daily activities.  Using statistics and computational techniques, they seek to examine the causal relationships between external factors and one’s gut flora.
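A first statistical pass at this kind of comparison could be a lagged correlation between a diary variable and the relative abundance of one bacterial group. The sketch below is only an illustration of that idea, not the group's actual analysis: the function names and the numbers are invented, and a correlation on its own cannot establish the causal relationships the study is after.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_pearson(diary, abundance, lag_days=0):
    """Correlate diary[t] with abundance[t + lag_days].

    A positive lag asks whether today's behavior tracks the microbiome
    measured lag_days later.
    """
    if lag_days < 0:
        raise ValueError("lag_days must be non-negative")
    if lag_days:
        diary, abundance = diary[:-lag_days], abundance[lag_days:]
    return pearson(diary, abundance)


if __name__ == "__main__":
    # ASSUMPTION: invented toy data, one value per day.
    fiber_grams = [10, 25, 15, 30, 20, 35, 12]
    relative_abundance = [0.10, 0.12, 0.22, 0.16, 0.28, 0.20, 0.33]
    for lag in (0, 1, 2):
        r = lagged_pearson(fiber_grams, relative_abundance, lag)
        print(f"lag {lag} day(s): r = {r:.2f}")
```

With months of daily samples and many diary variables, an analysis like this would also need to correct for multiple comparisons and for natural day-to-day variation before any association is taken seriously.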

Presently, 23andMe provides personalized genomics services derived from an individual’s DNA.  The company, based in Mountain View, California, analyzes the DNA in a customer’s saliva and uses the genetic information to help the customer understand their potential reactions to certain medications, their susceptibility to various diseases, and the presence of inheritable disease markers. This information is static for any given patient, as a person’s DNA does not change during their lifetime.

By contrast, a person’s gut microbiome does change over time. The ability to quickly and inexpensively monitor its composition could represent an extension of the services provided by 23andMe.  In order for cheap and fast genomic sequencing to become useful in such an application, more research needs to be performed on the human gut biome.  Scientists must first determine what constitutes a “normal” gut biome, the length of time over which natural variation cycles occur, and the physiological implications of an aberrant gut biome.  To this end, David hopes his samples, which are now just piles of untapped data, will eventually yield a wide range of useful insights.

For more information, see http://www.the-scientist.com/2009/07/1/17/1/


  1. Björkstén B, Sepp E, Julge K, Voor T, Mikelsaar M (October 2001). “Allergy development and the intestinal microflora during the first year of life”. J. Allergy Clin. Immunol. 108 (4): 516–20. doi:10.1067/mai.2001.118130. http://linkinghub.elsevier.com/retrieve/pii/S0091-6749(01)96140-8.
  2. Guarner F, Malagelada JR (February 2003). “Gut flora in health and disease”. Lancet 361 (9356): 512–9. doi:10.1016/S0140-6736(03)12489-0. PMID 12583961. http://linkinghub.elsevier.com/retrieve/pii/S0140-6736(03)12489-0.
  3. “Complete Genomics Drives Down Cost of Genome Sequence to $5,000”, John Lauerman, Bloomberg.com, Feb. 5, 2010. http://www.bloomberg.com/apps/news?sid=aEUlnq6ltPpQ&pid=20601124
  4. “Hospital-Acquired Infections”, Quoc V Nguyen, MD, Assistant Professor, Department of Pediatrics, New York State Health Department, Jan. 14, 2009. http://emedicine.medscape.com/article/967022-overview
  5. Probiotics are live microorganisms thought to be healthy for the host organism.
  6. Prebiotics are non-digestible food ingredients that stimulate the growth and/or activity of bacteria in the digestive system which are beneficial to the health of the host organism.
Mark Chew


Mark Chew presently leads the distributed generation policy and strategy at Pacific Gas and Electric Company in San Francisco. He joined PG&E in 2010 as an internal consultant, and he has also worked on demand-side management programs and forecasting distributed generation penetration. Mark received his MBA and MS in Chemical Engineering from MIT; he also holds MS and BS degrees in Electrical Engineering and Computer Science from UC Berkeley. While at MIT, Mark was a founding editor of the MIT Entrepreneurship Review and was a lead organizer for the MIT Energy Conference. Before MIT, Mark spent 4 years at Qualcomm designing RF chips now used in mobile devices, including the iPad 3 and iPhone 4, 4S, and 5.
