The health care community and medical researchers would benefit from accurate and economical high-throughput DNA sequencing methods that can explore the complete human genome sequence. To this end, scientists are pursuing the development of DNA sequencing-by-synthesis methods based on polymerase reaction amplification.

Researchers have explored the potential of genome sequencing via pyrosequencing, a sequencing-by-synthesis method based on detection of the pyrophosphate group generated when a nucleotide is incorporated in a DNA polymerase reaction. However, inherent difficulties with this technique have led researchers to seek simpler methods that detect reporters attached directly to the nucleotide itself. A sequencing-by-synthesis scheme that couples fluorescence detection with a chip format could potentially sequence the entire human genome on a single chip, but no one has reported complete success with this approach for unambiguous determination of DNA sequences.

Figure caption: Researchers have described a method that could allow high-throughput DNA sequencing for a variety of clinical and basic science applications. The success of the method results from the design of the nucleotides incorporated into DNA strands during the polymerase reaction (A). In the method, unique fluorophores are attached to each of four nucleotides, essentially allowing DNA sequencing by color (B,C).

In the Dec. 26 issue of PNAS, investigators with the Columbia Genome Center at Columbia University College of Physicians and Surgeons and at Columbia University, both in New York, described a four-color DNA sequencing-by-synthesis method that addresses the typical challenges by using four cleavable fluorescent nucleotide reversible terminators.
The nucleotide analogues have unique emissions, so detecting the fluorescence during a polymerase reaction essentially allows sequencing by color.

“The key is knowing how to rationally design the building blocks for polymerase reaction,” said Jingyue Ju, the first author of the PNAS report and a professor and head of DNA sequencing and chemical biology at Columbia. “The fluorescent dye is as big as the nucleotide itself, so you have to select a location on the nucleotide where the modification by the fluorescent dye will still allow the nucleotide analogues to be recognized as substrates by DNA polymerase.”

In sequencing-by-synthesis methods, an incoming nucleotide’s ability to behave as a reversible terminator for a DNA polymerase reaction is an important requirement for unambiguously identifying the incorporated nucleotide before moving on to the next. A free 3'-OH group on the terminal nucleotide of the primer is necessary for the DNA polymerase to incorporate an incoming nucleotide. Therefore, if the 3'-OH group of an incoming nucleotide is capped by a chemical moiety, the polymerase reaction will terminate after that nucleotide is incorporated into the DNA strand. If the capping group is subsequently removed to generate a free 3'-OH, the polymerase reaction will resume.

Previous studies have focused on attaching fluorophores directly to, and thus capping, the 3'-OH group of the nucleotides. However, the 3' position of the deoxyribose is very close to the amino acid residues in the active site of the DNA polymerase. As a result, the large, bulky fluorophores prevented the DNA polymerase from recognizing such nucleotide analogues as substrates in these studies.

Because of this sensitivity, the Columbia researchers attached unique fluorophores to the 5-position of two of the nucleotides (C and T) and to the 7-position of the remaining two (A and G) using cleavable linkers, and then capped the 3'-OH group using small chemical moieties.
The nucleotide analogues thus designed were shown to be good substrates for DNA polymerase, which incorporates only a single nucleotide analogue complementary to the base on a DNA template.

After incorporation, the unique fluorescence emission is detected to identify the incorporated nucleotide, and the fluorophore and the 3'-OH capping group are subsequently removed so that the next cycle of the polymerase reaction can continue the sequence determination.

This method satisfies two important requirements for sequencing by synthesis. First, capping the 3'-OH group so that the nucleotide terminates the polymerase reaction enables identification of the nucleotide, in this case by detecting the fluorophore attached to it. Using reversible terminators as the building blocks allows all four nucleotides to be used simultaneously, which reduces the number of cycles necessary for sequencing by synthesis and increases the accuracy of sequencing, because the four nucleotides compete in the polymerase reaction. In addition, removing the linker and the fluorophore after fluorescence detection increases the overall efficiency of the method.

To test the method, the researchers first immobilized the DNA template on the surface of a chip and then allowed the nucleotide analogue complementary to the DNA template to incorporate into the primer. A PEG linker between the DNA template and the surface helped to avoid nonspecific adsorption of the unincorporated fluorescent nucleotides, resulting in very low background fluorescence. Then, they added the DNA polymerase and the four fluorescent nucleotide reversible terminators to the surface-immobilized DNA. After washing the surface, they detected the fluorescence with a four-color laser scanner made by PerkinElmer Life Sciences of Boston.
The scanner has four excitation lasers with wavelengths of 488, 543, 594 and 633 nm and emission filters centered at 522, 570, 614 and 670 nm.

They repeated the procedure several times to acquire DNA sequencing data for a variety of DNA templates, and found that they could identify the sequences unambiguously from the raw fluorescence data, even without processing.

The researchers hope that, by reporting the basic principle and strategy of the sequencing-by-synthesis method, they will stimulate others to refine and improve it; for example, by developing high-performance polymerases specifically for the cleavable fluorescent nucleotide terminators, and by testing alternatives to the linkers and 3'-OH capping moiety used in the PNAS study. They also noted the possibility of combining the method with high-density DNA templates produced by immobilizing millions of different DNA templates on the chip surface using emulsion polymerase chain reaction. With that combination, they could expect to achieve throughput of more than 20 million bases per chip, with high accuracy.

Ultimately, the technique could contribute to development of the “thousand-dollar genome” platform. Sequencing an entire human genome still costs tens of millions of dollars, Ju explained. For this reason, both the biotechnology industry and the scientific community are working vigorously to develop approaches and technologies that dramatically reduce the cost of DNA sequencing, so that appropriate medicine as well as disease prevention and treatment based on an individual’s entire genome can be realized.

The potentially very high throughput and accuracy of the method described in the PNAS study could help to reduce these costs significantly, though still not nearly enough. The Columbia group continues to work toward this goal. “You can imagine the impact the thousand-dollar genome could have in guiding health care and personalized medicine,” Ju said.
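
For readers who want to see the cycle logic spelled out, the following is a minimal, idealized sketch of sequencing by color as described in this article: in each cycle one complementary terminator is incorporated, the brightest emission channel identifies it, and the dye and cap are then removed. It is not the Columbia group's software. The assignment of bases to the four emission filters is a hypothetical choice made for illustration (the article lists the filter wavelengths but not which base uses which), the function names are invented here, and the signal is noise-free.

```python
# Hypothetical mapping of each base to an emission filter center (nm).
# The four wavelengths come from the article; the base assignment does not.
CHANNEL_NM = {"A": 522, "C": 570, "G": 614, "T": 670}
NM_TO_BASE = {nm: base for base, nm in CHANNEL_NM.items()}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def read_cycle(template_base):
    """Model one cycle: the polymerase adds the single complementary
    reversible terminator, and the scanner reports one intensity per
    emission channel (idealized as 1.0 for the dye present, 0.0 otherwise)."""
    incorporated = COMPLEMENT[template_base]
    return {nm: (1.0 if nm == CHANNEL_NM[incorporated] else 0.0)
            for nm in NM_TO_BASE}


def call_base(intensities):
    """Identify the incorporated base as the one whose channel is brightest."""
    brightest_nm = max(intensities, key=intensities.get)
    return NM_TO_BASE[brightest_nm]


def sequence_by_synthesis(template):
    """Return the bases called, one per cycle. `template` lists the template
    bases in the order the polymerase encounters them."""
    calls = []
    for template_base in template:
        signal = read_cycle(template_base)
        calls.append(call_base(signal))
        # Cleavage step: the dye and the 3'-OH cap would be removed here so
        # the next cycle can proceed; this toy model keeps no state for it.
    return "".join(calls)


if __name__ == "__main__":
    print(sequence_by_synthesis("ACGT"))
```

Because every cycle adds exactly one capped nucleotide, each position yields a single dominant color, which is why the article's authors could read sequences directly from raw fluorescence data. A real base caller would also have to handle background, spectral overlap between dyes, and incomplete cleavage, none of which is modeled here.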