Public consortium launches final phase of genome sequencing



The Human Genome Project international consortium has announced the official launch of the second and final phase of the human genome sequencing project -- the effort to decipher the 3 billion DNA letters that make up the human genetic blueprint. The milestone marks the transition from the initial phase, generating a "working draft" of the human DNA, to the final phase, producing the complete finished sequence. Sixteen genome centers around the world, including the School of Medicine, officially began phase two on May 9.

Phase one, launched in March 1999, has produced coverage of the vast majority of the human chromosomes in 14 months, at a total cost of about $300 million. The last remaining DNA from this first phase already is in the centers' sequencing pipelines and will flow into public databases over the next six weeks.

The goal of the first phase was to create the working draft, covering 90 percent of the euchromatic portion of the human DNA, by sequencing large clones representing segments from the genome. Draft sequence allows scientists to identify directly the vast majority of the human genes, although the sequence itself still contains gaps and uncertainties.

The centers have so far produced and released sequences from overlapping clones containing a total of 3.2 billion DNA letters. Allowing for the overlaps, these segments cover approximately 85 percent of the human genome.

The remaining clones that will complete the working draft were selected in late April and now are in process at the 16 centers. The final data are flowing into public databases at a rate of 10,000 DNA letters per minute and will all be deposited by mid-June.

The working draft is assembled in a two-step fashion. Each clone is first assembled from its sequence information. The various clones can then be assembled together into a layout on the human genome, based on their chromosomal location.
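The second of those two steps, placing already-assembled clones into a layout by chromosomal location, can be illustrated with a small sketch. This is not the consortium's software; the `Clone` record, its field names and the coordinates are hypothetical, and the first step (assembling each clone from its own sequence information) is not modeled at all. The sketch orders clones along a chromosome, merges overlapping ones into contiguous covered stretches and reports the gaps that remain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clone:
    """Hypothetical record for one large clone already assembled in step one."""
    name: str
    chromosome: str
    start: int  # first covered position (inclusive)
    end: int    # position just past the last covered base (exclusive)


def layout(clones):
    """Group clones by chromosome and order them by start position."""
    by_chromosome = {}
    for clone in sorted(clones, key=lambda c: (c.chromosome, c.start, c.end)):
        by_chromosome.setdefault(clone.chromosome, []).append(clone)
    return by_chromosome


def covered_intervals(ordered_clones):
    """Merge overlapping or abutting clones into contiguous covered intervals."""
    merged = []
    for clone in ordered_clones:
        if merged and clone.start <= merged[-1][1]:
            # Clone overlaps the current stretch: extend it if needed.
            merged[-1][1] = max(merged[-1][1], clone.end)
        else:
            merged.append([clone.start, clone.end])
    return [tuple(interval) for interval in merged]


def gaps(intervals):
    """Return the uncovered stretches between consecutive covered intervals."""
    return [(left_end, right_start)
            for (_, left_end), (right_start, _) in zip(intervals, intervals[1:])]


# Made-up coordinates, for illustration only.
clones = [
    Clone("c3", "chr1", 250, 400),
    Clone("c1", "chr1", 0, 150),
    Clone("c2", "chr1", 100, 200),
]
chr1 = layout(clones)["chr1"]
intervals = covered_intervals(chr1)
raw_bases = sum(c.end - c.start for c in chr1)        # counts overlaps twice
unique_bases = sum(end - start for start, end in intervals)
print(intervals, gaps(intervals), raw_bases, unique_bases)
```

In the toy data the clones contain 400 letters in total but cover only 350 distinct positions, the same distinction the article draws between total letters sequenced and the share of the genome covered once overlaps are allowed for.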

The first comprehensive layout of the human genome was constructed in mid-April by scientists in the international consortium. The layout shows the chromosomal positions and the detailed relationships among the more than 20,000 large clones used to sequence the genome; it also spotlights the remaining segments to be covered.

"It's breathtaking to see the DNA sequences arrayed along the human chromosomes, from one end to the other," said Robert H. Waterston, M.D., Ph.D., the James S. McDonnell Professor of Genetics, head of the Department of Genetics and director of the Genome Sequencing Center at the medical school. "The individual contributions have fallen together to yield a global picture. We can now turn to plugging the remaining holes."

The sequence information from the working draft has been immediately and freely released to the world, with no restrictions on its use or redistribution. The information is scanned daily by scientists in academia and industry, as well as by commercial database companies providing information services to biotechnologists. Already, many tens of thousands of genes have been identified from the genome sequence.

For example, the working draft has allowed human geneticists to find genes responsible for dozens of inherited diseases -- including breast cancer, hereditary deafness, stroke, epilepsy, diabetes and various skeletal disorders. It also has propelled many basic biological studies. Researchers recently used it to discover the molecular basis of the sense of taste.

Phase two will involve producing a finished sequence of the human genome by filling the gaps and by increasing the overall sequence accuracy to 99.99 percent. (The working draft attains this level of accuracy at more than 90 percent of its DNA bases, but has somewhat greater uncertainty at the remainder of its positions.)
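To put the accuracy target in concrete terms, the arithmetic can be sketched as follows. The function name is hypothetical, and the calculation assumes, purely for illustration, that the 99.99 percent figure applies uniformly across all 3 billion letters.

```python
from fractions import Fraction


def expected_errors(num_bases, accuracy_percent):
    """Expected number of wrong letters at a given per-base accuracy.

    accuracy_percent is passed as a string (e.g. "99.99") so that the
    arithmetic is exact rather than subject to floating-point rounding.
    """
    error_rate = 1 - Fraction(accuracy_percent) / 100
    return error_rate * num_bases


# Illustrative assumption: uniform 99.99 percent accuracy over 3 billion letters.
errors_in_genome = expected_errors(3_000_000_000, "99.99")
errors_per_10k = expected_errors(10_000, "99.99")
print(errors_in_genome, errors_per_10k)
```

Under that assumption, 99.99 percent accuracy corresponds to at most about one wrong letter in every 10,000.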

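The process involves two activities: filling the gaps that remain in the working draft and raising the overall accuracy of the sequence.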

Although working draft sequence allows for the recognition of genes themselves, the higher accuracy and completeness of the finished sequence makes it a gold-standard reference that can be readily compared to individual patients' DNA to identify specific single-letter mutations causing hereditary diseases.

In preparation for phase two, the international consortium has developed high-throughput methods for producing high-quality finished genomic sequence. In the process, approximately 20 percent of the human genome (600 million bases) has been finished to the high standard of 99.99 percent accuracy and completeness. The finished sequence of human chromosome 22 was published in December 1999, and the finished sequence of human chromosome 21 was published this month.

The international consortium has reaffirmed its commitment to immediate release of the phase two information into the public domain.
