Center for Biomolecular Science & Engineering: Promoting discovery and invention in the post-genomic age
Baskin School of Engineering

UCSC makes working draft of human genome sequence available publicly

Thursday, July 6, 2000
Written by Tim Stephens

Researchers at the University of California, Santa Cruz, who performed the computer analysis to assemble a working draft of the human genome sequence have now posted their results on a UCSC web site (http://genome.ucsc.edu). Biomedical researchers throughout the world can now search the working draft for particular genes or DNA sequences of interest to them.


Photo caption: David Haussler led the UCSC effort to assemble the human genome sequence.

A group led by David Haussler, professor of computer science and director of UCSC's Center for Biomolecular Science and Engineering, created a powerful new computer program to assemble the working draft from the sequence data obtained by the International Human Genome Project. The policy of the public consortium of scientists working on the Human Genome Project has been to release sequence information to the world as soon as possible. Until the working draft was assembled, however, the sequence data were only available in many small pieces.

"This is the first publicly available view of the human genome sequence tentatively placed in the order and orientation in which we think it occurs along the human chromosomes," Haussler said.

Haussler noted that after a 10-year effort involving many laboratories and more than 1,000 scientists, the human genome can now be downloaded from the UCSC web site in about an hour and a half by anyone with a DSL-speed Internet connection. The genome sequence is essentially a long string of As, Ts, Cs, and Gs, representing the chemical units of DNA, called bases.

The human genome consists of approximately 3.1 billion bases arrayed along the length of the chromosomes.

Scientists involved in the Human Genome Project have sequenced about 85 percent of the human genome and continue to generate new sequence data at a rapid pace. Haussler's group will rerun their computer analysis every few weeks, incorporating new data so that biomedical researchers will have immediate access to the most up-to-date assembly.
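The figures quoted above (about 3.1 billion bases, about 85 percent of them sequenced, and a download time of roughly an hour and a half) can be checked with some back-of-the-envelope arithmetic. The sketch below is illustrative only: the encoding of 2 bits per base and the line speed of 1 megabit per second are assumptions, not figures from the article, and the function name is hypothetical.

```python
def download_hours(total_bases, fraction_sequenced, bits_per_base, line_bits_per_second):
    """Estimate how long a sequence download takes, in hours."""
    bases = total_bases * fraction_sequenced      # bases present in the draft
    bits = bases * bits_per_base                  # size of the encoded sequence
    return bits / line_bits_per_second / 3600     # seconds converted to hours


# Assumptions (not from the article): 2 bits per base (four letters: A, C, G, T)
# and a 1 megabit-per-second connection.
hours = download_hours(3.1e9, 0.85, 2, 1_000_000)
print(f"{hours:.2f} hours")
```

Under these assumptions the estimate comes out close to an hour and a half. A less compact file format or a slower line would lengthen it accordingly.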

Although the current working draft still has some gaps and uncertainty in it, it is already extremely useful for most biomedical research purposes, said Alan Zahler, an associate professor of biology at UCSC. In many cases, researchers have identified part of a gene or have other clues to the gene's sequence. They can now use that information to search the genome for the rest of the sequence associated with the gene they are interested in.

In addition, researchers who know the sequence of one gene can search the genome for similar sequences to find related genes. Many genes are members of multigene families with similar and sometimes overlapping functions, Zahler said.

"Identification of all of the members of a gene family will give us a sense for how many genes with a certain role are present in the genome," Zahler said. "Before, it took long periods of experimentation to find out whether a gene in humans was a member of a larger family of genes."
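The two kinds of search described above (finding the rest of a gene from a known piece, and finding related genes with similar sequences) can be illustrated with a toy example. This is a minimal sketch, not the search tools on the UCSC site: the function names and the short "genome" string are hypothetical, and a real search over billions of bases would need indexing and a more forgiving similarity measure than counting mismatched letters.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the sequence as read from the opposite DNA strand."""
    return seq.translate(COMPLEMENT)[::-1]


def find_exact(genome, query):
    """Return (position, strand) for every exact occurrence of query on either strand."""
    hits = []
    for strand, pattern in (("+", query), ("-", reverse_complement(query))):
        start = genome.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = genome.find(pattern, start + 1)
    return sorted(hits)


def find_similar(genome, query, max_mismatches):
    """Return (position, mismatches) for forward-strand windows that nearly match query."""
    hits = []
    for i in range(len(genome) - len(query) + 1):
        window = genome[i:i + len(query)]
        mismatches = sum(a != b for a, b in zip(window, query))
        if mismatches <= max_mismatches:
            hits.append((i, mismatches))
    return hits


# Hypothetical toy data: a known gene fragment and a short stretch of "genome".
genome = "TTGACCGTAAGGCTTGACAGTAACC"
print(find_exact(genome, "GACCGTAA"))       # exact copies of the fragment
print(find_similar(genome, "GACCGTAA", 1))  # copies differing by at most one base
```

In this toy data the exact search finds one copy of the fragment, while the similarity search also turns up a second, slightly different copy, the kind of hit that would suggest a related family member.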

As a test of the working draft's completeness, Human Genome Project scientists searched it for known genes associated with human genetic diseases and found that 95 percent of those diseases had identifiable genes in the draft.


Photo caption: Jim Kent designed and wrote the software used to assemble the working draft of the human genome.

"The chances of finding a particular disease gene in the working draft are apparently quite good," Haussler said. "Technically, the working draft only covers about 85 percent of the genome, but in practice it appears to cover 95 percent of the disease genes."

The working draft is also an exciting beginning for scientists interested in understanding gene structure and organization in humans, Zahler noted.

"For the first time, we will be able to look at tens of thousands of genes at once and start to search for common themes in areas such as how classes of genes are turned on and off, how the information in genes is processed into a form that encodes proteins, and how that processing is regulated," he said.

Jim Kent, a graduate student working with Zahler, designed and wrote most of the software used to assemble the working draft, which was compiled from hundreds of thousands of fragments of various sizes.

"Imagine you had five copies of 'War and Peace' and one of 'Crime and Punishment,' you put them through a paper shredder, and then try to paste together a single copy of 'War and Peace' from the shreds," Kent said. "That job would be a lot like assembling the human genome, except that the genome runs to about a million pages."
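The paper-shredder picture can be made concrete with a toy program that pastes overlapping fragments back together. This is only a sketch of the general idea, using a simple greedy rule (always join the two pieces that overlap the most); it is not the software Kent wrote, and the function names, the `min_len` parameter, and the sample fragments are all hypothetical. Real genome assembly also has to cope with repeated sequences, sequencing errors, unknown fragment orientation, and gaps.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that is also a prefix of b (0 if below min_len)."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    pieces = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    # Drop fragments that lie entirely inside another fragment.
    pieces = [p for p in pieces if not any(p != q and p in q for q in pieces)]
    while len(pieces) > 1:
        best = (0, None, None)
        for i, a in enumerate(pieces):
            for j, b in enumerate(pieces):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best[0]:
                        best = (size, i, j)
        size, i, j = best
        if size == 0:
            break  # nothing overlaps any more; leave the remaining pieces separate
        merged = pieces[i] + pieces[j][size:]
        pieces = [p for k, p in enumerate(pieces) if k not in (i, j)] + [merged]
    return pieces


# Hypothetical "shreds" of the sequence ATGGCGTACGTTAGC.
shreds = ["ATGGCGTA", "GCGTACGTT", "ACGTTAGC"]
print(greedy_assemble(shreds))
```

With these three overlapping shreds the program rebuilds the single original sequence. With unrelated or repetitive fragments, a greedy rule like this can leave several separate pieces or join the wrong ones, which is part of what makes the real problem hard.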

Haussler said Kent's accomplishments will have a very real impact on science and medicine. "He has shown enormous talent and creativity in tackling this fundamental problem," Haussler said.

In addition to the UCSC web site, the working draft will be available on sites maintained by the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/) and the European Bioinformatics Institute (http://www.ebi.ac.uk/). Both NCBI and EBI are major contributors to the computational analysis of the human genome data. Haussler has already sent them the current working draft and will continue to send updated versions as new sequence data are incorporated into the analysis.


© January 2005, CBSE

Updated 7/2008