Science / Medicine: Mapping the Genome: Researchers prepare to take their best shot at translating the complicated chemical secrets of human inheritance.

Times Science Writer

When geneticist Raymond White of the University of Utah announced in November, 1985, that he had narrowed the search for the defective gene that causes cystic fibrosis to a small area on one chromosome, he predicted that the gene would be found within two years. Identification of the gene, geneticists believe, will lead to new therapies for the hereditary lung disease.

But the task proved far more difficult than White had imagined. He and others are still looking for it.

If White had possessed the complete genetic blueprint of humans, said Robert K. Dressing, president of the Cystic Fibrosis Foundation, “there’s no question we would have found the gene by now.”

Dressing and many others would like to have that blueprint: the identification of the 3 billion chemicals that make up the complete genetic complement of humans--known as the human genome.

To obtain the blueprint, many of the country’s leading geneticists are proposing the largest single project in the history of biology: the Human Genome Initiative.

That project could involve hundreds of scientists and technicians working for 10 to 20 years and, by some estimates, could cost as much as $3 billion.

Scientists argue that the knowledge gained by having that blueprint would lead to new insight into the causes of the estimated 3,500 inherited diseases, as well as cancer, diabetes and heart disease, all of which are thought to involve a genetic predisposition. And with that insight, they say, may come new therapies, perhaps even cures.

Obtaining the blueprint of the genome is the Holy Grail of biology, according to Nobel laureate Walter Gilbert of Harvard University, the “ultimate answer to the commandment ‘Know thyself.’ ”

But research in biology has traditionally been conducted in groups of one to 20 people, often with minimal oversight from funding agencies. Hence the prospect of such a large project--which would require unusual coordination among the participants in order to ensure efficiency and minimize duplication--has generated widespread controversy.

Who will run the project? Where will the money come from? Might the resulting information be misused? For example, would employers or insurance companies discriminate against a person who is identified as a result of the genome project as having a genetic predisposition for cancer? Is the project even necessary?

Ten years ago, no rational scientist could have spoken convincingly about obtaining the precise identities of the 3 billion chemicals in the genome because the process was too tedious and technically demanding.

But the quest for a blueprint has been revolutionized by the advent of genetic engineering, the development of new techniques for separating large DNA fragments and great increases in the speed and power of computers.

The computers are especially important for identifying the order of fragments and for handling the massive amounts of information that are generated. “Sequencing the genome is a supercomputer project as much as a biology project,” said biophysicist Charles P. DeLisi of the Mt. Sinai School of Medicine in New York.

Genetic information in all life is contained in coiled strands of deoxyribonucleic acid (DNA).

DNA is organized into small units called genes, which are bundled into much larger units called chromosomes. Humans have about 100,000 genes in 23 pairs of chromosomes.

DNA is composed of four distinct chemicals called bases. These bases are strung together sequentially to form genes and chromosomes in the same fashion that letters of the alphabet combine to form words, sentences and books.

If the DNA in one human cell could be uncoiled and stretched out in a straight line, it would be about eight feet long. But if it were enlarged so that each base was the size of a letter in this article, the DNA would form a huge sentence, 4,700 miles long, stretching from Los Angeles to Jacksonville, Fla., and back again. So far, researchers have sequenced about 12 million bases, about 19 miles’ worth.
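The scaling in that comparison can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted above (3 billion bases, a 4,700-mile sentence, 12 million bases sequenced); the variable names are illustrative, and the letter width it prints is simply what those figures imply, not a measured type size.

```python
# Figures quoted in the article.
GENOME_BASES = 3_000_000_000   # bases in the human genome
SENTENCE_MILES = 4_700         # length of the enlarged DNA "sentence"
SEQUENCED_BASES = 12_000_000   # bases sequenced so far

INCHES_PER_MILE = 63_360

# Distance covered by one base when each base is the size of a printed letter.
miles_per_base = SENTENCE_MILES / GENOME_BASES
letter_width_inches = miles_per_base * INCHES_PER_MILE

# How far along the sentence the sequenced portion reaches.
sequenced_miles = SEQUENCED_BASES * miles_per_base
fraction_done = SEQUENCED_BASES / GENOME_BASES

print(f"Implied letter width: {letter_width_inches:.3f} inches")
print(f"Sequenced so far:     {sequenced_miles:.1f} miles")
print(f"Fraction of genome:   {fraction_done:.1%}")
```

The sequenced portion works out to about 18.8 miles, which rounds to the 19 miles cited, and to roughly 0.4% of the genome.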

Researchers have two goals in the genome initiative--to create a “physical map” and a sequence.

A physical map would be a collection of perhaps 100,000 DNA fragments that span the entire genome and whose positions on the chromosomes are known.

A sequence would give the order of each of the 3 billion bases in the genome.

Mapping might be compared to the process of locating every city and town in the United States, while sequencing would be the equivalent of preparing a detailed street map of the entire country. Some critics of sequencing, such as geneticist Francisco J. Ayala of UC Irvine, argue that mapping will provide so much information that sequencing would not be necessary.

But while mapping may lead to identification of most genes associated with disease, others say, sequencing would provide much more information about how genetic information is expressed and about how cells work.

In essence, sequencing involves chemically removing one base at a time from the end of a DNA fragment and identifying each base as it is removed. An accomplished technician can sequence perhaps 1,000 bases per day.

Last year, Caltech biologist LeRoy E. Hood and his colleagues announced the development of an automated DNA analyzer that can sequence 10,000 bases per day. Within two years, he said in a recent telephone interview, the instrument may be able to handle 200,000 per day.

Researchers at the Institute of Physical and Chemical Research at Wako, Japan, have built a prototype of an analyzer they say will eventually be able to sequence 1 million bases per day, making it possible to sequence the whole genome in 10 years--if the financial support were there.

The cost of sequencing, as much as $1 per base only two years ago, is now 6 to 8 cents per base, Hood said, and could drop to as little as one cent in six months, as improvements in automation continue.
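The throughput and cost figures above translate into project totals with simple division and multiplication. The sketch below is a rough estimate only: it assumes a single worker or instrument running every day of the year, that each base is read exactly once, and that mapping and data handling cost nothing, none of which the researchers claim. The function names are illustrative.

```python
GENOME_BASES = 3_000_000_000

# Bases sequenced per day, as quoted in the article.
RATES_PER_DAY = {
    "skilled technician": 1_000,
    "Caltech analyzer (today)": 10_000,
    "Caltech analyzer (projected)": 200_000,
    "Japanese prototype (projected)": 1_000_000,
}

# Cost per base in cents, as quoted in the article.
COSTS_CENTS = {
    "two years ago": 100,
    "today (low)": 6,
    "today (high)": 8,
    "projected": 1,
}

def years_to_sequence(bases_per_day, instruments=1):
    """Years to read every base once (assumes 365 working days a year)."""
    return GENOME_BASES / (bases_per_day * instruments) / 365

def total_cost_dollars(cents_per_base):
    """Cost in dollars of reading every base once at a flat per-base price."""
    return GENOME_BASES * cents_per_base / 100

for label, rate in RATES_PER_DAY.items():
    print(f"{label:32s} {years_to_sequence(rate):10,.1f} years")

for label, cents in COSTS_CENTS.items():
    print(f"{label:32s} ${total_cost_dollars(cents):,.0f}")
```

At 1 million bases per day, a single instrument would need a little over eight years, in line with the 10-year estimate. At $1 per base the total is $3 billion, the same figure as the upper cost estimate for the project; at one cent per base it falls to $30 million.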

But scientists cannot simply jump in and start sequencing each chromosome at one end, said Cassandra Smith of Columbia University in New York. “To study chromosomes, you have to break them up into small pieces that can be manipulated in the laboratory,” she explained. Researchers thus must begin by making a physical map of the genome.

Smith and Charles Cantor of Columbia are mapping chromosome 21, the smallest human chromosome and hence the easiest to work with. Among other things, this chromosome contains the genes for several interferons, proteins that help fight off viruses.

To map a chromosome, researchers begin with perhaps 100 million cells--about an ounce. These can be white blood cells, cultured skin cells or placental cells. The cell membranes are ruptured with detergents and enzymes to free the chromosomes, which are then separated with a laser-based sorter.

The researchers use special enzymes to break the chromosome into a small number of pieces, perhaps 50. Using genetic engineering techniques, each of these fragments is inserted into the DNA of a yeast cell, producing 50 separate cell lines. When the yeast reproduce, they generate large amounts of the fragment; they also provide a “handle” for manipulating the fragments.

The researchers use sophisticated matching techniques to determine where each fragment is located on the chromosome.

This process must be repeated on each of the fragments, at least once, producing ever-smaller fragments. “The idea is to divide and conquer,” Smith said.

The map then is a collection of yeast cultures containing DNA fragments of characteristic sizes, as well as a guide to where the fragments are located.
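One way to picture how such a map is used is as an ordered list of fragment sizes that can be searched in two rounds, mirroring the "divide and conquer" approach. The sketch below is purely illustrative: the fragment sizes and the target position are hypothetical, and it assumes the order of the fragments is already known, which in the laboratory is the hard part established by the matching techniques.

```python
from bisect import bisect_right
from itertools import accumulate

def build_map(fragment_sizes):
    """Start offset of each fragment, given sizes listed in chromosome order."""
    return [0] + list(accumulate(fragment_sizes))[:-1]

def locate(starts, position):
    """Index of the fragment that contains `position`."""
    return bisect_right(starts, position) - 1

# First round: a chromosome region cut into large fragments (hypothetical sizes, in bases).
top_sizes = [900_000, 1_200_000, 700_000, 1_100_000]
top_starts = build_map(top_sizes)

target = 2_250_000                        # hypothetical position of a gene of interest
top_index = locate(top_starts, target)    # which large fragment holds it
offset = target - top_starts[top_index]   # position within that fragment

# Second round: that fragment cut again into smaller pieces (hypothetical sizes).
sub_sizes = [100_000, 250_000, 350_000]   # sums to the 700,000-base parent
sub_starts = build_map(sub_sizes)
sub_index = locate(sub_starts, offset)

print(f"Large fragment {top_index}, sub-fragment {sub_index}, offset {offset:,}")
```

Each round narrows the search to a smaller piece of DNA, which is why a finished map speeds the hunt for any gene on the chromosome.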

Scientists at Lawrence Berkeley Laboratory have already used this procedure to map chromosome 16, which carries, among other genes, those for some forms of kidney disease and leukemia. Researchers can use this information in the search for other genes on the chromosome.

Berkeley scientists also are working on chromosome 19, which carries genes that repair DNA damaged by radiation.

On Oct. 7, researchers from the Massachusetts Institute of Technology and Collaborative Research Inc. in Bedford, Mass., announced that they had made a map composed of 404 pieces of DNA, called genetic markers, scattered more or less evenly across the genome. The map will be useful for identifying genes linked to diseases--but not as useful, they conceded, as a physical map because it is not as detailed.

But a flap followed the announcement, suggesting that the harmony among biologists that has led to rapid progress in mapping the genome may be giving way to increased competitiveness. Utah’s White, for example, argued that Collaborative’s map has too little detail and too many gaps to be considered the first genome map and that his own map is better.

Last month, the U.S. Department of Energy directed its labs in Berkeley and in Livermore to expand their efforts to map the genome and its Los Alamos National Laboratory in New Mexico to develop improved sequencing technology and data handling techniques.

DOE’s immediate goal is to have a physical map of the entire genome within five years, said David Smith, head of the department’s office of health and environmental research. That will require nearly doubling its mapping budget to about $20 million annually, he added.

“We are not proposing that we (sequence the genome),” Smith said, “but that we create the resources and technologies that would allow this to be done.”

Some scientists would prefer that the National Institutes of Health, with its long history of funding biological and medical research, take the lead in the Genome Initiative. But others note that NIH already does not have enough money to fund all worthy research. “We don’t want NIH . . . running this,” said Gilbert, because “their funds for other research would be cut back.”

Biologists want Congress to appropriate more money specifically for a sequencing project, but so far Congress has not shown an inclination to do so.

Ayala argues that a sequencing project is unnecessary and that mapping will provide all the benefits promised by sequencing enthusiasts at perhaps a tenth of the cost.

Mapping efforts, in fact, have already led to the identification of DNA fragments linked with specific diseases, such as Huntington’s disease, cystic fibrosis, Alzheimer’s disease, Duchenne muscular dystrophy, manic-depressive illness and other disorders.

These “genetic markers” can be used in prenatal screening for these diseases. The markers also point the way to identification of the specific genes and the biochemical defect involved, which may in turn lead to therapies for the diseases.

Even if it would be useful to have the sequence of every gene, Ayala added, those genes account for only 10% of the genome. Sequencing the other 90%, he said, “is not likely to provide meaningful insights.”

But Nobel laureate Paul Berg of Stanford disagrees. “I call it arrogance to assume we already know what is junk and what is real. My premise is . . . that information which now appears to be undecipherable will, in fact, have substantial meaning in terms of understanding how the human genome works.”

Some scientists also fear that information developed as a result of the initiative could be used inappropriately.

“If an insurance company finds out that you have a predisposition to a heart attack or cancer at age 50, will they insure you?” asked Stanford computer analyst Douglas Brutlag, who is developing computer systems for handling genetic information and expresses a widely shared concern. “Or will they increase your rate? Who has a right to know your genetic makeup? That’s a tough question.”

Nonetheless, said Brutlag: “We will sequence the human genome. Society should start thinking now about how to handle the information.”

BUILDING A BLUEPRINT OF HUMAN GENETICS

Geneticists are proposing the largest single project in the history of biology, the Human Genome Initiative. The project could involve hundreds of scientists over the next 10 or 20 years, and cost as much as $3 billion. It is aimed at identifying the 3 billion chemicals that make up the complete genetic complement of humans--the human genome. Such knowledge could give insight into the causes of more than 3,500 inherited diseases as well as cancer, diabetes and heart disease.

To understand the different stages of the project, it is helpful to use geographical mapping as an analogy. There is disagreement among biologists about how finely detailed a map is required.

1. Looking for a single gene inside a cell is like looking for a street address in Los Angeles on a globe of the world. It is difficult even to definitively locate the area. Scientists use enzymes and detergents to split open 100 million cells at a time, freeing the chromosomes. Laser-based sorters separate chromosomes.

2. The chromosome is broken into about 50 fragments. Looking at these fragments is like trying to find the street address on a map that shows only states; it gets you only a little closer. Each fragment is copied into a yeast cell, making it easier to work with. Matching techniques are used to determine the order of fragments on the chromosome.

3. The fragment is broken down further in a process called “mapping.” Using the information now available is like looking at a map showing the cities within a state to find the street address--there is much extraneous material, but it is much easier to locate the area in which the address can be found. These smaller units are separated and copied into yeast to determine the order within the fragments.

4. Using automated analyzers, researchers determine the position and identity of each chemical within every fragment. They then know the sequence of the whole genome. This process is like making detailed street maps of each city, then combining them to make a street map of the entire country. This shows the location of a street address, but the amount of data accumulated is so massive that it can be interpreted only with large computers. And even with improvements in technology, the process is tedious and expensive.

SEQUENCING

The genetic code is written in an alphabet of just four chemical bases that make up DNA: thymine (T), adenine (A), cytosine (C) and guanine (G).

MAPPING

Once the fragments are separated, researchers can determine the order in which they appear on the chromosome. This collection of fragments becomes a “physical map” that can be used to locate a defective gene.
