Responding to an eruption in the science of human genetics, pharmaceutical and biotechnology companies are turning to the computer to mine a mountain of information in their search for new treatments for disease.
These companies are using sophisticated software to identify the genes and proteins responsible for a variety of human ills. One biotech firm found a rare gene defect that causes premature aging and early death. Another found a new way to treat patients in the final stages of heart failure. One company is sifting through human genes to find those that play a role in adult diabetes. Others are searching through catalogs of human enzymes to find promising drug targets to fight cancer and arthritis.
Many of these companies have relied on free, publicly available computer programs for sorting through huge libraries of electronic data, or have developed specialized software of their own. But the pace of discovery has accelerated, and many are looking outside to buy software packages that can make sense of the growing mass of information.
In recent months, that's meant a booming business for software firms that have mastered developments in genetic engineering--with several licensing their programs to major drug companies.
The application of computers to genetics has been spurred by the Human Genome Project--the multibillion-dollar attempt to provide a detailed map of human heredity by 2005. Together with commercial research efforts that may be moving even faster, the federally funded project aims to spell out all 3 billion genetic letters that make up an individual's DNA.
That's enough information to fill a thousand phone books of a thousand pages each. Without computers, the pages would be largely gibberish--page after page written out in the four letters of the genetic code--GCTA--with no spaces between the words and little in the way of punctuation. No solitary researcher, working with notebook and pencil at his laboratory bench, could glean much of value from such a genetic labyrinth.
"The goal is to find the genes among a gigantic library of books," said Pierre Baldi, chairman of Net-ID, a small genetics software company with offices in Los Angeles and San Francisco. "With the naked eye, you cannot make any sense out of it."
This melding of biology and computer science has been awkwardly christened "bioinformatics."
Many of the firms that are building the new software look more like high-tech start-ups than biotech laboratories. Typically they have assembled teams that include biologists and computer programmers, as well as those rare individuals who can move with comfort between the two worlds.
"You need people who are trained across disciplines, who know biology and know computing," said Thomas G. Marr, president and scientific founder of Genomica Corp., in Boulder, Colo., which last month licensed its software to drug giant Glaxo Wellcome. Marr is the prototype of the new computer-geneticist--he has degrees in systems engineering and biology.
A Gargantuan Task
Putting together all the fragmentary knowledge of human genetics "couldn't be done without the computer," Marr said. "It was too big a job; there were too many things that need to be compared."
Without sophisticated computer software, "it's like emptying the ocean with an eye dropper," said Joel Bellenson, one of the co-founders of Pangea Systems of Oakland, which in July licensed its systems to Eli Lilly & Co. "We're trying to provide [our customers] with aqueducts and pipelines."
At one level, the software provides a convenient way of organizing and retrieving data from a variety of sources--like a computerized catalog that lists the contents of a large number of libraries.
But the software has become much more ambitious than that by using tricks developed for military targeting and voice recognition. To determine a gene's function, it can search out similar genes that have been identified in a variety of species. Or it can identify a series of genes that work together inside specialized human tissues such as the liver, the brain or the pancreas. Or it can zero in on differences between normal cells and those that have been transformed by cancer.
And the most ambitious software attempts to predict the three-dimensional shape of the proteins produced by genes and then identify potential drugs that could fit into the spaces of the larger molecules and block their action.
At Metabolex Inc., a biotech company in Hayward, scientists are using the Pangea software to seek out the many genes involved in adult diabetes. The plan is to find genes activated in normal cells, compare them with those found in diseased cells and eventually design drugs that will treat the diabetes.
"We're moving from . . . one postdoctoral fellow working on a single gene, to looking at what happens to 20,000 genes," said John Blume, who heads genomics research at Metabolex. "When you're working on something smaller, you could do it with paper and pencil at your desk, but with an organism as large, as immense as the human genome, with 3 billion base pairs and a huge number of genes, you can't do this."
At the Bayer Corp., scientists used special software to search public and private data banks for the human counterpart to a Bayer drug called Trasylol, which is derived from animal tissue. The animal protein is used to reduce bleeding during open-heart surgery. The newly discovered human compound, named Bikunin, appears to have a number of functions, and Bayer believes it shows promise as an anti-cancer drug.
With considerable luck and painstaking work, it might have taken "10 years or more to find this molecule, and that's with intensive care, NIH grants and 25 post-docs," said Senior Vice President Wolf-Dieter Busse at the Bayer research facility in Oakland. "Now one person finds the molecule and finds it in a couple of months."
In 1996, researchers at Seattle-based Darwin Molecular Corp., now Chirosciences R & D Inc., isolated a single gene from a family suffering from Werner's syndrome. The rare disorder results in short stature, premature aging and a shortened life span. "By age 15, they still look quite normal," said Chirosciences President and Chief Scientific Officer David J. Galas. "By the age of 40, they look like they're 80. They have gray and wrinkled skin."
Other scientists had already zeroed in on a sizable region of chromosome 8, one of the 24 different chromosomes that make up the human genome. To pick out the gene itself--a stretch of several thousand genetic letters in a region of DNA containing about a million--the company "just started marching from one end to the other," said Galas, who, as director of health and environmental research at the Department of Energy, was in charge of the Human Genome Project in the early 1990s.
Designing New Drugs
The Darwin scientists, using statistical analyses to check out various locations, identified several genes in the suspect region and examined them for mutations. The defective gene they found, Galas said, was the first ever implicated in human aging. The healthy gene produces an enzyme, one member of a family of proteins that unwind the DNA helix. A year ago, Geron Corp. of Menlo Park licensed the discovery in the hope of finding treatments for age-related diseases.
Using its own software, San Diego-based ImmunoPharmaceutics focused on a naturally occurring human chemical called endothelin, which causes contraction of blood vessels in the lungs of congestive heart failure patients. The computer program helped the company design a drug to block those effects.
The company was bought out by Texas Biotechnology Corp., which is continuing tests of an improved version of the drug. Early results in 48 patients have been positive, said company spokeswoman Pamela Murphy.
Former ImmunoPharmaceutics executive Edward T. Maggio has started his own company, Structural Bioinformatics, which specializes in the design of chemicals for drug companies. Using supercomputers, such as the IBM machine that beat chess champion Garry Kasparov last year, the company claims to have greatly speeded up the process of drug discovery.
Speed is becoming an increasingly important factor for drug companies, said Manuel J. Glynias, president and CEO of NetGenics Inc., a bioinformatics company in Cleveland. He said NetGenics landed a major account by tackling a problem that took the client's computer group "three or four hours, but we were able to do it in eight minutes using our software." Among the company's customers is Abbott Laboratories.
Several of the bioinformatics companies do basic lab research of their own--developing libraries of gene sequences and identifying proteins that would make good drug targets for licensing to drug manufacturers. Gene Logic, which has facilities in Berkeley and Gaithersburg, Md., developed computer programs for its own use. Selling the software has become an independent business, said Gene Logic President Michael J. Brennan.
For large pharmaceutical firms "it makes more sense to buy data management and data integration tools rather than build it themselves," Brennan said. The firm's client list includes SmithKline Beecham and Procter & Gamble Pharmaceuticals.
Two of the largest bioinformatics companies also employ automated technology to find genes, sequence them and then sell the resulting libraries of gene sequences to pharmaceutical and biotechnology firms.
Connecticut-based Perkin-Elmer Corp. has formed a new company, Rockville, Md.-based Celera Genomics. Celera officials say they plan to produce a complete library of human genes within three years.
Incyte Pharmaceuticals in Palo Alto also offers software and access to its library of human gene sequences. Incyte CEO Roy A. Whitfield said his company intends to complete the job at an even faster pace--by the end of next year.
The specialized software and high-speed sequencing of genes are at the center of what Whitfield calls the "industrialization of medical research."
And with the explosion of raw information, he said, big drug companies will find themselves spending more on computer hardware and software than they do on gene sequencing.
Times staff writer Paul Jacobs can be reached via e-mail at email@example.com.
(BEGIN TEXT OF INFOBOX / INFOGRAPHIC)
How Computers Aid in Gene Research and Drug Discovery
1. Sequencing Genes: Today automated machines do the laborious job of spelling out, letter by letter, millions of tiny overlapping DNA fragments that make up the human genome. Computer software is needed to reject obvious misspellings, identify overlaps and piece together the bits of code in their proper order.
2. Comparing Diseased and Normal Tissue: Researchers test both the normal and diseased tissue for the presence of thousands of genes, and the software sorts out which genes are activated in the diseased cells.
3. Hunting for Proteins: By translating the DNA code, software programs describe the chemical makeup of a target protein produced by a newly discovered gene. Next, the software combs through databases to find similar substances and make intelligent guesses about the purpose of the newly discovered molecule.
4. Identifying Promising Drugs: Highly sophisticated programs try to visualize how a protein produced by a gene looks in three dimensions and then search for small molecules that might stick to the surface of the protein or fit into crevices in order to block the protein's action.
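For readers who want a feel for what such software does, the first three steps above can be sketched in a few lines of Python. This is a toy illustration only, not any company's actual method: the function names, the tiny codon table, the sample fragments and the expression numbers are all hypothetical, and the sketch skips error correction, database searching and the three-dimensional modeling of step 4.

```python
# Toy illustrations of steps 1-3. All names and data are hypothetical.

# Partial codon table (standard genetic code), just enough for the example.
CODON_TABLE = {
    "ATG": "M", "GCT": "A", "GGA": "G", "TGC": "C", "AAA": "K",
    "TAA": "*", "TAG": "*", "TGA": "*",
}


def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def assemble(fragments, min_len=3):
    """Step 1 (simplified): greedily merge the pair with the largest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates
    while len(frags) > 1:
        best = (0, None, None)
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            break  # nothing left to merge; keep the pieces separate
        merged = frags[i] + frags[j][n:]
        frags = [f for k, f in enumerate(frags) if k not in (i, j)] + [merged]
    return frags


def activated_in_disease(normal, diseased, fold=2.0):
    """Step 2 (simplified): genes at least `fold` times more active in disease."""
    return sorted(
        gene for gene, level in diseased.items()
        if level > 0 and level >= fold * normal.get(gene, 0.0)
    )


def translate(dna):
    """Step 3 (first half): read DNA three letters at a time into amino acids."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino = CODON_TABLE.get(dna[i:i + 3], "X")  # X = not in the toy table
        if amino == "*":  # stop codon ends the protein
            break
        protein.append(amino)
    return "".join(protein)


# Made-up overlapping fragments, deliberately out of order.
fragments = ["ATGCAAATAA", "ATGGCTGGA", "GCTGGATGCA"]
contigs = assemble(fragments)
protein = translate(contigs[0])

# Made-up expression levels for three hypothetical genes.
normal = {"GENE_A": 10.0, "GENE_B": 5.0, "GENE_C": 0.0}
diseased = {"GENE_A": 11.0, "GENE_B": 20.0, "GENE_C": 3.0}
flagged = activated_in_disease(normal, diseased)
```

Greedy merging is used here only because it is easy to follow; it can go wrong on repetitive DNA, and real sequence assembly and expression analysis rely on far more careful statistics.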