X Prize Foundation takes on the human genome


The X Prize Foundation, which offers monetary awards for solutions to pressing scientific challenges, has tackled space travel, moon missions and oil spill cleanups. Now it’s taking on the human genome. The Archon Genomics X Prize presented by Medco is challenging teams to accurately sequence the DNA of 100 centenarians within 30 days at $1,000 or less per genome.

The first team to complete the task successfully will receive $10 million, and the sequenced genomes will be published for use in research.

J. Craig Venter, a genome-sequencing pioneer and a principal player in the successful push to sequence the human genome, is co-chairman of the competition, which will begin in January 2013.


Venter, founder and president of the J. Craig Venter Institute, spoke with The Times about why modern medicine needs an affordable “medical grade” genome, and how the X Prize could speed its development.

How long have you been involved with the X Prize, and why did you get involved?

I originated this prize in 2003, but it was only a $500,000 prize.

After we finished sequencing the first draft of the human genome, pundits were saying there was no point going forward in DNA sequencing because the human genome was already sequenced. They missed the point: That was the start, not the end. The only way it would meaningfully improve medicine was if we had a very rapid, accurate method of DNA sequencing.

So we started the prize as a way to encourage people to invent new sequencing methods.

The X Prize Foundation asked me to consider merging the prizes, which I thought was a great idea. Participants were motivated by the half-million-dollar prize, so I can only assume they will be even more motivated by the $10-million version.

What is a medical-grade genome?

It’s a term we sort of invented. Most people don’t know that the existing sequencing technologies, these faster and cheaper methods, have lower accuracy than we would need for true diagnostic sequencing.


To be a good diagnostic tool, it can’t have false positives or false negatives. You want it to be highly accurate, meaningful information if you’re going to direct people’s health plans to prevent diseases that they might be subject to.

None of the technology is there yet, with the exception of the Sanger sequencing technique, which was used for my own genome in 2007. But it’s very expensive. We need new technologies to improve the accuracy and lower the cost.

Who would the competitors be — people at universities, or at companies?

A little bit of both. Obviously there are companies that make and sell DNA-sequencing instruments. In my view, none of the existing technologies would be likely to win.

But at universities and fledgling startups, people have invented some pretty awesome-sounding solutions that would be faster, cost less and produce better-quality information.

Like what?


It’s a combination of mathematical approaches and technology. Today, you can get your genome sequenced for $4,000 or $5,000. The problem is, if you have your genome sequenced with any two technologies, they don’t completely agree on your genome sequence. That is the problem for the diagnostic part of this.

Different technologies have different problems. Some have trouble getting through unusual repetitive sequences in DNA. Others, for some reason — we don’t know why — will occasionally drop out a letter of the genetic code. That’s perhaps the most problematic.

Hopefully these newer technologies will have a more solvable set of problems. They almost seem like science fiction, because that’s what they would have been a few years ago.

The chunks of DNA that the relatively inexpensive current technologies analyze are short. Sometimes you can only get 100 letters of genetic code. It’s like shredding the Sunday L.A. Times: If you tear it into fairly large pieces, it isn’t that hard to sit down and eventually reconstruct the newspaper. But if you shred it into little tiny pieces, where you can’t see what the neighboring pieces are, it would be orders of magnitude more difficult to reassemble.

That’s sort of the issue with reading DNA. When you can read long pieces it makes the reassembly of the right sequence much more plausible than if it’s shredded into little tiny bits. A lot of the new technology has been going toward getting much longer reads, some in the multiple thousands of letters, which would be really phenomenal.
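The shredded-newspaper point can be illustrated with a toy calculation. The sketch below is only an illustration under stated assumptions, not how any sequencing instrument or assembler works: the function name `ambiguous_fraction` and the 14-letter `TOY_GENOME` are made up for the example. It cuts a sequence into every possible read of a given length and reports the share of reads whose text appears at more than one position, meaning a reader could not tell where they belong.

```python
from collections import Counter


def ambiguous_fraction(genome: str, read_len: int) -> float:
    """Share of reads of length read_len that occur at more than one position."""
    # Every overlapping window of the sequence stands in for one "read".
    reads = [genome[i:i + read_len] for i in range(len(genome) - read_len + 1)]
    counts = Counter(reads)
    # A read is ambiguous if identical text shows up elsewhere in the sequence.
    ambiguous = sum(1 for read in reads if counts[read] > 1)
    return ambiguous / len(reads)


# Hypothetical toy sequence: the stretch "ACGT" appears twice with different neighbors.
TOY_GENOME = "TTACGTGGACGTCC"

for read_len in (3, 5):
    print(read_len, ambiguous_fraction(TOY_GENOME, read_len))
```

In this toy case a third of the 3-letter reads cannot be placed, while every 5-letter read is unique because it reaches past the repeated stretch into its neighbors. Real genomes contain repeats that are thousands of letters long, which is why reads of comparable length matter.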

How will the foundation judge the teams?


They’ll have to sequence 100 genomes within 30 days, and they’ll have to do it accurately. It’s not clear that there have been 100 genomes sequenced yet today — certainly there are not that many in the scientific literature.

Our sponsor, Medco, is supporting the validation process, which will determine who has the most accurate and complete sequences.

One of the important elements of this is something called haplotype phasing. We all have two sets of chromosomes, one from our mothers and one from our fathers. Haplotype phasing is accurately saying, here’s the DNA you got from your mother, and here’s the DNA you got from your father.
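One way to picture haplotype phasing is the case where both parents' DNA is also available. The sketch below is a simplified illustration of that idea, often called trio phasing, and is an assumption made for this example rather than the method the competition will use or validate. The name `phase_site` and the single-letter genotypes are hypothetical. At one position in the genome, it takes the child's two letters and each parent's two letters and works out which letter came from which parent, when that can be determined.

```python
from typing import Optional, Tuple

Genotype = Tuple[str, str]  # the two letters a person carries at one position


def phase_site(child: Genotype, mother: Genotype, father: Genotype) -> Optional[Tuple[str, str]]:
    """Return (letter from mother, letter from father), or None if it cannot be decided."""
    a, b = child
    candidates = set()
    # Possibility 1: the first letter is maternal, the second paternal.
    if a in mother and b in father:
        candidates.add((a, b))
    # Possibility 2: the other way around.
    if b in mother and a in father:
        candidates.add((b, a))
    # Only a single consistent assignment counts as phased.
    return candidates.pop() if len(candidates) == 1 else None


print(phase_site(("A", "G"), ("A", "A"), ("G", "G")))  # parents differ: decidable
print(phase_site(("A", "G"), ("A", "G"), ("A", "G")))  # all three identical: undecidable
```

When everyone in the family carries the same two letters, the position cannot be resolved this way. Resolving such positions without parental DNA requires reading long enough stretches to link neighboring positions on the same chromosome.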

Most people have asked these kinds of questions at some stage of their life: Did I get this trait from my mother or from my father? We’ll be able to answer those questions very precisely.

My father died at a young age, 59, of sudden cardiac death. My mother is 88 and still plays golf. I would kind of want to know who I got my key longevity genes from [laughs]. I don’t think any of the sequencing companies currently are able to do that.

Why sequence the genomes of centenarians?


The DNA sequence itself is only one component. Without associated information about traits — did your mother have this disease or did your father have it? — genetic information is meaningless.

Looking at centenarians is one way of saying, here’s 100 people with a very uniquely defined trait: remarkable old age. Obviously they have pretty good genes if they’ve lived to be 100.

This type of analysis is a critical part of the future. The reason why we need the cost to be low and the accuracy high is that we need to do about 10,000 human genomes as a base set and have it paired with people’s corresponding health information.

We want to do a search through the entire genome to understand the correlation between health and wellness and disease.

Is 30 days long enough to sequence 100 genomes?

Well, it needs to be, right? If we’re going to do tens of thousands and you can’t do 100 in 30 days, you’re not really providing a technology that contributes to the solution.


Ten years from now, where do you see this field?

I think having your genome sequenced will be such an important part of medicine.

One of the big advances right now is looking at the genetic changes in cancer. Now we know that there are cancer drugs that work very effectively in people with certain genetic mutations. One of the first things you will want to know if you have cancer will be if you have any of these genetic mutations.

It will be fast and cheap and you’ll have it done several times, because cancer is caused by accumulated genetic change. The DNA in your tumor could be very different than your core cell DNA, and it could continue to change over time.

This interview was edited for length and clarity.