By Samuel W. Anderson

You can get your entire genome sequenced for a few thousand dollars, but the information you get back will be worthless.

All right, "worthless" may be a bit misleading. Your whole genome sequence has the potential to bring you real, tangible benefits - even, in some cases, the life-saving sort - but it's going to take some additional work beyond just getting the code. Quite a lot of additional work, it turns out.

If you've seen The Matrix, you might remember the characters watching those green strings of code trickling down their computer screens. The code is essentially a readout of the inner workings - the genome, if you will - of "the matrix" (a virtual cyber-world where everyone does kung fu, for those of you who haven't seen the movie). A few of the characters are savvy enough to know in great detail what's going on in the matrix just by reading those bits of raw code on a computer screen.

The human genome isn't like that. You're never going to be able to eyeball your raw genomic code - over three billion pairs of A's and T's and G's and C's - and have any idea what the heck it says about you. You're going to need some help with this.

Because we tend to talk about whole genome sequencing casually or conceptually, it's easy to forget just how big the human genome is. I did it in just the last paragraph, in fact. There it is, quietly nestled between hyphens: three billion base pairs.

How big is that, really? If you are so enthralled by this issue of GeneWatch that you find yourself reading it from cover to cover, you will have read around 135,000 letters, numbers, and punctuation marks. In order to get to three billion characters, you would need to read another 22,222 issues, or about 800,000 pages.
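For anyone who wants to check the arithmetic, here is a rough back-of-the-envelope sketch. The 135,000 characters per issue and the three billion base pairs come from the paragraph above; the figure of 36 pages per issue is an assumption, inferred only from the ratio of the page count to the issue count, and the variable names are purely illustrative.

```python
# Back-of-the-envelope check of the "magazine issues" comparison.
GENOME_BASE_PAIRS = 3_000_000_000   # roughly three billion base pairs
CHARS_PER_ISSUE = 135_000           # characters in one issue, per the text
PAGES_PER_ISSUE = 36                # ASSUMPTION: not stated in the article

# How many issues' worth of characters make up one genome?
issues = GENOME_BASE_PAIRS / CHARS_PER_ISSUE

# How many pages is that, under the assumed page count per issue?
pages = GENOME_BASE_PAIRS * PAGES_PER_ISSUE / CHARS_PER_ISSUE

print(f"{issues:,.0f} issues, about {pages:,.0f} pages")
```

Under those assumptions the numbers come out to roughly 22,000 issues and 800,000 pages, which matches the estimate above.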

Now let's say this magazine is published in Slovak (and let's assume you don't speak Slovak). In order to make sense of those three billion characters, you'll need some sort of key, like an English-Slovak dictionary. Better yet, you could just plug the whole thing into Google Translate. There will be a few minor errors, but you'll get the gist of it.

Once again, though, the human genome doesn't really work this way. Its alphabet may only be four letters, but Genomese is an immense and terribly complex language. The dictionary is still in development and may never actually be complete. We can identify nearly all of the letters on those 800,000 pages with impressive accuracy, and we know where to find certain scraps of important information; the rest is still Slovak to us.

