Wednesday 26 December 2012

A decade-long, $288 million study, reported this year in more than 30 papers, showed the human genome to be quite a bustling place, biochemically speaking.

The work—called the Encyclopedia of DNA Elements (ENCODE)—builds on the Human Genome Project, which deciphered the order of the bases that are our DNA’s building blocks and found that less than 2% of those bases defined genes.

ENCODE researchers took an intensive look not just at genes but at all of the DNA in between.

Their results drive home that much of the genome that at one time was dismissed as “junk DNA” actually seems to play an essential role, often by helping to turn genes on or off.

They pinpointed hundreds of thousands of landing spots for proteins that influence gene activity, many thousands of stretches of DNA that code for different types of RNA, and lots of places where chemical modifications serve to silence stretches of our chromosomes, concluding that 80% of the genome was biochemically active.

These details provide a much better road map for investigators trying to understand how genes are controlled.

Some researchers have already used these insights to clarify genetic risk factors for a variety of diseases, including multiple sclerosis and Crohn’s disease.

When these papers were published in September 2012, the media went wild. ENCODE was hailed in The New York Times as a “stunning resource” and “a major medical and scientific breakthrough” with enormous and immediate implications for human health.

The Guardian called it “the most significant shift in scientists’ understanding of the way our DNA operates since the sequencing of the human genome.”

But several scientists in the blogosphere called the coverage overhyped and blamed the journals and ENCODE leaders for overplaying the significance of the results.

For example, ENCODE reported that 76% of DNA is transcribed to RNA, most of which does not go on to help make proteins.

Various RNAs home in on different cell compartments, as if they have fixed addresses where they operate, suggesting that they play a role in the cell.

Critics argue, however, that it was already known that a lot of RNA was made, and that many of these RNAs may be spurious genome products that serve no purpose.

Likewise, one ENCODE researcher found 3.9 million regions across 349 types of cells where proteins called transcription factors bind to the genome—but again, it’s unclear how much of that binding is functional.

Nonetheless, ENCODE stands out as an important achievement that should ease the way for more insights into the genome.

By combining these data with sampling from another data-intensive effort, the 1000 Genomes Project, researchers discovered that 8% of our DNA appears with little variation throughout the human population—a strong sign that it was important for our evolution.

Overall, ENCODE’s newly discovered functional regions overlap with 12% of the specific DNA bases linked to higher or lower risks of various diseases, suggesting that the regulation of genes—not just the makeup of the genes themselves—might be at the heart of these risks.
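An overlap figure like the one above comes down to asking what fraction of disease-associated positions fall inside annotated regions. The sketch below illustrates that arithmetic on made-up data. It is not ENCODE's actual pipeline. The function names, the single-chromosome simplification, the half-open coordinate convention, and every coordinate in the example are assumptions chosen for illustration.

```python
from bisect import bisect_right


def merge_intervals(intervals):
    """Merge overlapping half-open [start, end) intervals into a sorted list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(iv) for iv in merged]


def fraction_in_regions(positions, regions):
    """Return the fraction of positions lying inside any region.

    Assumes all positions and regions are on the same chromosome
    (a simplification made for this sketch).
    """
    merged = merge_intervals(regions)
    starts = [start for start, _ in merged]
    hits = 0
    for pos in positions:
        # Find the last merged region that starts at or before pos.
        i = bisect_right(starts, pos) - 1
        if i >= 0 and pos < merged[i][1]:
            hits += 1
    return hits / len(positions) if positions else 0.0


# Hypothetical toy data: three annotated regions and eight variant positions.
regions = [(100, 200), (150, 300), (1000, 1100)]
variants = [50, 120, 250, 300, 1050, 5000, 7000, 9000]

fraction = fraction_in_regions(variants, regions)
print(f"{fraction:.1%} of variants fall in annotated regions")
```

In this toy example, three of the eight positions land inside a region. Real analyses work across all chromosomes and must also account for variants that are merely inherited together with the causal one, which this sketch ignores.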

Scientists have used this information to home in on relevant genes and cell types in several disorders.

Experiments can now unearth the molecular basis of these connections and, from there, identify potential treatments.

If that potential is realized, then ENCODE will have earned its accolades as a “stunning resource.”


A. Jha, “Breakthrough Study Overturns Theory of ‘Junk DNA’ in Genome,” The Guardian (5 September 2012).

E. Pennisi, “ENCODE Project Writes Eulogy for Junk DNA,” Science 337, 1159 (7 September 2012).

ENCODE Project Consortium, “An Integrated Encyclopedia of DNA Elements in the Human Genome,” Nature 489, 57 (6 September 2012).

G. Kolata, “Bits of Mystery DNA, Far From ‘Junk,’ Play Crucial Role,” The New York Times (5 September 2012).

L. Ward and M. Kellis, “Evidence of Abundant Purifying Selection in Humans for Recently Acquired Regulatory Functions,” Science 337, 1675 (28 September 2012).

M. T. Maurano et al., “Systematic Localization of Common Disease-Associated Variation in Regulatory DNA,” Science 337, 1190 (7 September 2012).