Headline: Researchers from Microsoft and the University of Washington have set a new record, managing to store and retrieve 200 MB of data on strands of DNA.
In today’s technological age, enormous amounts of data are being produced: the EMC Digital Universe Study expects there to be 44 trillion GB by 2020. Current storage devices, such as hard drives and even optical media like DVDs, do not last more than a few decades before becoming unreliable. Additionally, storage devices have become smaller and smaller with smartphones and microchips, and as such a lot of research has gone into optimizing capacity and longevity.
DNA is an incredibly dense storage medium, potentially squeezing in 5.5 petabits (roughly 687,500 GB!) of information per cubic millimeter. By that measure, according to University of Washington professor Luis Ceze, all 700 exabytes of today’s accessible internet would fit into a space the size of a shoebox. That is pretty exciting because, as of now, there is not enough space to store all the data that exists, not to mention the vast amounts of genomic and bioinformatic data and video content being created every day.
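For a sense of scale, the petabit figure can be converted to gigabytes with a little arithmetic. This is a minimal sketch, assuming decimal (SI) units, where 1 petabit is 10^15 bits and 1 GB is 10^9 bytes; the function name is made up for illustration.

```python
BITS_PER_PETABIT = 10**15   # SI prefix: peta = 10^15 (assumption: decimal units)
BITS_PER_BYTE = 8
BYTES_PER_GB = 10**9        # decimal gigabyte, not GiB

def petabits_to_gigabytes(petabits):
    """Convert a quantity in petabits to decimal gigabytes."""
    total_bytes = petabits * BITS_PER_PETABIT / BITS_PER_BYTE
    return total_bytes / BYTES_PER_GB

density_gb = petabits_to_gigabytes(5.5)
print(f"{density_gb:,.0f} GB per cubic millimeter")
```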
The DNA in each human cell, stretched out, would be about 2 meters (roughly 6 feet) in length. With the ten trillion cells in one individual, the total amount of DNA would reach to the sun and back many times over. Because DNA is very durable, compact, and has accurately held and stored information since the dawn of life, it is a very promising option for archiving data.
Typically, the information in DNA is transcribed from nucleotide bases (A, T, C, G) into single-stranded RNA (A, U, C, G), which is then translated into proteins. This is called the central dogma of molecular biology. In biological systems, these proteins are responsible for everything from eye color to the enzymes that are part of the citric acid cycle (metabolism). The body has special machinery for reading this information (e.g., RNA polymerase, ribosomes, tRNA), but how does this process translate into coding and storing digital information?
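As a rough illustration of the biological readout, transcription and translation can be sketched in a few lines. This is a toy model, assuming the input is the coding strand and is read in frame from its first base; the function names are made up for illustration, and the lookup table is the standard genetic code.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... ('*' = stop).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def transcribe(dna):
    """Coding-strand DNA -> mRNA: same sequence with T replaced by U."""
    return dna.upper().replace("T", "U")

def translate(rna):
    """Read codons three bases at a time, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

rna = transcribe("ATGGCCTAA")
print(rna, translate(rna))  # AUGGCCUAA MA
```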
In DNA, the nucleotide pairs are adenine to thymine and cytosine to guanine. In a simplified view, because there are only two kinds of pair, this translates easily into binary 0 and 1; with four distinct bases, each nucleotide can in principle carry up to two bits. The resulting sequences are synthesized chemically using microarray and silicon technologies. (I am not too interested in this part, but here and here are places that talk about it.) Once made, the DNA is dehydrated and protected from heat and light so that it is no longer reactive. In this form it can be stored for a really long time, depending on the temperature of the freezer. To read the data, the DNA is resuspended and read by a DNA sequencer.
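To make the bits-to-bases idea concrete, here is a minimal sketch that packs two bits into each nucleotide and reads them back. The mapping (00 to A, 01 to C, 10 to G, 11 to T) is an arbitrary assumption for illustration, not the scheme the Microsoft and University of Washington team used, and it leaves out things a real system would need, such as error correction.

```python
# Hypothetical 2-bits-per-base mapping, chosen only for illustration.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

def encode(data: bytes) -> str:
    """Turn bytes into a string of bases, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))

def decode(strand: str) -> bytes:
    """Turn a string of bases back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

strand = encode(b"Hi")
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

Each byte becomes four bases, so under this toy mapping a 200 MB file would need on the order of 800 million nucleotides before any redundancy is added.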
But anyway, what was the 200 MB that they encoded into the DNA? They put in the top 100 books from Project Gutenberg, the Universal Declaration of Human Rights, and an HD music video by OK Go. It’s always interesting what data groups choose to archive, but I think these were really good options that show off what it means to be human: expressive literature, universal ethics, and an intricately engineered Rube Goldberg machine.
I am looking forward to the advances made with DNA storage and how it will be implemented in a more useful way. Having digital files stored in a negative-eighty-degree freezer, requiring thawing, resuspension, and analysis, is not the best way to quickly pull up a reference or a current project. But the fact that 200 MB can be stored inside a test tube, in a speck smaller than a pencil tip, is incredible. I know that DNA has been used in the past to make circuits and calculators, so it will be interesting to see the future of the use of DNA in general.
The reference to the Golden Records is, I think, original. The Golden Records are gold-plated disks on board the Voyager spacecraft, sent in the hope that alien life forms will find them and learn about the human race. Now, instead of gold in space, the data is being stored in DNA and (so far) kept on Earth. I suppose it could be injected into water bears and shot into space to replicate, but that seems a bit absurd. That being said, I am not sure that adding synthetic DNA to an organism wouldn’t mess up its existing biological processes; who knows what kind of protein OK Go’s music video would make. Still, it would be cool to see whether any existing data files happen to appear, in binary, in human DNA.