ResearchGate: What motivated this study?
Yaniv Erlich: Humanity produces data at a faster rate each year, yet progress in traditional data storage technologies has slowed dramatically over the last five years. This means that we need to think about new approaches for data storage.
RG: How does your study fit into this effort?
Erlich: We showed that we can reliably store information on DNA, and that our way of organizing the information approaches “optimal packing,” meaning it is nearly impossible to fit more information on the same amount of DNA material. We stored a film, an operating system, and other types of data on DNA molecules.
RG: How did you achieve this?
Erlich: We mapped the bits of the files to DNA nucleotides. Then, we synthesized these nucleotides and stored the molecules in a test tube. To retrieve the information, we sequenced the molecules. This is the basic process. To pack the information, we devised a strategy, called DNA Fountain, that uses mathematical concepts from coding theory. It was this strategy that allowed us to achieve optimal packing, which was the most challenging aspect of the study.
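As an illustration of the basic mapping step Erlich describes, here is a minimal Python sketch that turns file bytes into a nucleotide string and back. The two-bits-per-base table (00 to A, 01 to C, 10 to G, 11 to T) and the function names `encode` and `decode` are assumptions made for this example and are not taken from the interview. The sketch covers only the bit-to-nucleotide mapping, not the DNA Fountain packing strategy itself.

```python
# Illustrative mapping of two bits to one nucleotide.
# The specific table is an assumption for this sketch, not a detail from the interview.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a nucleotide string, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Invert encode(): read two bits per base and regroup them into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    strand = encode(b"Hi")
    print(strand)          # nucleotide string standing in for the synthesized molecule
    print(decode(strand))  # bytes recovered after "sequencing"
```

With this table, each byte becomes four nucleotides, so the strand is four times as long as the input in bytes. A real pipeline would also need to handle errors in synthesis and sequencing, which this sketch ignores.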
RG: Why did you choose to use DNA?
Erlich: DNA has several big advantages. First, it is much smaller than traditional media. In fact, we showed that we can reach a density of 215 petabytes per gram of DNA! Second, DNA lasts for an extended period of time, over 100 years, which is orders of magnitude more than traditional media. Try to listen to any disk from the 90s, and see if it’s still good. Finally, traditional media suffer from digital obsolescence. My parents have 8 mm tapes that are basically useless now. DNA has been around for 3 billion years, and humanity is unlikely to lose its ability to read these molecules. If it does, we will have much bigger problems than data storage.
RG: How long do you think it will be until DNA storage is available to the general public?
Erlich: I would guess more than a decade. We are still in early days, but it also took magnetic media years of research and development before it became useful.
RG: What other applications do you foresee?
Erlich: DNA is versatile, and molecular biology offers an extensive toolkit to manipulate it. This opens the possibility of using molecular biology tools to assist computing. Usually, it is the other way around!