Bring on DNA hard drives. If we used DNA like we use magnetic tape to store data today, it’s theoretically possible to store all of the information humans have ever recorded in a space roughly the size of a double garage.
Sharing their goals with MIT Technology Review this week, Microsoft Research computer architects say they want to start storing their data on strands of DNA within the next few years, and expect to have an operational storage system using DNA within a data center by the end of the decade.
As antiquated as it seems, one of the best ways to store a lot of
information in a small space right now is good, old-fashioned magnetic
tape. Not only is it cheap, it’s rugged enough to hold information for up to
30 years, and can hold as much as a terabyte of data per roll.
But when we consider that more data has been generated in just the past two
years than in all of human history before that, it seems even magnetic tape might not cut it in the next few decades.
A biological material such as DNA might appear to be an odd choice for
backing up large amounts of digital information, yet its ability to pack
enormous amounts of data in a tiny space has been clear for more than 70 years.
Physicist Erwin Schrödinger’s 1944 suggestion that hereditary information is
stored in an “aperiodic crystal” famously inspired James Watson and Francis Crick to
determine DNA’s helical structure based on the research of Rosalind
Franklin, sparking a revolution in understanding the mechanics of life.
While strings of nucleic acid have been used to cram information into
living cells for billions of years, their role in IT data storage was
demonstrated for the first time just five years ago, when a Harvard
University geneticist encoded his book – including JPG data for illustrations – in just under 55,000
strands of DNA.
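To make the idea of writing digital data in DNA’s four bases concrete, here is a minimal sketch in Python. It assumes the simplest possible mapping, two bits per base, and the names `BITS_TO_BASE`, `encode` and `decode` are made up for illustration. It is not the scheme used by the Harvard or Microsoft teams; real systems also add error correction and avoid sequences that are hard to synthesise or read.

```python
# Illustrative mapping only (an assumption, not any lab's real scheme):
# every pair of bits becomes one of DNA's four bases.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of A/C/G/T, four bases per byte."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Reverse encode(): read two bits per base and regroup into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"DNA")
print(strand)          # CACACATGCAAC
print(decode(strand))  # b'DNA'
```

At two bits per base, each byte takes four bases, so the three-letter message above becomes a 12-base sequence.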
Since then, the technology has progressed to the point where scientists
have been able to record a whopping 215 petabytes (215 million gigabytes) of information on a single gram of DNA.
It might be compact, but recording data in the form of a nucleic acid
sequence isn’t fast. Or cheap.
Last year, Microsoft demonstrated its DNA data storage technology by encoding roughly 200 megabytes of data in the form of 100 literary classics in DNA’s four bases in a single process.
According to MIT Technology Review, this process would have cost around US$800,000 using materials on the open market, meaning it would need to be thousands of times cheaper to make it a competitive option.
It was also incredibly slow, with data stored at a rate of about 400
bytes per second. Microsoft says it needs to get to around 100 megabytes per second to be feasible.
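A quick back-of-the-envelope calculation shows how large those gaps are. The sketch below uses only the figures quoted above and assumes decimal megabytes (one million bytes) and a constant write rate; the function names are invented for illustration.

```python
SECONDS_PER_DAY = 86_400


def write_time_days(data_bytes: float, bytes_per_second: float) -> float:
    """Days needed to write data_bytes at a constant rate."""
    return data_bytes / bytes_per_second / SECONDS_PER_DAY


def required_speedup(current_rate: float, target_rate: float) -> float:
    """How many times faster the target rate is than the current one."""
    return target_rate / current_rate


data_bytes = 200 * 10**6     # roughly 200 megabytes encoded
current_rate = 400           # bytes per second, as reported
target_rate = 100 * 10**6    # 100 megabytes per second, Microsoft's goal
cost_usd = 800_000           # estimated open-market cost

print(round(write_time_days(data_bytes, current_rate), 2))  # 5.79 days
print(required_speedup(current_rate, target_rate))          # 250000.0
print(cost_usd / 200)                                       # 4000.0 dollars per megabyte
```

In other words, writing the 200 megabytes at the reported rate would take nearly six days, and reaching the target means a 250,000-fold increase in speed.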
It’s not clear what efficiencies Microsoft may have found to lower the
costs of the process and speed it up, but new technologies have been
driving down the cost of gene sequencing in recent years, so its end-of-the-decade target may be realistic.
Even then, it’s likely it would only be used in select circumstances for
customers willing to pay for a specialised storage solution – like
critical archives of medical or legal data – rather than as a
replacement for current large-scale storage methods.
But while we’re speculating, a somewhat more sci-fi use for DNA-based
data storage could one day involve living computers.
While Microsoft’s DNA storage solution will be based on chips, there’s
every possibility that future versions of storage could involve enzymes
or bacteria engineered to carry out computations.
Even outside of cells, DNA potentially offers novel ways to compute data,
opening ways to rapidly crunch numbers for certain problems much as
quantum computers do for other areas of mathematics.
For now, it’s looking as if DNA has a solid role to play in solving a
very real problem that will only get worse.