Biocomputers are becoming feasible thanks to recent advances in nanobiotechnology and synthetic biology. Biocomputers use systems of biologically derived molecules, such as DNA and proteins, to perform computations. The most significant advantage of the DNA chip is expected to be parallel processing.
Scientists are also using DNA for digital storage, which is a requirement for building biological computers. DNA is appealing as a storage medium because it is compact enough to store, replicate, and transmit massive amounts of information. It is an attractive option for archiving data in particular because it is extremely dense (up to about 1 exabyte per cubic millimeter) and durable (a half-life of over 500 years).
Led by Yaniv Erlich, a team of researchers demonstrated a DNA storage method with a density of 215 petabytes (215 million gigabytes) per gram of DNA, and successfully retrieved the files they had encoded. They took advantage of the structure of DNA molecules, which look like twisting ladders whose rungs are built from four bases denoted by the letters A, C, G, and T. This genetic sequence typically acts as a building block for living things, and if binary 0s and 1s can be mapped onto those letters, DNA molecules can encode almost anything. Of course, the process is not that easy because not all DNA sequences are robust enough, said Erlich. What’s more, not all data stored in DNA can be retrieved successfully.
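The idea of mapping binary data onto the four bases can be illustrated with a minimal sketch. The two-bits-per-base table below is a hypothetical mapping chosen for illustration only; it is not the scheme Erlich's team used, and practical schemes add error correction and avoid sequences that are hard to synthesize or read.

```python
# Hypothetical mapping: each base carries two bits. Real DNA storage schemes
# are more elaborate (error correction, avoiding fragile sequences).
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode_bytes_to_dna(data: bytes) -> str:
    """Turn each byte into 8 bits, then each pair of bits into one base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode_dna_to_bytes(strand: str) -> bytes:
    """Reverse the mapping: bases back to bits, then bits back to bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
```

Under this mapping, the two bytes of b"Hi" become the eight-base strand CAGACGGC, and decoding that strand returns the original bytes.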
There has been rapid improvement in the cost and time necessary to sequence and analyze DNA. In the past decade, the cost to sequence a human genome has decreased 100,000-fold or more. This rapid improvement was made possible by faster, massively parallel processing. Modern sequencing techniques can sequence hundreds of millions of DNA strands simultaneously, resulting in a proliferation of new applications in domains ranging from personalized medicine and ancestry to the study of the microorganisms that live in your gut.
Computers are needed to process, analyze, and store the billions of DNA bases that can be sequenced from a single DNA sample. Even the sequencing machines themselves run on computers. New and unexpected interactions may be possible at this boundary between electronic and biological systems. As a multi-disciplinary group of researchers who study both computer security and DNA manipulation, we wanted to understand what new computer security risks are possible in the interaction between biomolecular information and the computer systems that analyze it.
As DNA-based computing, digital storage, and DNA sequencing technology mature, the threat of DNA-based computer attacks is also growing. In 2017, a group of researchers from the University of Washington showed that it’s possible to encode malicious software into physical strands of DNA, so that when a gene sequencer analyzes them the resulting data becomes a program that corrupts gene-sequencing software and takes control of the underlying computer.
Experts predict that malware may come to resemble biological viruses, which insert their genetic code into the DNA of infected organisms so that the cells reproduce the virus instead of synthesizing the proteins they need.
There is therefore a need to study the security threats in DNA biocomputing and to develop biosecurity tools and processes. “One theme from computer security research is that it is better to consider security threats early in emerging technologies, before the technology matures, since security issues are much easier to fix before real attacks manifest,” point out researchers. “We encourage the DNA sequencing community to follow secure software best practices when coding bioinformatics software, especially if it is used for commercial or sensitive purposes. Also, it is important to consider threats from all sources, including the DNA strands being sequenced, as a vector for computer attacks.”
The Pentagon has also reportedly instructed members of the US military to avoid using at-home DNA testing kits, citing concerns of mass surveillance and the potential for private companies to “exploit genetic materials for questionable purposes”.
Threat of DNA Malware
DNA is made up of standard nucleotides, the basic structural units of DNA, which are represented by letters such as A, C, G, and T. After sequencing, this DNA data is processed and analyzed using many computer programs. It is well known in computer security that any data used as input into a program may contain code designed to compromise a computer. This led us to question whether it is possible to produce DNA strands containing malicious computer code that, if sequenced and analyzed, could compromise a computer.
To assess whether this is theoretically possible, we included a known security vulnerability in a DNA processing program that is similar to what we found in our earlier security analysis. We then designed and created a synthetic DNA strand that contained malicious computer code encoded in the bases of the DNA strand. When this physical strand was sequenced and processed by the vulnerable program it gave remote control of the computer doing the processing. That is, we were able to remotely exploit and gain full control over a computer using adversarial synthetic DNA.
The researchers started by writing a well-known exploit called a “buffer overflow,” designed to fill the space in a computer’s memory meant for a certain piece of data and then spill out into another part of the memory to plant its own malicious commands.
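The mechanics of a buffer overflow can be pictured with a small simulation. This is only a sketch: Python itself checks bounds, so the flat bytearray below stands in for raw memory, and the 8-byte buffer, the adjacent 4-byte field, and the function names are all hypothetical. It is not the researchers' exploit.

```python
BUFFER_SIZE = 8  # hypothetical number of bytes reserved for the input


def make_memory() -> bytearray:
    """Simulated memory: an 8-byte input buffer followed by a 4-byte field
    that the program trusts (a stand-in for adjacent control data)."""
    return bytearray(b"\x00" * BUFFER_SIZE + b"SAFE")


def unsafe_copy(memory: bytearray, data: bytes) -> None:
    """Copy input with no length check, as a vulnerable routine might.
    Anything past BUFFER_SIZE spills into the neighbouring field."""
    for offset, value in enumerate(data):
        memory[offset] = value


def safe_copy(memory: bytearray, data: bytes) -> None:
    """Same copy, but oversized input is rejected before any write."""
    if len(data) > BUFFER_SIZE:
        raise ValueError("input exceeds buffer size")
    memory[:len(data)] = data
```

Calling unsafe_copy with the 12 bytes b"AAAAAAAAEVIL" fills the buffer and overwrites the trusted field with b"EVIL", while safe_copy raises ValueError and leaves the memory untouched.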
DNA sequencers work by mixing DNA with chemicals that bind differently to DNA’s basic units of code (the chemical bases A, T, G, and C), with each emitting a different color of light that is captured in a photo of the DNA molecules. To speed up the processing, the images of millions of bases are split up into thousands of chunks and analyzed in parallel. So all the data that comprised their attack had to fit into just a few hundred of those bases, to increase the likelihood it would remain intact throughout the sequencer’s parallel processing.
When the researchers sent their carefully crafted attack to the DNA synthesis service Integrated DNA Technologies in the form of As, Ts, Gs, and Cs, they found that DNA has other physical restrictions too. For their DNA sample to remain stable, they had to maintain a certain ratio of Gs and Cs to As and Ts. The result, finally, was a piece of attack software that could survive the translation from physical DNA to the digital format, known as FASTQ, that’s used to store the DNA sequence. FASTQ files are often compressed because they can stretch to gigabytes of text. When that FASTQ file is compressed with a common compression program known as fqzcomp, the buffer overflow exploit is triggered in the compression software, breaking out of the program and into the memory of the computer running the software to run its own arbitrary commands.
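The G/C ratio constraint mentioned above can be checked with a few lines of code. The sketch below computes the fraction of G and C bases in a strand; the 40% to 60% window is an assumed, illustrative range, since the article does not state the ratio the researchers had to maintain.

```python
def gc_fraction(strand: str) -> float:
    """Return the fraction of bases in the strand that are G or C."""
    strand = strand.upper()
    if not strand:
        raise ValueError("empty strand")
    gc_count = sum(1 for base in strand if base in "GC")
    return gc_count / len(strand)


def has_balanced_gc(strand: str, low: float = 0.4, high: float = 0.6) -> bool:
    """Check the strand against a hypothetical acceptable GC window."""
    return low <= gc_fraction(strand) <= high
```

A payload encoded into bases would need to pass a check like this before synthesis, which is one reason an exploit cannot simply be translated byte for byte into DNA.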
While that attack is far from practical for any real spy or criminal, it’s one the researchers argue could become more likely over time, as DNA sequencing becomes more commonplace, powerful, and performed by third-party services on sensitive computer systems. And, perhaps more to the point for the cybersecurity community, it also represents an impressive, sci-fi feat of sheer hacker ingenuity.
“We know that if an adversary has control over the data a computer is processing, it can potentially take over that computer,” says Tadayoshi Kohno, the University of Washington computer science professor who led the project, comparing the technique to traditional hacker attacks that package malicious code in web pages or an email attachment. “That means when you’re looking at the security of computational biology systems, you’re not only thinking about the network connectivity and the USB drive and the user at the keyboard but also the information stored in the DNA they’re sequencing. It’s about considering a different class of threat.”
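Treating sequenced DNA as untrusted input might look like the following sketch of a defensive FASTQ reader. It assumes the common four-line record layout (an @ header, the sequence, a + separator, and a quality string of the same length as the sequence). The length cap, the allowed alphabet, and the function name are hypothetical choices for illustration, not part of any real tool.

```python
import io
from typing import Iterator, TextIO, Tuple

MAX_READ_LENGTH = 1000        # hypothetical cap on bases per read
ALLOWED_BASES = set("ACGTN")  # N is commonly used for an unknown base


def parse_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality), rejecting malformed records."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of input
        if not header.strip():
            continue  # skip blank lines between records
        header = header.rstrip("\n")
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@"):
            raise ValueError("record header must start with '@'")
        if not separator.startswith("+"):
            raise ValueError("third line of a record must start with '+'")
        if len(sequence) > MAX_READ_LENGTH:
            raise ValueError("read is longer than the allowed maximum")
        if set(sequence) - ALLOWED_BASES:
            raise ValueError("sequence contains unexpected characters")
        if len(quality) != len(sequence):
            raise ValueError("quality and sequence lengths differ")
        yield header[1:], sequence, quality
```

For example, list(parse_fastq(io.StringIO("@read1\nACGT\n+\nIIII\n"))) yields a single record, while a record whose quality string is shorter than its sequence is rejected with a ValueError rather than being passed on to later processing.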
US military instructed not to use DNA testing kits over concerns of ‘mass surveillance’
In a December 2019 memo issued to the armed forces, senior officials at the US defence department noted how such tests “have varying levels of validity, and many are not reviewed by the Food and Drug Administration before they are offered”. “Moreover, there is increased concern in the scientific community that outside parties are exploiting the use of genetic materials for questionable purposes, including mass surveillance and the ability to track individuals without their authorization or awareness,” read the memo, obtained by Yahoo News.
DNA testing kits have grown increasingly popular over the years, despite concerns that the companies behind the products can sell an individual’s most personal information to third parties. Recent estimates indicate over 26 million people have used at-home testing kits, with as many consumers purchasing the DNA tests in 2018 as all previous years the products were on the market combined, according to MIT Technology Review.