An early warning for synthetic biology: beware of "malicious DNA intrusion" by computer hackers

Scientists at the Institute for Protein Design at the University of Washington use software to model and create new vaccines. Image source: Institute for Protein Design

Author: Wu Wenhao

Every year, the commercial DNA synthesis industry delivers billions of nucleotides to customers, with revenue in the hundreds of millions of dollars. As DNA synthesis becomes more and more common in related fields, is something important being overlooked?

In a letter to the editor recently published in Nature Biotechnology, a group of Israeli researchers raised a question that sounds far-fetched: could computer hackers trick scientists in synthetic biology into producing malicious or potentially dangerous gene fragments?

On reflection, this worry is not unfounded.

Network information security is getting more and more attention in every field, but most academic biology laboratories lack effective firewalls and the security infrastructure needed to protect the confidentiality and integrity of the information they store. If hackers take the opportunity to tamper with DNA data and try to spread the damage, the consequences are likely to be disastrous. For example, if malware is inserted into the production process for vaccines or pharmaceuticals, tampered DNA synthesis orders could be used to produce nucleic acids encoding pathogenic organisms, harmful proteins, or toxins without anyone noticing.

A simple attack on the production process may lead to catastrophic consequences.

Dangerous "trap"

Suppose A is a biology researcher at an academic institution who, for research purposes, places a sequence order with the DNA synthesis company where B works. Neither the DNA sequence editing software nor the commonly used DNA sequence file formats involved in this process encrypt the files. Like most researchers in biology and medicine today, A wants higher output and gives little thought to cumbersome network security measures that would cut into his "productivity."

At this point, a cybercriminal C who has targeted A appears. Because the current DNA synthesis workflow is poorly defended against cyberattacks, C can easily infect and control A's computer with malware and replace the ordered sequence with a malicious one. C obfuscates the malicious sequence using a technique common to malicious code in cyberattacks, disguising it as a normal sequence. If the obfuscation succeeds, every 200-base-pair consecutive subsequence extracted during screening will come back as normal, so B cannot see any difference when comparing sequences. The obfuscation can later be "reversed" through CRISPR-Cas9 (a gene editing tool) based sequence deletion and homology-directed repair, turning the "normal" sequence back into the "malicious sequence."

After the comparison, B deems the sequence "normal" and proceeds with production. A sequencing report is attached on delivery, and it indicates that there is no problem with the sequence.

Even if A, out of caution, turns to a third-party sequencing service, C can still use the malware to tamper with the data A submits to the sequencing company. And if A, having wrongly concluded that the sequence is correct, uses CRISPR-Cas9 to modify the synthesized DNA, this triggers the reversal of the obfuscation, turning the seemingly normal sequence back into the "malicious sequence."

At the core of this hypothetical attack scenario is the software biologists use to "print" DNA strands from scratch and then assemble them. This process is generally called "DNA synthesis."

In recent years, this kind of synthesis software has supported a large number of breakthroughs in biomedical research. For example, in the race to develop COVID-19 vaccines, some large pharmaceutical companies are using artificial DNA strands as one of the components of their experimental vaccines.

Schematic of how criminals turn normal DNA sequences into "malicious sequences" through cyberattacks. Source: Nature Biotechnology

Researchers at the University of Washington first raised information security risks in the field of DNA synthesis in 2017, describing a process similar to the example above.

At the time, people thought the idea was ahead of its time, and that it might be a while before it became a problem that actually needed solving.

In the second half of 2020, researchers from the Complex Network Analysis Laboratory in Israel confirmed through experiments that this threat is feasible, and wrote this letter to the editor explaining the information security risks currently facing the field of DNA synthesis.

The team refers to this type of attack on the genetic research supply chain as an "end-to-end cyber-biochemical attack." Although they have not yet observed any real-world cases, it is only a matter of time before such incidents occur, especially as more and more genetic research becomes digitized.

Slightly outdated specification

Of course, this scenario has been considered before.

In terms of authoritative industry guidelines, the 2010 guidance from the U.S. Department of Health and Human Services (HHS) calls for DNA synthesis providers to compare each order sequence against a database of "sequences of concern" (dangerous and harmful sequences) before actual production. Production can start only after the alignment is completed with no overlap. Although most suppliers of synthetic DNA in the United States do this, the current database of pathogen sequences is unfortunately incomplete, and the 2010 HHS guidance itself can be described as long outdated.

Internationally, a similar specification is the 2009 version from the International Association for Synthetic Biology (IASB). It requires suppliers to screen order sequences and to record information about suspicious orders and the customers who placed them. Again, though, the specification is no longer current.

The most recent document is the 2017 version of the International Gene Synthesis Consortium (IGSC) specification.

It requires the synthesis provider to screen every consecutive 200-base-pair (bp) subsequence of an order, using a "match rate" alignment method, and to hand any suspicious sequences over for manual inspection. But manual inspection is expensive and time-consuming, and unless the screening framework undergoes comprehensive penetration testing, some disease-causing sequences may slip through the net and escape review.
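The windowed screening described above can be illustrated with a toy sketch. This is not the IGSC procedure or any vendor's implementation: the function names, the ungapped position-by-position comparison, and the 0.8 threshold are all assumptions made for illustration, and real screening relies on proper alignment tools and curated databases. The sequences here are random letters, not real biological data.

```python
import random
from typing import Iterable, List, Tuple

WINDOW = 200  # window length in bp, taken from the text


def match_rate(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length strings agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def screen_order(
    order: str,
    concern: Iterable[str],
    window: int = WINDOW,
    threshold: float = 0.8,  # assumed cutoff, not a value from any guideline
) -> List[Tuple[int, float]]:
    """Return (offset, best_rate) for every order window whose best
    ungapped match against any database window reaches the threshold."""
    concern_windows = {
        c[i:i + window] for c in concern for i in range(len(c) - window + 1)
    }
    hits = []
    for i in range(len(order) - window + 1):
        w = order[i:i + window]
        best = max((match_rate(w, cw) for cw in concern_windows), default=0.0)
        if best >= threshold:
            hits.append((i, best))
    return hits


# Toy data: random strings stand in for a database entry and for orders.
rng = random.Random(0)


def rand_seq(n: int) -> str:
    return "".join(rng.choice("ACGT") for _ in range(n))


flagged = rand_seq(250)                        # hypothetical database entry
order = rand_seq(50) + flagged + rand_seq(50)  # order embedding that entry
benign = rand_seq(300)                         # unrelated order

hits = screen_order(order, [flagged])
benign_hits = screen_order(benign, [flagged])
```

In this sketch the embedded entry is flagged and the unrelated order is not. The `window` parameter makes the trade-off visible: a shorter window can catch shorter fragments, but it also raises the number of chance matches that would need manual review.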

In this study, the Israeli researchers succeeded in getting a disguised "malicious sequence" past review and into the production pipeline. They informed the International Gene Synthesis Consortium just as the sequence was about to enter production, and then cancelled the order for biosafety reasons.

Besides showing that the DNA synthesis field currently has cybersecurity weaknesses, the team also proposed some possible solutions. Synthesis systems could implement cybersecurity protocols, such as adding electronic signatures to orders and using detection methods (such as heuristic signatures and AI-based behavioral analysis) to identify any malicious code implanted afterwards. The current screening standard of 200 consecutive base pairs could be reduced to the shortest homology-directed repair template length that the "reverse obfuscation" process requires. Completed orders could be re-examined when new information emerges. Data sharing could be strengthened so that malicious instructions inserted across multiple synthesizers can be found. Finally, legislation and oversight could be strengthened in line with the guidelines above.
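As a rough illustration of the order-signing idea, the sketch below tags an order with a keyed hash so that a later change to the sequence can be detected. It uses HMAC from the Python standard library as a stand-in for a true electronic signature (which would normally be public-key based); the function names, the key, and the sample sequence are hypothetical. It also assumes the tag is computed somewhere the malware cannot reach, since signing on an already compromised machine would simply sign the tampered order.

```python
import hashlib
import hmac


def normalize(sequence: str) -> str:
    """Strip whitespace and uppercase, so formatting changes do not matter."""
    return "".join(sequence.split()).upper()


def sign_order(sequence: str, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag over the normalized sequence."""
    return hmac.new(key, normalize(sequence).encode("ascii"), hashlib.sha256).hexdigest()


def verify_order(sequence: str, tag: str, key: bytes) -> bool:
    """Check the tag in constant time; False means the order was altered."""
    return hmac.compare_digest(sign_order(sequence, key), tag)


# Hypothetical key and toy sequence, for illustration only.
key = b"demo-key-not-for-production"
original = "ATGGCCATTGTAATGGGCCGC"
tag = sign_order(original, key)

# A single substituted base is enough to invalidate the tag.
tampered = original[:5] + "T" + original[6:]

ok_original = verify_order(original, tag, key)
ok_tampered = verify_order(tampered, tag, key)
```

Here the provider would recompute the tag on receipt and refuse any order whose tag does not verify. A shared-key scheme like this only shows the principle; it does not give the non-repudiation that a real signature scheme would.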

Only by staying vigilant as the boundary between the virtual world and the real world grows more and more blurred can human society avoid being overwhelmed by these new kinds of security incidents.

*References:
[1]https://www.nature.com/articles/s41587-020-00761-y.epdf?sharing_token=WrWwDN-FkOdBex9by7Avv9RgN0jAjWel9jnR3ZoTv0NL8O3FZQt7i2a40oTwYLJPFz184wQMd47k4I9vP_m_KxdkwgB8s3TjKL3CWbYnVQOvuMrx9ODaGZMU7jFPAVy78oCfVyrz0df15z716-fLDxeCHnkIcmF6s88n63V4muk%3D

[2]https://www.zdnet.com/article/this-new-cyberattack-can-dupe-scientists-into-creating-dangerous-viruses-toxins/

[3]https://www.wired.com/story/malware-dna-hack/*

Origin blog.csdn.net/shujushizhanpai/article/details/112896797