Modern AI systems have reached human-level abilities on tasks spanning object recognition in photos, video annotation, speech-to-text conversion and language translation. Many of these breakthrough achievements are based on a technology called Deep Neural Networks (DNNs). DNNs are complex machine learning models, loosely inspired by the interconnected neurons of the human brain, that can take in millions of pixels from a high-resolution image, represent patterns in that input at multiple levels of abstraction, and relate those representations to high-level semantic concepts. Today, machine learning (ML) is coming into its own, ready to serve mankind in a diverse array of applications – from highly efficient manufacturing, medicine and massive information analysis to self-driving transportation, and beyond. However, if misapplied, misused or subverted, ML holds the potential for great harm – this is the double-edged sword of machine learning.
But deception attacks, although rare so far, can meddle with machine learning algorithms. Subtle changes to real-world objects can, in the case of a self-driving vehicle, have disastrous consequences. McAfee researchers tricked a Tesla into accelerating 50 miles per hour above the posted limit by adding a two-inch piece of tape to a 35 mph speed limit sign. The research was one of the first examples of manipulating a deployed device’s machine learning algorithms: while a human viewing the altered sign would have no difficulty reading it as 35 mph, the ML system misinterpreted it as an 85 mph speed limit posting. In a real-world attack like this, the self-driving car would accelerate far beyond the actual limit, potentially causing a disastrous outcome. This is just one of many recently discovered attacks applicable to virtually any ML application.
ML systems are also vulnerable to adversarial attacks. For example, an adversary can change just a few pixels of an image of a cat to trick the AI into classifying it as an ambulance. Another form of threat is the poisoning attack, in which adversaries tamper with an AI model’s training data before the model is trained, introducing a backdoor that can later be exploited via designated triggers such as a person’s voice or photo.
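The few-pixel evasion idea above can be illustrated with a toy linear classifier standing in for a real DNN. This is a minimal, hypothetical sketch of the fast gradient sign method (FGSM): each pixel is nudged by an amount far smaller than normal pixel-to-pixel variation, but in the direction that most helps the wrong class, which is enough to flip the prediction.

```python
import numpy as np

# Toy linear "image classifier" standing in for a real DNN (assumption:
# score > 0 means "cat", score <= 0 means "ambulance").
rng = np.random.default_rng(0)
w = rng.normal(size=784)          # weights over a flattened 28x28 image
x = rng.normal(size=784)          # a random stand-in "image"

score = w @ x
label = "cat" if score > 0 else "ambulance"

# FGSM-style attack: move every pixel a tiny step eps in the direction
# that pushes the score toward the other class (the sign of the gradient).
# eps is chosen just past the decision boundary, so the per-pixel change
# is far smaller than the image's own pixel variation.
eps = 1.01 * abs(score) / np.abs(w).sum()
x_adv = x - eps * np.sign(w) * np.sign(score)

adv_label = "cat" if w @ x_adv > 0 else "ambulance"
print(label, "->", adv_label, f"(per-pixel change: {eps:.4f})")
```

The same principle scales to deep networks, where the gradient direction is obtained by backpropagation rather than read directly off the weights.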
“Over the last decade, researchers have focused on realizing practical ML capable of accomplishing real-world tasks and making them more efficient,” said Dr. Hava Siegelmann, program manager in DARPA’s Information Innovation Office (I2O). “We’re already benefitting from that work, and rapidly incorporating ML into a number of enterprises. But, in a very real way, we’ve rushed ahead, paying little attention to vulnerabilities inherent in ML platforms – particularly in terms of altering, corrupting or deceiving these systems.”
To get ahead of this acute safety challenge, DARPA created the Guaranteeing AI Robustness against Deception (GARD) program in 2019. GARD aims to develop a new generation of defenses against adversarial deception attacks on ML models. Previous defense efforts were designed to protect against specific, pre-defined adversarial attacks, and remained vulnerable to attacks outside their design parameters when tested. GARD seeks to approach ML defense differently – by developing broad-based defenses that address the numerous possible attacks in a given scenario.
“There is a critical need for ML defense as the technology is increasingly incorporated into some of our most critical infrastructure. The GARD program seeks to prevent the chaos that could ensue in the near future when attack methodologies, now in their infancy, have matured to a more destructive level. We must ensure ML is safe and incapable of being deceived,” stated Siegelmann.
In September 2018, DARPA announced a multi-year investment of more than $2 billion in new and existing programs under the “AI Next” campaign. One of the important areas identified was adversarial AI. The most powerful AI tool today is machine learning (ML). ML systems can be easily duped by changes to inputs that would never fool a human. The data used to train such systems can be corrupted. And the software itself is vulnerable to cyber attack. These areas, and more, must be addressed at scale as more AI-enabled systems are operationally deployed.
Guaranteeing AI Robustness Against Deception (GARD)
The growing sophistication and ubiquity of machine learning (ML) components in advanced systems dramatically expands capabilities, but also increases the potential for new vulnerabilities. Current research on adversarial AI focuses on approaches where imperceptible perturbations to ML inputs could deceive an ML classifier, altering its response. Such results have initiated a rapidly proliferating field of research characterized by ever more complex attacks that require progressively less knowledge about the ML system being attacked, while proving increasingly strong against defensive countermeasures. Although the field of adversarial AI is relatively young, dozens of attacks and defenses have already been proposed, and at present a comprehensive theoretical understanding of ML vulnerabilities is lacking.
“The field now appears increasingly pessimistic, sensing that developing effective ML defenses may prove significantly more difficult than designing new attacks, leaving advanced systems vulnerable and exposed,” according to the Defense Advanced Research Projects Agency’s description of a new AI defense program. With no comprehensive theoretical understanding of machine learning vulnerabilities, DARPA said, efforts to develop effective defenses have been limited.
The Guaranteeing AI Robustness against Deception (GARD) program aims to develop theories, algorithms and testbeds to help researchers create robust, deception-resistant ML models that can defend against a wide range of attacks, not just narrow, specialized threats. The program will use a scenario-based framework to evaluate defenses against attacks delivered via sensors, images, video or audio that threaten the physical and digital worlds or the data used to build the ML models.
GARD seeks to establish theoretical ML system foundations to identify system vulnerabilities, characterize properties that will enhance system robustness, and encourage the creation of effective defenses. Currently, ML defenses tend to be highly specific and are effective only against particular attacks. GARD seeks to develop defenses capable of defending against broad categories of attacks. Furthermore, current evaluation paradigms of AI robustness often focus on simplistic measures that may not be relevant to security. To verify relevance to security and wide applicability, defenses generated under GARD will be measured in a novel testbed employing scenario-based evaluations.
GARD’s novel response to adversarial AI will focus on three main objectives: 1) the development of theoretical foundations for defensible ML and a lexicon of new defense mechanisms based on them; 2) the creation and testing of defensible systems in a diverse range of settings; and 3) the construction of a new testbed for characterizing ML defensibility relative to threat scenarios. Through these interdependent program elements, GARD aims to create deception-resistant ML technologies with stringent criteria for evaluating their robustness.
GARD will explore many research directions for potential defenses, including biology. “The kind of broad scenario-based defense we’re looking to generate can be seen, for example, in the immune system, which identifies attacks, wins and remembers the attack to create a more effective response during future engagements,” said Siegelmann. GARD will work on addressing present needs, but is keeping future challenges in mind as well. The program will initially concentrate on state-of-the-art image-based ML, then progress to video, audio and more complex systems – including multi-sensor and multi-modality variations. It will also seek to address ML capable of predictions, decisions and adapting during its lifetime.
The project is split among three groups. One set of organizations will be looking at the theoretical basis for adversarial attacks on AI, why they happen and how a system can be vulnerable. Another group will be building the defenses against these attacks, and the last set of teams will serve as evaluators. Every six months, they’ll test the defenses others built by throwing a new attack scenario their way and looking at criteria like effectiveness and practicality.
Intel was chosen to lead the physical adversarial attacks aspect of the project, as DARPA saw promise in the company’s experience in simulating external environments for self-driving cars. Intel acquired Mobileye, a vehicle computer-vision sensor company, for $15 billion in 2017.
Intel is currently focusing on the future: plugging vulnerability holes and getting ahead of the threats downstream. “An important thing to know about this particular topic is this isn’t a today threat,” said Jason Martin, a senior staff research scientist at Intel Labs. But it’s a rarity in research to be able to spend time worrying about tomorrow’s problems. “It’s a nice place to be; it’s not a ‘panic now’ sort of scenario,” he said. “It’s a ‘calmly do the research and come up with the mitigations.’” The existing mitigations against machine learning attacks are typically rule-based and pre-defined, but DARPA hopes to develop GARD into a system with broader defenses that address many different kinds of attacks.
Chip maker Intel was chosen to lead this initiative from DARPA, the U.S. military’s research wing, aimed at improving cyber-defenses against deception attacks on machine learning models. One of the most common use cases of machine learning today is object recognition, such as taking a photo and describing what’s in it. That can help people with impaired vision know what’s in a photo they can’t see, for example, but it can also be used by other computers, such as autonomous vehicles, to identify what’s on the road. Object detectors are a type of model that identifies objects within an image or video, outputting labels and bounding boxes.
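Bounding-box detections like these are typically scored against ground truth using intersection-over-union (IoU). As a minimal sketch (the (x1, y1, x2, y2) box format is an assumption for illustration):

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)   # overlap area (0 if disjoint)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25/175 ≈ 0.143
```

An adversarial attack on a detector succeeds when no predicted box with the correct label overlaps the true object above some IoU threshold, which is why stop-sign "erasure" attacks are evaluated this way.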
While no known real-world attacks have been made on these systems, a team of researchers first identified security vulnerabilities in object detectors in 2018 with a project known as ShapeShifter. Led by School of Computational Science and Engineering (CSE) Associate Professor Polo Chau at Georgia Tech’s Intel Science and Technology Center for Adversary-Resilient Security Analytics (ISTC-ARSA), the ShapeShifter project exposed adversarial machine learning techniques that were able to mislead object detectors and even erase stop signs from autonomous vehicle detection.
The research so far, led by Duen Horng “Polo” Chau, associate professor of computing at Georgia Tech, has landed on an especially relevant takeaway: If you can’t make something invulnerable from an attack, then make it computationally infeasible. For example, in some cryptography systems, there’s some probability of an attacker figuring out the code key by using up considerable computing resources, but it’s so improbable that it approaches impossible. Martin wants to approach the defense of physical adversarial attacks in a similar way: “The hope is that the combination of techniques in the defensive realm will make the cost of constructing an adversarial example too expensive,” he said.
Jason Martin, principal engineer at Intel Labs, who leads Intel’s GARD team, said the chip maker and Georgia Tech will work together to “enhance object detection and to improve the ability for AI and machine learning to respond to adversarial attacks.” During the first phase of the program, Intel said its focus is on enhancing its object detection technologies using spatial, temporal and semantic coherence for both still images and video.
Temporal coherence here relates to understanding of physics — things don’t typically suddenly appear or disappear out of nowhere. For example, if a self-driving car registers a human, a stop sign or another object flickering into its view and then vanishing, then a hacker could be tampering with its system.
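A hedged sketch of what such a temporal-coherence check might look like (the frame representation and the persistence threshold are illustrative assumptions, not Intel's implementation):

```python
# Each frame's detections as a set of labels (positions omitted for brevity).
frames = [
    {"car", "person"},
    {"car", "person", "stop_sign"},   # stop sign flickers in...
    {"car", "person"},                # ...and vanishes one frame later
    {"car", "person"},
]

def flag_flicker(frames, min_persistence=2):
    """Flag labels that never persist for min_persistence consecutive frames."""
    seen = {}
    for i, labels in enumerate(frames):
        for label in labels:
            seen.setdefault(label, []).append(i)
    flagged = []
    for label, idxs in seen.items():
        # longest run of consecutive frames in which this label was present
        run = best = 1
        for a, b in zip(idxs, idxs[1:]):
            run = run + 1 if b == a + 1 else 1
            best = max(best, run)
        if best < min_persistence:
            flagged.append(label)
    return flagged

print(flag_flicker(frames))  # ['stop_sign']
```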
Semantic coherence relates to meaning. Humans identify things as a sum of their parts — a bird comprises eyes, wings and a beak, for example. The research team’s plan is to incorporate a second line of defense into a sensing system — if it registers a bicycle, then it should next check for the wheel, handlebar and pedals. If it doesn’t find those components, then something is likely wrong.
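The parts-of-a-whole idea could be sketched like this (the part lists here are hypothetical; a real system would learn these relationships from data):

```python
# Hypothetical expected-part lists for whole objects.
EXPECTED_PARTS = {
    "bicycle": {"wheel", "handlebar", "pedal"},
    "bird": {"wing", "beak", "eye"},
}

def semantic_check(detected_labels):
    """Flag whole objects detected without any of their expected parts."""
    labels = set(detected_labels)
    return [obj for obj, parts in EXPECTED_PARTS.items()
            if obj in labels and not (parts & labels)]

print(semantic_check(["bicycle", "wheel", "person"]))  # []
print(semantic_check(["bicycle", "person"]))           # ['bicycle']
```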
Then there’s spatial coherence, or knowledge of the relative positioning of things. If an object detector senses people floating in midair, for example, then that should be a red flag. And for all three of these strategies, the team hopes to not only teach object detectors to flag an attack but also correct it.
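Spatial coherence could be sketched as a crude plausibility test on box positions. Everything below is a hypothetical simplification: a flat-ground scene with a fixed "ground band" stands in for a real ground-plane model.

```python
# Detections as (label, y_bottom), with y measured downward from the top
# of the frame; grounded objects should touch the lower part of the image.

def spatial_check(detections, frame_height, ground_band=0.4):
    """Flag grounded-object classes whose bottom edge floats above the
    plausible ground region (a stand-in for a real ground-plane model)."""
    horizon = frame_height * (1 - ground_band)
    grounded = {"person", "car", "bicycle"}
    return [label for label, y_bottom in detections
            if label in grounded and y_bottom < horizon]

dets = [("person", 120),   # bottom edge high in the frame: "floating"
        ("person", 470),   # near the bottom of the frame: plausible
        ("bird", 100)]     # birds may legitimately be airborne
print(spatial_check(dets, frame_height=480))  # ['person']
```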
“Our research develops novel coherence-based techniques to protect AI from attacks. We want to inject common sense into the AI that humans take for granted when they look at something. Even the most sophisticated AI today doesn’t ask, ‘Does it make sense that there are all these people floating in the air and are overlapping in odd ways?’ Whereas we would think it’s unnatural,” said Chau. “That is what spatial coherence attempts to address – does it make sense in a relative position?”
“Intel and Georgia Tech are working together to advance the ecosystem’s collective understanding of, and ability to mitigate, AI and ML vulnerabilities. Through innovative research in coherence techniques, we are collaborating on an approach to enhance object detection and to improve the ability for AI and ML to respond to adversarial attacks.” – Jason Martin, principal engineer at Intel Labs and principal investigator for the DARPA GARD program from Intel
In Feb 2020, DARPA awarded IBM Research scientists a $3.4M grant that will run until November 2023. The project is initially awarded for one year, with extensions for up to four. “We will develop open-source extensions of ART to support the evaluation of defenses against adversarial evasion and poisoning attacks under various scenarios, such as black- and white-box attacks, multi-sensor input data, and adaptive adversaries that try to bypass existing defenses,” the IBM team said. The research will be based on IBM’s Adversarial Robustness 360 (ART) toolbox, an open-source library for adversarial machine learning – essentially a toolkit for the good guys, with state-of-the-art tools to defend and verify AI models against adversarial attacks.
Of particular interest is the evaluation against adversarial attack scenarios in the physical world. In such scenarios, the attacker first uses ART to generate a digital object (e.g. an STL file). The digital object is then synthesized into a real-world one (e.g. the STL file is printed out with a 3D printer) and then mounted in the designated physical-world context. The next step is to re-digitize the real-world object, e.g. by taking pictures of it with a digital camera from different angles, at different distances or under controlled lighting conditions. Finally, the re-digitized objects are imported into ART, where they serve as inputs to the AI models and defenses under evaluation.
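A toy numerical sketch of why this loop matters: a digital adversarial perturbation must keep fooling the model after the print-and-photograph round trip adds sensor noise. Everything here (the linear model, the noise model, the safety margin) is a hypothetical simplification of that pipeline, not ART's API:

```python
import numpy as np

rng = np.random.default_rng(1)
w = rng.normal(size=64)                       # toy linear classifier
x = rng.normal(size=64) + 0.5 * np.sign(w)    # clean input with a clear margin
score = w @ x

# Digital adversarial example, pushed 1.5x past the decision boundary so
# the label flip has headroom to survive re-digitization noise.
eps = 1.5 * abs(score) / np.abs(w).sum()
x_adv = x - eps * np.sign(w) * np.sign(score)

def redigitize(img, noise=0.02):
    """Crude stand-in for print-then-photograph: a slight global brightness
    shift plus per-pixel sensor noise (hypothetical simplification)."""
    return img * (1 + rng.normal(0, noise)) + rng.normal(0, noise, size=img.shape)

# How often does the attack survive 100 simulated camera captures?
survived = sum(np.sign(w @ redigitize(x_adv)) != np.sign(score) for _ in range(100))
print(f"attack survived {survived}/100 re-digitized captures")
```

A perturbation tuned only for the digital domain, with no margin, would fail far more of these simulated captures; that gap is exactly what the physical re-digitization loop measures.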
In order for the team to counter threats, it’s vital to proactively discover vulnerabilities that bad actors aren’t yet aware of; otherwise, bad actors could end up with the tools to disassemble any new techniques the team develops. “Because we’re not convinced that we’ll necessarily find the perfect defense, we’re trying to advance the theory [and] figure out, ‘What are the limits?’” said Bruce Draper, the DARPA program manager leading GARD. “We’re going to try to defend them as best we can, make them as invulnerable as possible, but we also want to have enough of a theoretical background to develop the theory in such a way that we can tell people, when they’re deploying an AI system, the extent to which it may be vulnerable or not.”
In the end, however, robust machine learning will just be one layer in the never-ending arms race to keep computer systems secure. In this context, what is currently missing is not so much air-tight defenses, but theoretical guarantees about how long a system can hold out, similar to those available in encryption. Such guarantees can inform the design of a secure, layered defense, said Battista Biggio, an adversarial machine learning researcher at the University of Cagliari. “This is very far from what we have in the AI field.”
“It is very likely that there may be no silver bullet,” agreed Draper. “What we want to do is at least make it very difficult for someone to defeat or spoof one of these systems.”
DARPA releases toolbox to test AI against attacks in Dec 2021
The Defense Advanced Research Projects Agency has issued a set of tools to help artificial intelligence researchers improve the security of their algorithms.
The Guaranteeing AI Robustness against Deception (GARD) program includes several evaluation tools for developers, including a platform called Armory that tests code against a range of known attacks. The tools are open for any researcher to use, and are intended to help prevent foreign powers from accessing databases and code that may be used in weapons.
“Other technical communities – like cryptography – have embraced transparency and found that if you are open to letting people take a run at things, the technology will improve,” Bruce Draper, the program manager leading GARD, said in a release.
The GARD program includes tools to test against data poisoning, according to DARPA. One set of tools, the Adversarial Robustness Toolbox (ART), started as an academic project, but has since been picked up by DARPA for further research.
“Often, researchers and developers believe something will work across a spectrum of attacks, only to realize it lacks robustness against even minor deviations,” the DARPA release states.
Researchers from Two Six Technologies, IBM, MITRE, the University of Chicago, and Google Research are working on the GARD program.