
DARPA’s GARD Program: Building Resilient Machine Learning to Counter Adversarial Attacks

Modern artificial intelligence (AI) systems have reached near-human performance in a range of tasks, including object recognition, language translation, and even decision-making. Many of these breakthroughs rely on deep neural networks (DNNs), complex machine learning models loosely inspired by the interconnected neurons of the human brain. DNNs excel at processing the millions of pixels in high-resolution images, extracting intricate patterns and transforming them into high-level concepts. Today, machine learning (ML) is driving advances across industries, from healthcare and manufacturing to autonomous transportation and beyond. However, with such immense power comes a critical challenge: ML systems, if exploited, can cause significant harm.

As AI systems grow more sophisticated and deeply integrated into critical sectors, their security against deception and adversarial manipulation is essential. The Defense Advanced Research Projects Agency (DARPA) has taken a proactive stance through its Guaranteeing AI Robustness against Deception (GARD) program. With a mission to enhance the resilience of AI systems, GARD tackles vulnerabilities, fosters innovative defenses, and builds a vibrant community dedicated to AI security. GARD focuses on creating ML systems that are resilient to adversarial attacks and remain reliable even when faced with deceptive, manipulated data. In doing so, it has become a cornerstone of the global effort to protect AI systems from adversarial threats and ensure they operate safely in real-world applications. This article delves into the challenges posed by adversarial attacks and how DARPA’s GARD program is advancing robust machine learning.

The Growing Threat of Adversarial Attacks in Machine Learning

As machine learning (ML) becomes more embedded in mission-critical systems across defense, finance, and healthcare, the resilience of these models to adversarial threats has become paramount. Adversarial attacks exploit the vulnerabilities in ML algorithms by introducing subtle but malicious modifications to input data. In computer vision, for instance, adding tiny perturbations to images can cause models to misclassify objects or ignore critical patterns, potentially compromising an AI’s performance in real-time decision-making scenarios.

“AI systems are made out of software, obviously, right, so they inherit all the cyber vulnerabilities — and those are an important class of vulnerabilities — but [that’s] not what I’m talking about here,” Matt Turek, deputy director of DARPA’s Information Innovation Office, told event attendees. Instead, the DARPA official said, the relatively new and overly complex nature of AI systems has created a whole new arena of previously unseen vulnerabilities.

“There are sort of unique classes of vulnerabilities for AI or autonomous systems, where you can do things like insert noise patterns into sensor data that might cause an AI system to misclassify,” Turek said. “So you can essentially, by adding noise to an image or a sensor, perhaps break a downstream machine learning algorithm. You can also, with knowledge of that algorithm, sometimes create physically realizable attacks.”

Types of Adversarial Attacks

Adversarial attacks target the vulnerabilities of machine learning (ML) systems, exploiting weaknesses in how models process and interpret data. These attacks can significantly compromise the reliability and safety of AI-powered systems. Here’s a closer look at three common types of adversarial attacks:

Adversarial Examples

Adversarial examples are subtle manipulations of input data designed to mislead ML models into making incorrect classifications or decisions. These changes, often imperceptible to humans, can have severe consequences. For instance, a slight pixel alteration in an image of a cat might cause an ML model to classify it as an ambulance. In a more alarming example, McAfee researchers demonstrated this vulnerability in a controlled environment by altering a speed limit sign with a small piece of tape. The manipulated sign was misread by a Tesla’s AI system, prompting the car to accelerate dangerously. Such examples highlight the potential for malicious interference to exploit AI’s reliance on precise data interpretation.
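To make the mechanics concrete, the snippet below sketches the fast gradient sign method (FGSM), one of the best-known ways of crafting adversarial examples: it nudges each pixel slightly in the direction that increases the model’s loss. This is an illustrative sketch only; the model, inputs, and epsilon value are placeholders and are not drawn from the GARD program.

```python
# Minimal sketch of the Fast Gradient Sign Method (FGSM) for crafting an
# adversarial example. The model, inputs, and epsilon are placeholders.
import torch
import torch.nn.functional as F

def fgsm_example(model, image, label, epsilon=0.01):
    """Return a perturbed copy of `image` that tends to be misclassified."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()                                   # gradient of loss w.r.t. pixels
    perturbed = image + epsilon * image.grad.sign()   # step that increases the loss
    return perturbed.clamp(0.0, 1.0).detach()         # keep pixels in a valid range
```

Even a perturbation budget of around one percent of the pixel range is often invisible to a human viewer yet enough to flip a model’s prediction.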

Data Poisoning

Data poisoning attacks focus on tampering with the training data used to build ML models. By injecting maliciously crafted data, attackers can introduce “backdoors” into the model. These backdoors remain dormant during regular operation but can be triggered later by specific patterns, such as a particular voice command, image, or data input. Once activated, the backdoor causes the system to behave in unexpected and potentially harmful ways, undermining trust and reliability in the AI system.
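The toy sketch below illustrates the general idea of a poisoning backdoor: a small trigger pattern is stamped onto a fraction of the training images and their labels are switched to an attacker-chosen class. The trigger (a white corner patch), the poisoning rate, and the data layout are hypothetical choices made for illustration, not a description of any specific documented attack.

```python
# Toy illustration of backdoor data poisoning: a small trigger patch is stamped
# onto a fraction of training images and their labels are flipped to a target
# class. A model trained on this data behaves normally until the patch appears.
import numpy as np

def poison_dataset(images, labels, target_class, poison_rate=0.05, seed=0):
    """images: (N, H, W, C) array scaled to [0, 1]; labels: (N,) integer array."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    idx = rng.choice(len(images), size=int(poison_rate * len(images)), replace=False)
    images[idx, -4:, -4:, :] = 1.0   # stamp a 4x4 white patch in one corner
    labels[idx] = target_class       # relabel the poisoned samples
    return images, labels
```

A model trained on such data can still score well on clean validation sets, which is precisely what makes backdoors hard to detect after the fact.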

Deception Attacks

Deception attacks involve feeding false information to AI systems to manipulate their behavior. While less common, these attacks pose significant risks, especially in critical systems like autonomous vehicles. By deceiving a vehicle’s sensors or decision-making algorithms, attackers could cause it to make dangerous maneuvers, potentially endangering human lives. For instance, a carefully crafted adversarial input could trick a self-driving car into ignoring stop signs or swerving into other lanes.

These attacks are increasingly sophisticated, often bypassing even advanced defense mechanisms, with severe implications for fields reliant on ML, such as autonomous vehicles, cybersecurity, and defense. One example of these vulnerabilities involves AI targeting systems that use visual and sensor information to determine whether a vehicle is a friend or a foe. In such a case, Turek says, simply adding a well-placed sticker to a friendly bus could trick the AI software into identifying it as an enemy tank and marking it for attack. The reverse is also possible: an enemy vehicle could be disguised with low-tech methods so that an AI targeting system classifies it as friendly rather than as an immediate threat.

In response to these vulnerabilities, DARPA introduced the Guaranteeing AI Robustness against Deception (GARD) program in 2019. GARD aims to develop broad-based defenses capable of countering a wide range of adversarial attacks on ML models. The goal is not just to mitigate known attack vectors but to build resilience into ML models by making them inherently robust against a wide spectrum of adversarial techniques. This forward-thinking approach aims to ensure that ML models can maintain reliable performance under deception, elevating the field of ML security.

“The field now appears increasingly pessimistic, sensing that developing effective ML defenses may prove significantly more difficult than designing new attacks, leaving advanced systems vulnerable and exposed,” according to the Defense Advanced Research Projects Agency’s description of a new AI defense program. With no comprehensive theoretical understanding of machine learning vulnerabilities, DARPA said, efforts to develop effective defenses have been limited.

Key Components and Goals of the GARD Program

The GARD (Guaranteeing AI Robustness against Deception) program is driven by a mission to create machine learning (ML) models that can proactively detect, resist, and evolve in response to adversarial deception. This ambitious initiative is organized around several key goals aimed at fortifying ML systems against manipulation, with the ultimate objective of fostering resilience in real-world applications where adversarial attacks are increasingly sophisticated. Through a structured approach, GARD seeks to address the vulnerabilities of ML models and transform them into adaptive, self-defending systems that can withstand emerging threats.

A cornerstone of the GARD program is its focus on Adversarial Robustness. GARD aims to enhance ML systems to accurately identify adversarial input at both the data and model levels. This involves developing advanced algorithms that can scrutinize input data to detect potential manipulations, regardless of the subtlety of these alterations. By embedding mechanisms to recognize and mitigate adversarial attempts, GARD aims to ensure that even if data is intentionally distorted, the model’s predictions remain accurate and reliable. This proactive stance minimizes the risk of models being misled by deceptive inputs, thereby strengthening their resilience against adversarial attacks.

Additionally, GARD places a strong emphasis on Understanding Model Vulnerabilities, aiming to uncover the mechanisms that make models susceptible to adversarial manipulation. Through rigorous analysis of attack methodologies and their effectiveness, GARD researchers identify patterns that reveal underlying weaknesses in ML architectures. These insights allow for the design of more inherently robust models that emphasize security and resilience by default, rather than relying on reactive patches.

Complementing this robustness, GARD envisions Adaptability and Self-Improvement as integral to future ML systems. By incorporating the ability to self-adjust based on past attack experiences, GARD promotes dynamic, adaptable models that evolve over time. This adaptability not only fortifies defenses but also empowers models to recalibrate against new types of attacks, creating an enduring line of defense in an ever-shifting threat landscape.

Unlike existing defenses, which often target specific threats, GARD seeks to create adaptable defenses, addressing a spectrum of attacks without prior knowledge of their exact nature.

The Guaranteeing AI Robustness against Deception (GARD) program aims to develop theories, algorithms and testbeds to help researchers create robust, deception-resistant ML models that can defend against a wide range of attacks, not just narrow, specialized threats. GARD’s novel response to adversarial AI will focus on three main objectives: 1) the development of theoretical foundations for defensible ML and a lexicon of new defense mechanisms based on them; 2) the creation and testing of defensible systems in a diverse range of settings; and 3) the construction of a new testbed for characterizing ML defensibility relative to threat scenarios. Through these interdependent program elements, GARD aims to create deception-resistant ML technologies with stringent criteria for evaluating their robustness.

This innovative approach involves three main pillars:

  1. Theoretical Foundations: GARD aims to establish new theoretical frameworks for understanding and mitigating ML vulnerabilities. By defining core principles, researchers hope to create defenses that apply to entire classes of attacks rather than isolated incidents.
  2. Scenario-Based Testing: Traditional evaluations focus on specific metrics that may not fully represent real-world applications. GARD instead uses a scenario-based framework that rigorously tests defenses against simulated physical and digital attacks, including attacks delivered via sensors, images, video, or audio, as well as attacks on the data used to build the ML models.
  3. Coherence-Based Defense Mechanisms: Inspired by biological systems, GARD is exploring “coherence” defenses, which mimic the human brain’s ability to detect inconsistencies (a brief code sketch of the temporal check appears after this list). These defenses focus on:
    • Temporal Coherence: Recognizing consistent behavior over time. For example, objects that appear and disappear erratically could indicate tampering.
    • Semantic Coherence: Ensuring that recognized objects align with expected features—if an AI identifies a bicycle, it should also detect wheels and handlebars.
    • Spatial Coherence: Ensuring objects appear in plausible locations. For instance, floating humans in an image should raise flags for potential manipulation.
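As a rough illustration of the temporal-coherence idea, the sketch below tracks which object classes persist across consecutive video frames and flags detections that vanish almost as soon as they appear. The detection format (a set of class labels per frame) and the persistence threshold are simplifying assumptions; a production system would reason over tracked bounding boxes and confidence scores.

```python
# Sketch of a temporal-coherence check: detections that appear and vanish
# erratically across consecutive frames are flagged as possible tampering.
# The per-frame detection format (sets of class labels) is an assumption.
def flag_erratic_detections(frames, min_persistence=3):
    """frames: list of per-frame sets of detected class labels.
    Flags labels whose tracks last fewer than `min_persistence` frames."""
    run_length = {}
    flagged = set()
    prev = set()
    for frame in frames:
        for label in frame:
            # extend an ongoing track, or start a new one
            run_length[label] = run_length.get(label, 0) + 1 if label in prev else 1
        for label in prev - frame:                    # detection disappeared this frame
            if run_length.get(label, 0) < min_persistence:
                flagged.add(label)                    # short-lived track: suspicious
        prev = set(frame)
    return flagged
```

For example, flag_erratic_detections([{"car"}, {"car", "tank"}, {"car"}]) would flag the one-frame "tank" detection as suspicious.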

Technical Innovations in Robust Machine Learning

The GARD program is making significant strides in advancing machine learning (ML) defense technologies, focusing on innovations that enhance the robustness of ML models against adversarial attacks. One of the most notable developments is the concept of Certifiable Robustness, where DARPA is committed to creating ML models that can offer verifiable guarantees of their resistance to specific types of attacks. This involves employing rigorous mathematical frameworks to certify the boundaries of a model’s vulnerabilities, ensuring that it can withstand known adversarial tactics. By integrating techniques such as robust optimization and adversarial training, researchers are embedding an intrinsic layer of defense into these models, which not only helps to improve their performance under adversarial conditions but also instills confidence in their reliability in real-world applications.
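A common way to realize adversarial training in practice is to generate attacked copies of each training batch on the fly and optimize the model on both the clean and perturbed versions. The sketch below shows one such training step; it reuses the fgsm_example helper sketched earlier, and the equal weighting of the clean and adversarial losses, along with the epsilon value, are illustrative assumptions rather than GARD-specified settings.

```python
# Minimal sketch of one adversarial-training step: the model is updated on
# FGSM-perturbed copies of the batch in addition to the clean inputs.
# Reuses fgsm_example() from the earlier sketch; hyperparameters are placeholders.
import torch
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, images, labels, epsilon=0.03):
    adv_images = fgsm_example(model, images, labels, epsilon)  # craft attacks on the fly
    optimizer.zero_grad()
    loss = (F.cross_entropy(model(images), labels)
            + F.cross_entropy(model(adv_images), labels)) / 2   # clean + adversarial loss
    loss.backward()
    optimizer.step()
    return loss.item()
```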

In addition to certifying robustness, GARD emphasizes Feature-Space Analysis as a critical innovation in understanding and mitigating adversarial threats. Rather than simply analyzing input data for anomalies, GARD researchers focus on the underlying features that models use to interpret this data. By examining how adversarial perturbations impact feature activations, researchers can train models to recognize abnormal patterns that may signal an attack. This approach enables ML systems to become more discerning, effectively preventing them from being misled by manipulated inputs. Through enhanced feature understanding, models can better differentiate between benign variations and malicious modifications, thus improving their overall accuracy and reliability.
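One simple way to operationalize this kind of feature-space analysis is to model the distribution of a chosen layer’s activations on trusted data and flag inputs whose activations fall far outside it, for example by Mahalanobis distance. The sketch below is a generic illustration of that idea; the choice of layer, the Gaussian assumption, and the 99th-percentile threshold are assumptions for the example, not GARD specifics.

```python
# Sketch of a feature-space anomaly check: fit a Gaussian to a layer's
# activations on clean data, then flag inputs whose activations lie far
# from that distribution (Mahalanobis distance).
import numpy as np

class FeatureSpaceDetector:
    def fit(self, clean_features):
        """clean_features: (N, D) activations extracted from a trusted dataset."""
        self.mean = clean_features.mean(axis=0)
        cov = np.cov(clean_features, rowvar=False)
        self.cov_inv = np.linalg.pinv(cov)            # pseudo-inverse for stability
        distances = self._distance(clean_features)
        self.threshold = np.percentile(distances, 99)  # flag the top 1% as anomalous
        return self

    def _distance(self, features):
        delta = features - self.mean
        return np.sqrt(np.einsum("ij,jk,ik->i", delta, self.cov_inv, delta))

    def is_suspicious(self, features):
        return self._distance(features) > self.threshold
```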

Another innovative avenue being explored within the GARD program is the use of Ensemble and Hybrid Models. These systems leverage multiple algorithms to cross-validate each other’s outputs, enhancing overall decision-making reliability. In an ensemble setup, if one model identifies an anomaly while another does not, this discrepancy triggers a defense mechanism that safeguards the system against potential threats. This multi-model approach significantly reduces the risk associated with a single compromised model, creating a more resilient framework. Furthermore, GARD is developing hybrid architectures that combine rule-based systems with ML-driven approaches. By ensuring that critical decisions undergo double-checking through both methodologies, these hybrid models add an extra layer of security, effectively bolstering defenses against adversarial manipulation and enhancing the robustness of ML applications in diverse environments.
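The sketch below captures the ensemble cross-check in its simplest form: several independently trained models vote, and if agreement falls below a threshold the decision is deferred instead of being trusted automatically. The model interface (each model returns a single class label) and the agreement threshold are placeholders.

```python
# Minimal sketch of an ensemble cross-check: if independently trained models
# disagree on an input, the decision is deferred rather than acted on.
def ensemble_decision(models, x, min_agreement=0.8):
    predictions = [m.predict(x) for m in models]      # each model returns one label
    top = max(set(predictions), key=predictions.count)
    agreement = predictions.count(top) / len(predictions)
    if agreement < min_agreement:
        return {"label": None, "action": "defer_to_review", "agreement": agreement}
    return {"label": top, "action": "accept", "agreement": agreement}
```

A hybrid variant of this idea would route the deferred cases to a rule-based checker or a human reviewer rather than simply rejecting them.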

DARPA’s GARD Tools and Community Contributions

Beyond developing resilient ML models, GARD is equipping the AI and ML research community with practical tools and open-source resources. These include:

  • Armory: An open-source toolkit developed under the GARD program to test and benchmark ML models against various adversarial attacks. Researchers and developers worldwide use Armory to evaluate their models’ resilience, ensuring they are better prepared to handle real-world threats.
  • Adversarial ML Threat Matrix: To streamline and standardize knowledge-sharing, GARD has partnered with Microsoft and other organizations to create the Adversarial ML Threat Matrix. This framework catalogs known adversarial tactics and techniques, providing researchers with a blueprint for understanding and defending against a wide array of attacks.

Real-World Impact of GARD’s Advancements

The breakthroughs within the GARD program have profound implications for any sector where ML is used. In defense, for example, resilient ML models can enhance autonomous systems like drones or missile defense, allowing them to operate reliably even when under sophisticated electronic warfare attacks. In cybersecurity, GARD’s advancements allow for intrusion detection systems that remain effective even when attackers attempt to camouflage their presence through adversarial manipulation.

Critical Infrastructure Protection

AI systems are increasingly deployed in critical infrastructure — from power grids and transportation networks to healthcare systems. GARD’s research has far-reaching implications for the security of these systems, which must operate reliably under various environmental and cyber conditions. By improving the resilience of AI, GARD helps to safeguard these vital systems from potential adversarial manipulation, ensuring continued functionality and safety.

Financial services also stand to benefit significantly from GARD’s robust ML. Fraud detection models become more reliable, even as fraudsters attempt to evade detection through subtle data manipulations. Similarly, in healthcare, diagnostic algorithms can become more trustworthy by ensuring adversarial manipulation of medical data does not lead to erroneous diagnoses, protecting patient outcomes.

Cybersecurity

In the realm of cybersecurity, robust AI is indispensable. GARD’s contributions to adversarial defenses help protect AI-driven security systems from cyberattacks that could lead to significant breaches and data loss. These advancements are essential in a world where cyber adversaries are constantly refining their tactics, allowing security systems to stay one step ahead.

National Security

With a global race to leverage AI for defense and security, the robustness of AI systems has become a matter of national security. GARD’s work directly supports the defense sector by ensuring that military AI applications — from autonomous drones to surveillance systems — can withstand adversarial interference. As countries compete for technological superiority, GARD’s developments in robust AI grant the U.S. a critical edge in safeguarding its national interests.

Collaboration and Knowledge Sharing

Intel and Georgia Tech are among the GARD collaborators working to bring these advancements to life. For example, Intel is leveraging its expertise in object detection to develop defenses that incorporate temporal, spatial, and semantic coherence. These techniques aim to make attacks computationally expensive and complex, thereby discouraging adversaries.

GARD has fostered a collaborative AI security ecosystem by organizing workshops, conferences, and forums that bring together top researchers, industry professionals, and government representatives. These gatherings have been pivotal for sharing insights, discussing emerging threats, and identifying innovative defense strategies. This spirit of collaboration has strengthened the global AI security landscape, as researchers can build on one another’s work.

Talent Development

To ensure a strong future for AI security, GARD supports talent development through fellowships, internships, and specialized training programs. By equipping the next generation of AI security experts with knowledge and hands-on experience, GARD is building a foundation of skilled professionals dedicated to tackling emerging security challenges and advancing the field.

GARD Awards

The Guaranteeing AI Robustness against Deception (GARD) program, led by DARPA, is structured around three key groups focused on different aspects of adversarial machine learning. The first group investigates the theoretical foundations of adversarial attacks on artificial intelligence, aiming to understand the mechanisms behind vulnerabilities. The second group is dedicated to developing defensive strategies against these attacks, while the third group serves as evaluators, testing the robustness of the defenses every six months by introducing new attack scenarios. This cyclical testing ensures that defenses remain effective and practical in real-world applications.

Intel has been selected to spearhead the physical adversarial attacks component of the GARD project, leveraging its expertise in simulating external environments, particularly in the context of self-driving cars. With Intel’s acquisition of Mobileye, a leader in vehicle computer vision, the company is poised to enhance its focus on preemptively identifying and addressing vulnerabilities. Jason Martin, a senior research scientist at Intel Labs, emphasizes the importance of taking a proactive approach to future threats rather than reacting to current ones. The goal is to develop a comprehensive defense system that transcends traditional rule-based mitigations, which tend to be limited and predefined.

Central to Intel’s contribution is the enhancement of object detection technologies, which are crucial in applications ranging from autonomous vehicles to assistive technologies for visually impaired individuals. By utilizing coherence techniques—spatial, temporal, and semantic coherence—Intel aims to infuse a level of common sense into AI systems that allows them to recognize when something is amiss. For instance, if an object suddenly appears or disappears without a plausible explanation, or if an object is detected in an unusual position, the system will flag these anomalies as potential attacks. This innovative approach not only aims to improve detection accuracy but also to strengthen the overall resilience of AI systems against adversarial manipulations.

As part of GARD’s broader initiative, IBM Research has received a $3.4 million grant to develop open-source tools to support the evaluation of defenses against various adversarial attack scenarios, including those that occur in physical environments. This research will utilize IBM’s Adversarial Robustness Toolbox (ART), which is designed to help researchers defend and verify AI models against adversarial attacks. The importance of identifying vulnerabilities before adversaries do is underscored by the project’s emphasis on advancing theoretical foundations to establish limits of security and potential weaknesses. The goal is to ensure that AI systems can withstand attacks while providing clear guidance on their vulnerabilities.
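ART is openly available, and its general usage pattern is to wrap an existing model in an ART estimator, instantiate an attack, and measure how accuracy degrades on the generated adversarial examples. The sketch below follows that pattern for a PyTorch classifier; the model, data, and epsilon are placeholders, and exact class signatures may differ between ART versions, so the toolbox’s documentation should be treated as authoritative.

```python
# Illustrative use of IBM's open-source Adversarial Robustness Toolbox (ART):
# wrap a trained PyTorch classifier, generate FGSM adversarial examples, and
# compare clean vs. adversarial accuracy. Model, data, and eps are placeholders.
import torch
from art.estimators.classification import PyTorchClassifier
from art.attacks.evasion import FastGradientMethod

def evaluate_robustness(model, x_test, y_test, input_shape, nb_classes, eps=0.05):
    classifier = PyTorchClassifier(
        model=model,
        loss=torch.nn.CrossEntropyLoss(),
        input_shape=input_shape,
        nb_classes=nb_classes,
    )
    attack = FastGradientMethod(estimator=classifier, eps=eps)
    x_adv = attack.generate(x=x_test)                 # adversarial copies of the test set

    clean_acc = (classifier.predict(x_test).argmax(axis=1) == y_test).mean()
    adv_acc = (classifier.predict(x_adv).argmax(axis=1) == y_test).mean()
    return clean_acc, adv_acc
```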

Overall, while robust machine learning represents a critical layer in the ongoing battle to secure AI systems, experts acknowledge that no singular solution will be foolproof. The emphasis is on creating a layered defense with theoretical guarantees similar to those in cryptography, ultimately making it significantly challenging for attackers to exploit AI vulnerabilities. The GARD program, with its collaborative approach involving organizations like Intel, IBM, and various academic institutions, aims to enhance the understanding and mitigation of AI vulnerabilities, contributing to the long-term security of machine learning technologies.

Recent Developments

The Defense Advanced Research Projects Agency has transitioned newly developed defensive capabilities to the Chief Digital and Artificial Intelligence Office, according to a senior official. The CDAO was formed in 2022 to serve as a hub for accelerating the adoption of artificial intelligence and related tech across the Defense Department. That office is a logical transition partner for DARPA, Turek said. Plans called for transitioning GARD-related capabilities to other Defense Department components in fiscal 2024, when the project is slated to wind down, according to justification books.

“DARPA’s core mission [is to] prevent and create strategic surprise. So the implication is that we’re looking over the horizon for transformative capabilities. So in some sense, we are very early in the research pipeline, typically. Products that come out of those research programs could go a couple places … Transitioning them to CDAO, for instance, might enable broad transition across the entirety of the DOD,” Turek said. “I think having an organization that can provide some shared resources and capabilities across the department [and] can be a resource or place people can go look for help or tools or capabilities — I think that’s really useful. And from a DARPA perspective gives us a natural transition partner.”

Ensuring Ethical and Responsible AI Security

As AI continues to shape industries and transform society, ensuring its safe, ethical, and responsible use has become a priority. DARPA’s GARD program is more than just a technological initiative; it is a proactive effort to build trust in AI by ensuring these systems are safeguarded against deception and manipulation. By advancing technical defenses, fostering community collaboration, and creating a secure foundation for critical applications, GARD is helping to build an AI future that is resilient, dependable, and aligned with public safety and ethical standards.

Conclusion: A Safer Future with Robust Machine Learning

DARPA’s GARD program is leading the charge in addressing one of the most pressing challenges in artificial intelligence: creating models that are resistant to deception. By focusing on building ML systems that are inherently resilient, adaptable, and capable of continuous self-improvement, GARD is laying the groundwork for a safer future where machine learning can be trusted to perform reliably under even the most challenging conditions.

DARPA’s GARD program exemplifies a comprehensive approach to AI security, addressing vulnerabilities and empowering the global AI research community to develop resilient, trustworthy systems. With AI rapidly embedding itself into vital sectors, GARD’s contributions are essential for a secure digital future.

As adversarial tactics continue to evolve, GARD’s innovations will be instrumental in fortifying ML applications across various industries, ensuring these technologies can withstand and repel deception. Ultimately, GARD’s vision is a world where machine learning not only advances but does so with the utmost security and robustness, providing a stable foundation for the critical applications that define our future.

References and Resources also include:

https://www.protocol.com/intel-darpa-adversarial-ai-project