Machine learning (ML) methods have demonstrated outstanding recent progress and, as a result, artificial intelligence (AI) systems can now be found in myriad applications, including autonomous vehicles, industrial applications, search engines, computer gaming, health record automation, and big data analysis.
However, current ML systems that encounter circumstances outside their programming and/or training must be taken offline and reprogrammed or retrained. Doing so is expensive and time-consuming, and encountering a programming or training oversight at execution time can be disruptive to a mission.
Current ML systems are also plagued by another significant problem known as catastrophic forgetting: they ‘forget’ previously incorporated data when trained with new data. Unless programmed or trained for every eventuality, such systems operating in real-world environments are bound to fail at some point, which restricts ML to specific situations with narrowly predefined rule sets.
At the same time, current ML systems are not intelligent in the biological sense. They have no ability to adapt their methods beyond what they were prepared for in advance and are completely incapable of recognizing or reacting to any element, situation or circumstance they have not been specifically programmed or trained for.
This issue presents severe limitations in system capability, creates potential safety issues, and is clearly limiting in Department of Defense (DoD) applications, e.g., supply chain, logistics, and visual recognition, where complete details are often unknown in advance and the ability to react quickly and adapt to dynamic circumstances is of primary importance.
The goal of the Lifelong Learning Machines (L2M) program, initially announced in 2017, is to develop substantially more capable AI systems that continually improve and update from experience. The program pursues research and development of next-generation AI systems and their components, together with the study of learning mechanisms in biological organisms that can be translated into computational processes. Proposed research should investigate innovative approaches that support key lifelong learning machine technologies and enable revolutionary advances in the science of adaptive and intelligent systems.
The L2M effort currently encompasses 30 performer groups working under grants and contracts of varying duration and size.
“We are on the threshold of a major jump in AI technology,” stated Siegelmann. “The L2M program will require significantly more ingenuity and effort than incremental changes to current systems. L2M seeks to enable AI systems to learn from experience and become smarter, safer, and more reliable than existing AI.”
DARPA’s Lifelong Learning Machines (L2M) program
DARPA solicited highly innovative research proposals for fundamentally new machine learning mechanisms that enable systems to learn continuously during execution and to apply previously learned knowledge to novel situations, the way biological systems do, with the environment serving, in effect, as the training set. Such a system would be safer, more functional, and more relevant to DoD applications: it could adapt quickly to unforeseen circumstances, change with the mission, and improve performance over its fielded lifetime.
The L2M program treats inspiration from biological adaptive mechanisms as a supporting pillar of the project. Biological systems exhibit an impressive capacity to learn and adapt their structure and function throughout their lifespan while retaining stability of core functions. Taking advantage of adaptive mechanisms that evolution has honed over billions of years of highly robust tissue-mediated computation will provide unique insights for building L2M solutions.
While it is very easy to code agent behavior to perform a particular task, doing so precludes the agent from learning the task, which in turn precludes any possibility of adapting that behavior to another task or situation. This is the heart of the problem to be solved in creating a lifelong learning machine. The goal of the L2M program is a system that figures out how to accomplish a task and can subsequently figure out another task more easily on the basis of previous learning.
A possible realization of an L2M system is a plastic nodal network (PNN) – as opposed to a fixed, homogeneous neural network. While plastic, the PNN must incorporate hard rules governing its operation, maintaining equilibrium. If rules hold the PNN too strongly, it will not be plastic enough to learn, yet without some structure the PNN will not be able to operate at all.
Eight computer science professors in Oregon State University’s College of Engineering have received a $6.5 million grant from the Defense Advanced Research Projects Agency to make artificial-intelligence-based systems like autonomous vehicles and robots more trustworthy.
Researchers Selected to Develop Novel Approaches to Lifelong Machine Learning
First announced in 2017, DARPA’s L2M program has selected the research teams who will work under its two technical areas. The first technical area focuses on the development of complete systems and their components, and the second will explore learning mechanisms in biological organisms with the goal of translating them into computational processes. Discoveries in both technical areas are expected to generate new methodologies that will allow AI systems to learn and improve during tasks, apply previous skills and knowledge to new situations, incorporate innate system limits, and enhance safety in automated assignments.
The L2M research teams are now focusing their diverse expertise on understanding how a computational system can adapt to new circumstances in real time without losing its previous knowledge. A team led by the University of California at Irvine will study the “dual-memory architecture” of the human hippocampus, thought to be the center of memory, working in concert with the cerebral cortex. DARPA said the team will attempt to create a machine learning system “capable of predicting potential outcomes by comparing inputs to existing memories, which should allow the system to become more adaptable while retaining previous learnings.”
A second team, led by Tufts University researchers, will look to the animal world for analogs that could be applied to adaptive learning. Among their approaches is examining regenerative mechanisms observed in amphibians such as salamanders to create “flexible robots” capable of altering their structure and function in response to changes in their environment.
Adapting methods from biological memory reconsolidation, the process by which memories are formed, stored, and re-stabilized for use in tasks like problem solving, a team from the University of Wyoming will work on a computational system that uses context to identify appropriate modular memories, which can be reassembled with new sensory input to “rapidly form behaviors to suit novel circumstances,” DARPA said.
“With the L2M program, we are not looking for incremental improvements in state-of-the-art AI and neural networks, but rather paradigm-changing approaches to machine learning that will enable systems to continuously improve based on experience,” said Dr. Hava Siegelmann, the program manager leading L2M. “Teams selected to take on this novel research are comprised of a cross-section of some of the world’s top researchers in a variety of scientific disciplines, and their approaches are equally diverse.”
While still in its early stages, the L2M program has already seen results from a team led by Dr. Hod Lipson at Columbia University’s Engineering School. Dr. Lipson and his team recently identified and solved challenges associated with building and training a self-replicating neural network, publishing their findings on arXiv. While neural networks can be trained to produce almost any kind of pattern, training a network to reproduce its own structure is paradoxically difficult: as the network learns, it changes, and the goal therefore continuously shifts. The team’s continued efforts will focus on developing a system that can adapt and improve by using knowledge of its own structure. “The research team’s work with self-replicating neural networks is just one of many possible approaches that will lead to breakthroughs in lifelong learning,” said Siegelmann.
A generative memory approach to enable lifelong reinforcement learning
A key limitation of existing artificial intelligence (AI) systems is that they are unable to tackle tasks for which they have not been trained. In fact, even when they are retrained, the majority of these systems are prone to ‘catastrophic forgetting,’ which essentially means that newly learned items can disrupt previously acquired knowledge.
For instance, if a model is initially trained to complete task A and then subsequently retrained on task B, its performance on task A can decline considerably. A naïve solution would be to keep adding neural layers to support each additional task or item, but such an approach would be neither efficient nor functionally scalable.
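The failure mode can be reproduced with a deliberately tiny model. The sketch below is purely illustrative (a one-weight linear model, not any system described in the article): plain SGD learns task A, then retraining on task B overwrites the weight and task A performance collapses.

```python
# Toy demonstration of catastrophic forgetting with a one-weight model
# y = w * x, trained by plain stochastic gradient descent (SGD).

def train(w, data, lr=0.1, epochs=100):
    """SGD on squared error for the model y = w * x."""
    for _ in range(epochs):
        for x, y in data:
            w -= lr * 2 * (w * x - y) * x   # gradient of (w*x - y)^2 w.r.t. w
    return w

def loss(w, data):
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

task_a = [(x, 2.0 * x) for x in (-1.0, 0.5, 1.0)]   # task A target: w = 2
task_b = [(x, -3.0 * x) for x in (-1.0, 0.5, 1.0)]  # task B target: w = -3

w = train(0.0, task_a)
loss_a_before = loss(w, task_a)   # near zero: task A is learned

w = train(w, task_b)              # retrain on task B only
loss_a_after = loss(w, task_a)    # task A knowledge has been overwritten

print(loss_a_before < 1e-6, loss_a_after > 1.0)  # prints: True True
```

The single weight is shared between both tasks, so every task B gradient step erases task A: exactly the interference that rehearsal and replay approaches are designed to prevent.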
Researchers at SRI International have recently tried to apply biological memory transfer mechanisms to AI systems, believing that this could enhance performance and make the systems more adaptive. Their study, pre-published on arXiv, draws inspiration from mechanisms of memory transfer in humans, such as long-term and short-term memory.
“We are building a new generation of AI systems that can learn from experiences,” Sek Chai, a co-PI of the DARPA Lifelong Learning Machines (L2M) project, told TechXplore. “This means that they can adapt to new scenarios based on their experiences. Today, AI systems fail because they are not adaptive. The DARPA L2M project, led by Dr. Hava Siegelmann, seeks to achieve paradigm-changing advances in AI capabilities.”
Memory transfer entails a complex sequence of dynamic processes that allow humans to easily access salient or relevant memories when thinking, planning, creating, or making predictions about future events. Sleep is thought to play a critical role in the consolidation of memories, particularly REM sleep, the stage in which dreaming occurs most frequently.
In their study, Chai and his SRI colleagues developed a generative memory mechanism that can be used to train AI systems in a pseudo-rehearsal manner. Using replay and reinforcement learning (RL), this mechanism allows AI systems to learn from salient memories throughout their lifetime, and scale with a large number of training tasks or items. The generative memory approach developed by Chai and his colleagues uses an encoding method to separate the latent space. This allows an AI system to learn even when tasks are not well-defined or when the number of tasks is unknown.
“Our AI system does not directly store raw data, such as video, audio, etc.,” Chai explained. “Rather, we use generative memory to generate or imagine what it has experienced previously. Generative AI systems have been used to create art, music, etc. In our research, we use them to encode generative experiences that can be used later with reinforcement learning. Such an approach is inspired by biological mechanisms in sleep and dreams, where we recall or imagine fragments of experiences that are reinforced in our long-term memories.”
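A minimal sketch of the pseudo-rehearsal idea follows. It is hypothetical toy code, not SRI's actual system, and uses supervised regression rather than reinforcement learning to keep the example small: instead of storing raw task-A data, the learner keeps a "generative memory" (here, simply a snapshot model that relabels randomly sampled inputs) and replays those imagined experiences alongside the new task's data.

```python
import random

# Toy pseudo-rehearsal: a two-parameter model y = a*x + b learns task A,
# then is retrained on conflicting task B with and without replayed
# "imagined" task-A samples drawn from a generative memory.

def predict(w, x):
    a, b = w
    return a * x + b

def train(w, data, lr=0.05, epochs=200):
    a, b = w
    for _ in range(epochs):
        for x, y in data:
            err = a * x + b - y
            a -= lr * 2 * err * x   # gradient of err^2 w.r.t. a
            b -= lr * 2 * err       # gradient of err^2 w.r.t. b
    return (a, b)

def loss(w, data):
    return sum((predict(w, x) - y) ** 2 for x, y in data) / len(data)

random.seed(0)
task_a = [(x, x + 1.0) for x in (-1.0, 0.0, 1.0)]   # old task: y = x + 1
task_b = [(2.0, 9.0)]                                # new, conflicting data

w = train((0.0, 0.0), task_a)                        # learn task A first

# Generative memory: "imagine" old experiences by sampling inputs and
# labeling them with a snapshot of the trained model, instead of storing
# the raw task-A data itself.
snapshot = w
pseudo_a = [(x, predict(snapshot, x))
            for x in (random.uniform(-1.0, 1.0) for _ in range(20))]

w_naive = train(w, task_b)                 # retrain on task B alone
w_replay = train(w, task_b + pseudo_a)     # interleave replayed memories

# Rehearsing imagined task-A samples preserves far more task-A skill.
print(loss(w_replay, task_a) < loss(w_naive, task_a))  # prints: True
```

The snapshot-and-relabel generator stands in for the learned generative model in the real system; the key design point it illustrates is that replayed experiences are synthesized on demand rather than stored as raw data, so memory cost does not grow with the number of tasks.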
In the future, the new generative memory approach introduced by Chai and his colleagues could help to address the issue of catastrophic forgetting in neural network-based models, enabling lifelong learning in AI systems. The researchers are now testing their approach on computer-based strategy games that are commonly employed to train and evaluate AI systems. “We are using real-time strategy games such as StarCraft2 to train and study our AI agents on lifelong learning metrics such as adaptation, robustness, and safety,” Chai said. “Our AI agents are trained with surprises injected into the game (e.g. terrain and unit capability change).”
A Robotic Leg, Born Without Prior Knowledge, Learns to Walk
Francisco J. Valero-Cuevas, a professor of Biomedical Engineering and professor of Biokinesiology & Physical Therapy at USC, in a project with USC Viterbi School of Engineering doctoral student Ali Marjaninejad and two other doctoral students—Darío Urbina-Meléndez and Brian Cohn, has developed a bio-inspired algorithm that can learn a new walking task by itself after only 5 minutes of unstructured play, and then adapt to other tasks without any additional programming.
Their work, featured as the March cover article of Nature Machine Intelligence, opens exciting possibilities for understanding human movement and disability, creating responsive prosthetics, and building robots that can interact with complex and changing environments, such as those encountered in space exploration and search-and-rescue.
“Nowadays, it takes the equivalent of months or years of training for a robot to be ready to interact with the world, but we want to achieve the quick learning and adaptations seen in nature,” said senior author Valero-Cuevas, who also has appointments in computer science, electrical and computer engineering, aerospace and mechanical engineering and neuroscience at USC.
Marjaninejad, a doctoral candidate in the Department of Biomedical Engineering at USC and the paper’s lead author, said this breakthrough is akin to the natural learning that happens in babies: the robot was first allowed to understand its environment through a process of free play, or what is known as ‘motor babbling.’ “These random movements of the leg allow the robot to build an internal map of its limb and its interactions with the environment,” said Marjaninejad. The paper’s authors say that, unlike most current work, their robots learn by doing, without any prior or parallel computer simulations to guide learning.
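The motor-babbling idea can be caricatured in a few lines. The sketch below is purely illustrative (the limb dynamics and the nearest-neighbor lookup are invented stand-ins, not the USC team's algorithm): random commands build an internal map from commands to observed outcomes, and that self-built map is then searched to reach a goal.

```python
import math
import random

random.seed(1)

def limb(command):
    """Stand-in forward dynamics, unknown to the learner: command -> foot position."""
    return math.sin(command) + 0.3 * command

# Phase 1, free play ("motor babbling"): issue random motor commands and
# remember what the limb actually did.
babble = [(c, limb(c)) for c in (random.uniform(-2.0, 2.0) for _ in range(500))]

# Phase 2, exploitation: to reach a desired position, reuse the remembered
# command whose observed outcome was closest (a nearest-neighbor inverse map).
def command_for(target):
    return min(babble, key=lambda pair: abs(pair[1] - target))[0]

target = 0.8
cmd = command_for(target)
print(abs(limb(cmd) - target) < 0.1)  # prints: True
```

As in the article, the learner never sees the equations inside `limb`; everything it knows comes from its own exploratory movements, and a solution that is merely "good enough" is adopted rather than an optimal one.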
Marjaninejad added that this is particularly important because programmers can predict and code for multiple scenarios, but not for every possible scenario; pre-programmed robots are therefore inevitably prone to failure. “However, if you let these [new] robots learn from relevant experience, then they will eventually find a solution that, once found, will be put to use and adapted as needed. The solution may not be perfect, but will be adopted if it is good enough for the situation. Not every one of us needs or wants—or is able to spend the time and effort—to win an Olympic medal,” Marjaninejad said.
Through this process of discovering their body and environment, the robot limbs designed at Valero-Cuevas’ lab at USC use their unique experience to develop the gait pattern that works well enough for them, producing robots with personalized movements. “You can recognize someone coming down the hall because they have a particular footfall,” Valero-Cuevas said. “Our robot uses its limited experience to find a solution to a problem that then becomes its personalized habit, or ‘personality’—We get the dainty walker, the lazy walker, the champ… you name it.”
The potential applications for the technology are many, particularly in assistive technology, where robotic limbs and exoskeletons that are intuitive and responsive to a user’s personal needs would be invaluable to those who have lost the use of their limbs. “Exoskeletons or assistive devices will need to naturally interpret your movements to accommodate what you need,” Valero-Cuevas said.
“Because our robots can learn habits, they can learn your habits, and mimic your movement style for the tasks you need in everyday life—even as you learn a new task, or grow stronger or weaker.” According to the authors, the research will also have strong applications in space exploration and rescue missions, allowing robots to do what needs to be done without being escorted or supervised as they venture onto a new planet, or into uncertain and dangerous terrain in the wake of a natural disaster. Such robots would be able to adapt, for example, to low or high gravity, or to loose rocks one day and mud after it rains.
The paper’s two additional authors, doctoral students Brian Cohn and Darío Urbina-Meléndez, weighed in on the research: “The ability for a species to learn and adapt their movements as their bodies and environments change has been a powerful driver of evolution from the start,” said Cohn, a doctoral candidate in computer science at the USC Viterbi School of Engineering. “Our work constitutes a step towards empowering robots to learn and adapt from each experience, just as animals do.”
“I envision muscle-driven robots, capable of mastering what an animal takes months to learn, in just a few minutes,” said Urbina-Meléndez, a doctoral candidate in biomedical engineering who believes in the capacity for robotics to take bold inspiration from life. “Our work combining engineering, AI, anatomy and neuroscience is a strong indication that this is possible.”
This research was funded in part by the National Institutes of Health, the Department of Defense’s CDMRP program, and DARPA’s Lifelong Learning Machines (L2M) program.