Scientific models – or conceptual representations of complex systems – are used by myriad communities to understand and explain the world around us. Computationally creating these models is a largely manual, cumbersome task that requires scouring mountains of research for relevant content, and then executing multi-step processes to build, validate, and test the resulting model. The challenges to model creation are compounded by the many opportunities to lose information at each step of the process, or for other errors to occur.
In August 2018, DARPA launched the Automating Scientific Knowledge Extraction (ASKE) program to develop AI technologies capable of automating some of the manual processes of scientific knowledge discovery, curation, and application. The program identified how and where AI could accelerate the process of scientific modeling, and ultimately improve researchers’ ability to conduct rigorous and timely experimentation and validation.
ASKE is part of DARPA’s Artificial Intelligence Exploration (AIE) program, a key component of a broader agency-wide investment strategy in AI called the AI Next Campaign. AI Next is aimed at ensuring the United States maintains an advantage in this critical and rapidly accelerating technology area.
Unlike DARPA’s typical four-year programs, AIEs are designed to be fast-tracked (~18 months in duration) research efforts that help determine the feasibility of an AI concept. To date, DARPA has launched nearly 30 AIEs that are exploring a wide range of AI and machine learning-relevant research topics – from advancing AI/game theory techniques to developing novel AI processing architectures and innovative photonics hardware capable of reducing hardware complexity. To learn more, please visit the AI Next Campaign page.
ASKE seeks to develop approaches to make it easier for scientists to build, maintain and reason over rich models of complex systems – which could include physical, biological, social, engineered or hybrid systems – by interpreting and exposing scientific knowledge and assumptions in existing model code and documentation, identifying new data and information resources automatically, extracting useful information from these sources, integrating this useful information into machine-curated expert models, and executing these models in robust ways.
ASKE’s goal was to address these challenges by developing approaches to locate new data and scientific resources, comb them for useful information, compare those findings with existing research, and then integrate the relevant data into machine-curated expert models and execute them in robust ways. The project’s research efforts were split across two technical areas. The first focused on machine-assisted curation, where researchers explored ways to use AI to extract useful information from research and build it into new models. The second focused on machine-assisted inference, where AI uses those newly developed models to help researchers understand the modeled system, answer complex questions, or make predictions.
Leveraging streamlined contracting procedures and funding mechanisms, DARPA was able to get researchers on board within three months of the initial opportunity announcement. Work kicked off quickly, and the ASKE teams began developing a number of novel approaches. Researchers from academic institutions and commercial companies devised ways to automate the extraction of knowledge and information from existing models (including across diverse data types such as written text, equations, and software code), and created technologies to query and link information across literature. They created ways to universally represent and explain different modeling frameworks, while also developing tools that allow computational models to be automatically maintained and/or updated as new discoveries and information become available. “The ASKE AIE demonstrated a 50x speedup, extending an existing epidemiology model with additional dimensions and states when compared to state-of-the-art manual processes,” said Joshua Elliott, the DARPA program manager who led ASKE. “Using the same tools, it also showed that a new computational model in a different domain could be created 8x faster with ASKE than with current procedures.”
When the COVID-19 pandemic hit, researchers had an opportunity to test their developments and demonstrate their effectiveness. Scientists, researchers, and medical experts the world over generated hundreds of models to help understand and predict various aspects of the virus’ spread and impact, creating a proliferation of scientific knowledge that was difficult to compare, verify, and validate. Much of the knowledge was locked in code, especially legacy code, which made it harder to understand the parameter choices and assumptions built into each model. As new information and insights became available, an already challenging situation was exacerbated by the difficulties of extracting and representing these new findings, and modeling their effect on the evolving COVID-19 knowledge base. Authorities rely on expert-generated insights to create public policy interventions, making the quality and verification of these models exceedingly important.
ASKE researchers sought to enable better model understanding, inter-comparison, and contextualization by applying their tools to assess, compare, refine, and validate models extracted from code and documentation. The developed tools were used to rapidly auto-extract multi-modal information from publications and assimilate mechanistic fragments into knowledge graphs and executable models to inform the community’s understanding of the virus and possible treatments.
Applied in partnership with the scientific community and government agencies, the ASKE tools proved effective across multiple domains. As one example, researchers from Galois worked with government officials and applied their ASKE component to contextualize and adapt national epidemiology models and data to local conditions. Their efforts enabled local planners to evaluate which models provided the best estimation of the impact of the pandemic on vital hospital resources, such as ICU beds and ventilators, using local demographic, geographic, and social behavioral data. In another effort, researchers from Harvard Medical School partnered with the scientific community to use ASKE tools to produce candidate drug lists and gene targets for wet lab experiments. The work produced positive early results for vitamin D and MDL-28170. Other experiments with these tools are currently in progress.
Automating Scientific Knowledge Extraction and Modeling (ASKEM) program
The success of ASKE has led to the creation of a larger DARPA program called the Automating Scientific Knowledge Extraction and Modeling (ASKEM) program. ASKEM aims to create a knowledge-modeling-simulation ecosystem, empowered with the AI approaches and tools needed for the agile creation, sustainment, and enhancement of the complex models and simulators necessary to support expert knowledge- and data-informed decision making in diverse missions and scientific domains. The goal is to enable experts to maintain, reuse, and adapt large collections of heterogeneous data, knowledge, and models – with traceability across knowledge sources, model assumptions, and model fitness.
The ASKEM-developed tools will be demonstrated in several scientific domains, building on ASKE’s work with viral epidemics such as COVID-19, as well as the physics and impacts of space weather.
“With ASKEM, we hope to enable expert modeling to adapt at the pace of the modern world, allowing decision-makers to get in front of disasters, global changes, and our adversaries in order to avoid damages and improve the timeliness and effectiveness of our responses,” said Elliott. “ASKE produced novel proofs of concept that enable users to extract models from legacy code, and to more efficiently update, enhance, and validate models. ASKEM will expand on this foundation to produce modeling tools that apply to multiple scientific domains and produce a complete, sustainable infrastructure.” Interested proposers can learn more about the ASKEM program during a Proposers Day that DARPA will host on December 8, 2021, from 10:00 AM to 2:00 PM Eastern Time (ET) via Zoom. Advance registration is required to attend. More information is available on sam.gov.
ASKE demonstrated the potential of DARPA’s AI Exploration program and its nimble approach to exploring new AI concepts. This fast-tracked AI research effort helped jump-start a long-term program that requires additional resources and expertise – but could conceivably generate significant benefits in both defense and commercial domains. Conversely, AIEs also allow DARPA to understand where additional advancement might be needed in other fields or areas before committing to a larger undertaking.
Canadian company Uncharted Software has obtained a $19.3 million contract from DARPA to conduct research under the Automating Scientific Knowledge Extraction and Modeling (ASKEM) programme. ASKEM will create a knowledge-modeling simulation ecosystem ‘empowered with the artificial intelligence approaches and tools needed to support expert knowledge- and data-informed decision making in diverse missions and scientific domains’, the DoD explained in a July 2022 announcement.
Work will be performed in Toronto and Washington DC, with an expected completion date of January 2026.
‘This contract was a competitive acquisition under an open Broad Agency Announcement and 29 offers were received,’ the DoD noted.