In recent years consumer imaging technology (digital cameras, mobile phones, etc.) has become ubiquitous, allowing people the world over to take and share images and video instantaneously. Mirroring this rise in digital imagery is the growing ability of even relatively unskilled users to manipulate visual media and distort its message. While many manipulations are benign, performed for fun or for artistic value, others serve adversarial purposes, such as propaganda or misinformation campaigns.
This manipulation of visual media is enabled by the wide-scale availability of sophisticated image and video editing applications that permit editing in ways that are very difficult to detect either visually or with current image analysis and visual media forensics tools. The problem isn’t limited to the fashion and cosmetics industries where photos are “touched-up” and “augmented” to make models look better and the results of skin-care products look (instantly) appealing — it’s spread to politics and now even business.
The most infamous form of this kind of content is the category called “deepfakes”: usually pornographic video that superimposes a celebrity or public figure’s likeness onto a compromising scene. Social media is also increasingly being abused to interfere with elections. “Where things get especially scary is the prospect of malicious actors combining different forms of fake content into a seamless platform,” said Andrew Grotto of the Center for International Security and Cooperation at Stanford University. “Researchers can already produce convincing fake videos, generate persuasively realistic text, and deploy chatbots to interact with people. Imagine the potential persuasive impact on vulnerable people that integrating these technologies could have: an interactive deepfake of an influential person engaged in AI-directed propaganda on a bot-to-person basis.”
False news stories and so-called deepfakes are increasingly sophisticated, making them more difficult for data-driven software to spot. Though the software that makes deepfakes possible is inexpensive and easy to use, existing video analysis tools aren’t yet up to the task of distinguishing what’s real from what’s been cooked up. The media and internet landscapes have seen manipulated videos, audio, images, and stories that spread disinformation, and the Defense Advanced Research Projects Agency (DARPA) is seeking solutions to help it detect and combat the manipulation.
However, existing automated media generation and manipulation algorithms are heavily reliant on purely data driven approaches and are prone to making semantic errors. For example, GAN-generated faces may have semantic inconsistencies such as mismatched earrings. These semantic failures provide an opportunity for defenders to gain an asymmetric advantage. A comprehensive suite of semantic inconsistency detectors would dramatically increase the burden on media falsifiers, requiring the creators of falsified media to get every semantic detail correct, while defenders only need to find one, or a very few, inconsistencies.
DARPA launched the Semantic Forensics (SemaFor) program with the aim of developing technologies to automatically detect, attribute, and characterize falsified multi-modal media assets (text, audio, image, video) in order to defend against large-scale, automated disinformation attacks.
The SemaFor program, DARPA says, will explore ways to get around some of the weaknesses of current deepfake detection tools. Statistical detection techniques have been successful to date, but media generation and manipulation technology is advancing rapidly, and purely statistical methods are quickly becoming insufficient for detecting falsified media assets. Detection techniques that rely on statistical fingerprints can often be fooled with limited additional resources (algorithm development, data, or compute). At the same time, DARPA said, current media manipulation tools rely heavily on ingesting and processing large amounts of data, making them prone to errors that can be spotted with the right algorithm.
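To make the contrast concrete, a “statistical fingerprint” detector looks at low-level signal statistics rather than meaning. The toy sketch below is purely illustrative (the `residual_energy` helper and its threshold-free comparison are assumptions, not a SemaFor technique): it measures high-frequency noise energy, a cue that differs between camera sensors and over-smoothed synthetic imagery, and that a generator can learn to mimic.

```python
import numpy as np

def residual_energy(img: np.ndarray) -> float:
    """Energy of a simple high-pass residual. Generated imagery often has
    noise statistics that differ from camera sensor noise; purely
    statistical detectors exploit such low-level cues."""
    # 4-neighbour Laplacian as a crude high-pass filter.
    lap = (-4 * img[1:-1, 1:-1] + img[:-2, 1:-1] + img[2:, 1:-1]
           + img[1:-1, :-2] + img[1:-1, 2:])
    return float(np.mean(lap ** 2))

rng = np.random.default_rng(1)
noisy = rng.normal(0.5, 0.1, (64, 64))   # stand-in for a real camera image
smooth = np.full((64, 64), 0.5)          # stand-in for an over-smoothed fake
```

Here `residual_energy(smooth)` is zero while `residual_energy(noisy)` is not, which is exactly the kind of fingerprint a sufficiently resourced generator can learn to reproduce, motivating the move to semantic cues.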
“It’s really interesting that DARPA is trying to create these detection systems, but good luck is what I say,” Syracuse University assistant professor of communications Jennifer Grygiel said in a telephone interview. “It won’t be anywhere near perfect until there is legislative oversight. There’s a huge gap and that’s a concern.”
SemaFor seeks to develop innovative semantic technologies for analyzing media. Semantic detection algorithms will determine if media is generated or manipulated. Attribution algorithms will infer if media originates from a particular organization or individual. Characterization algorithms will reason about whether media was generated or manipulated for malicious purposes. These SemaFor technologies will help identify, deter, and understand adversary disinformation campaigns.
“From a defense standpoint, SemaFor is focused on exploiting a critical weakness in automated media generators,” said Dr. Matt Turek, the DARPA program manager leading SemaFor. “Currently, it is very difficult for an automated generation algorithm to get all of the semantics correct. Ensuring everything aligns from the text of a news story, to the accompanying image, to the elements within the image itself is a very tall order. Through this program we aim to explore the failure modes where current techniques for synthesizing media break down.”
“A key goal of the program is to establish an open, standards-based, multisource, plug-and-play architecture that allows for interoperability and integration,” DARPA said. “This goal includes the ability to easily add, remove, substitute, and modify software and hardware components in order to facilitate rapid innovation by future developers and users.”
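That plug-and-play goal can be pictured as a registry of interchangeable analytics behind one common interface. The sketch below is a minimal illustration under assumed names (`Finding`, `register`, and `analyze` are all hypothetical, not part of any published SemaFor API): components can be added, removed, or swapped without touching the rest of the system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Finding:
    analytic: str   # which component produced the result
    task: str       # "detection", "attribution", or "characterization"
    score: float    # confidence that the asset is falsified
    evidence: str   # human-readable explanation for the analyst

# Registry of analytics; entries can be added, removed, or substituted.
ANALYTICS: Dict[str, Callable[[dict], Finding]] = {}

def register(name: str):
    """Decorator that plugs an analytic into the registry."""
    def wrap(fn):
        ANALYTICS[name] = fn
        return fn
    return wrap

@register("earring_symmetry")
def earring_symmetry(asset: dict) -> Finding:
    # Toy semantic cue: GAN faces often show mismatched earrings.
    score = 0.9 if asset.get("earrings_mismatched") else 0.1
    return Finding("earring_symmetry", "detection", score,
                   "left/right earring regions do not match")

def analyze(asset: dict) -> List[Finding]:
    """Run every registered analytic and collect findings for review."""
    return [fn(asset) for fn in ANALYTICS.values()]
```

The design choice mirrors the announcement’s goal: because every analytic returns the same `Finding` shape, a new detector from a future developer can be registered without modifying the integration layer.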
In a broad agency announcement for the Semantic Forensics or SemaFor program announced in August 2019, DARPA said it is looking to focus on the small but common errors produced by automated systems that manipulate media content. For example, images of a woman’s face created with generative adversarial networks, which use a database of real photographs to produce a synthetic face, might include mismatched earrings – a semantic error easier to spot than to avoid making.
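A semantic check of this kind can be surprisingly simple to sketch. The toy code below is purely illustrative (the histogram comparison, patch shapes, and the idea of pre-cropped earring regions are all assumptions; real systems would use learned detectors): it compares colour histograms of the two earring regions and returns a mismatch score.

```python
import numpy as np

def earring_mismatch_score(left_patch: np.ndarray,
                           right_patch: np.ndarray) -> float:
    """Crude semantic-consistency cue: compare intensity histograms of
    the two earring regions. A high score suggests a mismatch."""
    def hist(patch):
        h, _ = np.histogram(patch, bins=16, range=(0, 256), density=True)
        return h
    # Mirror the right patch so both are in the same orientation.
    right_mirrored = right_patch[:, ::-1]
    return float(np.abs(hist(left_patch) - hist(right_mirrored)).sum())

# Synthetic demo patches: matching earrings vs. clearly different ones.
rng = np.random.default_rng(0)
gold = rng.integers(180, 220, size=(8, 8))   # bright, gold-ish patch
dark = rng.integers(10, 50, size=(8, 8))     # dark patch
```

With these patches, `earring_mismatch_score(gold, gold[:, ::-1])` is zero while `earring_mismatch_score(gold, dark)` is large, illustrating why such cues are “easier to spot than to avoid making”: the defender needs one cheap comparison, while the generator must get the detail right every time.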
DARPA is soliciting innovative research proposals in the area of semantic technologies to automatically assess falsified media. Proposed research should investigate innovative approaches that enable revolutionary advances in science, devices, or systems. Specifically excluded is research that primarily results in evolutionary improvements to the existing state of practice.
This announcement isn’t DARPA’s first stab at the deepfake challenge. The agency has had a Media Forensics (MediFor) team doing this kind of work since 2016. Its goal, the team’s webpage states, is “to level the digital imagery playing field, which currently favors the manipulator, by developing technologies for the automated assessment of the integrity of an image or video and integrating these in an end-to-end media forensics platform.”
The project is split up into four technical areas: detection; attribution and characterization; explanation and integration; and evaluation and challenge curation. DARPA said it wants to make sure any algorithm developed from the project will outperform comparable manual processes and also be able to demonstrate how it reached its conclusions.
According to the announcement, “TA1 performers will deliver algorithms that detect, characterize, and attribute falsified multi-modal media.”
What has been built is then applied in practice, at least in part: “The TA2 performer will work with the TA1 performers to integrate algorithms, knowledge, and resources from TA1 performers into the TA2 system. TA2 will also deliver periodic proof-of-concept systems that integrate multiple TA1 components into a SemaFor system targeting scalable cloud deployment. TA2 will be expected to provide a demonstration to the government in each program phase of the progressing capabilities of the SemaFor system.” The gathered information will then be reviewed by an analyst; the process will not simply be automated.
At first glance, the program description suggests that SemaFor will simply curate and look for fake news and disinformation. But the announcement also acknowledges that media content will be created: “TA3 will design, organize, plan, and conduct the SemaFor evaluations and results analysis. While the evaluations will be conducted at the TA2 performer’s location, they will be under the control and supervision of the TA3 performer. TA3 deliverables include program metrics, evaluation protocols, and a library of multi-modal media assets for development and test purposes. Media may be collected or created.” This points to DARPA’s ambition to develop automated disinformation campaign tools for the US Department of Defense.
That may be why the agency also said it wants to keep a tight lid on some of the technical details of the project, saying it will treat program activities as controlled technical information (CTI). That means that even though such details are not classified, contractors will be barred from sharing or releasing information to other parties, since it could “reveal sensitive or even classified capabilities and/or vulnerabilities of operating systems.”
The base algorithm itself will not be categorized as CTI, as DARPA said it will “constitute advances to the state of the art” and would only potentially fall under the definition after it had been trained for a specific Defense Department or governmental purpose.
Notably, in TA4, DARPA plans to anticipate threats and counter them proactively. “TA4 will curate state-of-the-art (SOTA) challenges drawn from the public domain to ensure that the SemaFor program addresses relevant threat scenarios. TA4 will also develop threat models, based on current and anticipated technology, to help ensure that SemaFor defenses will be highly relevant for the foreseeable future. TA4 will include multiple challenge problem curation teams who will collaborate to maximize coverage of the challenge space and threat models. TA4 will regularly deliver challenges and updated threat models to the TA3 evaluation team and DARPA.”
DARPA said it will evaluate SemaFor’s performance based on how well algorithms can conduct three main tasks – detection, attribution, and characterization – as well as how they can perform in other technical areas, such as explanation and integration, evaluation of program metrics, and curation to help develop forward-looking threat models and anticipate future threats.
PAR Government Systems Corp., Rome, New York, was awarded an $11,920,160 cost-plus-fixed-fee contract for a research project under the Semantic Forensics (SemaFor) program. The SemaFor program will develop methods that exploit semantic inconsistencies in falsified media to perform tasks across media modalities and at scale. Work will be performed in Rome, New York, with an expected completion date of June 2024. Fiscal 2020 research, development, test and evaluation funding in the amount of $1,500,000 is being obligated at the time of award. This contract was a competitive acquisition under a full and open broad agency announcement, and 37 proposals were received.
In March 2021, DARPA announced the research teams selected to take on SemaFor’s research objectives. Teams from commercial companies and academic institutions will work to develop a suite of semantic analysis tools capable of automating the identification of falsified media. Arming human analysts with these technologies should make it more difficult for manipulators to pass off altered media as authentic or truthful.
Four teams of researchers will focus on developing three specific types of algorithms: semantic detection, attribution, and characterization algorithms. These will help analysts understand the “what,” “who,” “why,” and “how” behind the manipulations as they filter and prioritize media for review.
The teams will be led by Kitware, Inc., Purdue University, SRI International, and the University of California, Berkeley. Leveraging some of the research from another DARPA program – the Media Forensics (MediFor) program – the semantic detection algorithms will seek to determine whether a media asset has been generated or manipulated. Attribution algorithms will aim to automate the analysis of whether media comes from where it claims to originate, and characterization algorithms seek to uncover the intent behind the content’s falsification.
To help provide an understandable explanation to analysts responsible for reviewing potentially manipulated media assets, SemaFor also is developing technologies for automatically assembling and curating the evidence provided by the detection, attribution, and characterization algorithms. Lockheed Martin – Advanced Technology Laboratories will lead the research team selected to take on the development of these technologies and will develop a prototype SemaFor system.
“When used in combination, the target technologies will help automate the detection of inconsistencies across multimodal media assets. Imagine a news article with embedded images and an accompanying video that depicts a protest. Are you able to confirm elements of the scene location from cues within the image? Does the text appropriately characterize the mood of protestors, in alignment with the supporting visuals? On SemaFor, we are striving to make it easier for human analysts to answer these and similar questions, helping to more rapidly determine whether media has been maliciously falsified,” Turek said.
To ensure the capabilities are advancing in line with – or ahead of – the potential threats and applications of altered media, research teams are also working to characterize the threat landscape and devise challenge problems that are informed by what an adversary might do. The teams will be led by Accenture Federal Services (AFS), Google/Carahsoft, New York University (NYU), NVIDIA, and Systems & Technology Research.
Google/Carahsoft will provide perspective on disinformation threats to large-scale internet platforms, while NVIDIA will provide media generation algorithms and insights into the potential impact of upcoming hardware acceleration technologies. NYU provides a link to the NYC Media Lab and a broad media ecosystem that will provide insights into the evolving media landscape, and how it could be exploited by malicious manipulators. In addition, AFS provides evaluation, connectivity, and operational viability assessment of SemaFor in application to the Department of State’s Global Engagement Center, which has taken the lead on combating overseas disinformation.
Finally, to ensure the tools and algorithms in development have ample and relevant training data, researchers from PAR Government Systems have been selected to lead data curation and evaluation efforts on the program. The PAR team will be responsible for carrying out regular, large-scale evaluations that will measure the performance of the capabilities developed on the program.