Police agencies are using facial and object recognition technology for counterterrorism operations. Video footage played a key role in finding the culprits responsible for the November 2015 Paris attacks, with a CCTV video at Brussels airport used to pin down one suspect. But the sheer volume of video produced makes identifying, assembling and delivering actionable intelligence, from multiple sources and across thousands of hours of footage, a long and laborious process. The DoD collects vast amounts of data from satellites, drones and Internet-of-Things devices, but it needs help making sense of that intelligence and analyzing it quickly enough for use in combat operations.
Now defense and intelligence agencies are leveraging artificial intelligence (AI) and machine learning to automatically identify objects of interest in video. They need powerful AI software tools, which the tech industry is advancing at a fast pace. The U.S. military has already spent $7.4 billion on AI to streamline and speed up video analysis in the conflict against ISIS.
The most promising AI effort the Pentagon has going now is Project Maven, which started in July 2017. Military analysts are using Google-developed AI algorithms to mine live video feeds from drones. The DoD is now developing an AI-driven algorithm to work in conjunction with its drone footage to spot, tag and bookmark potential threat targets. With machine learning techniques, software is taught to find particular objects or individuals at speeds that would be impossible for any human analyst. This AI technology can differentiate between people, objects and buildings, much like Google’s driverless cars. Undersecretary of Defense for Intelligence Joseph Kernan said Project Maven only started a year ago and so far has been “extraordinarily” useful in overseas operations.
Earlier, it came to light that law enforcement agencies had begun pilot tests of Amazon Rekognition, the company’s cloud-based facial recognition technology. According to confidential documents obtained by the Intercept, IBM, with help from the New York Police Department, has developed object recognition technology capable of identifying people by physical characteristics such as skin color.
With secret access to NYPD-captured videos, IBM developed this technology to allow police to search camera footage for persons with a specific hair color, facial hair type, or skin color, the Intercept reports. The development of this technology dates back to 2012, when counterterrorism officials gained access to skin color-searching capabilities. The technology originally focused on object recognition but was later honed to identify persons by age, “head color,” gender, and skin color. Some of the IBM researchers had qualms about the technology should its accuracy improve in the future. The NYPD told the Intercept that its collaboration with IBM was about “finding a way to shorten the time to catch the bad people” after a crime.
Speaking at AWS re:Invent 2018 in Las Vegas this week, Deputy Assistant Director Christine Halvorsen explained that the FBI moved its counterterrorism data to the cloud on AWS after the deadly October 2017 Las Vegas shooting, a move which “resulted in a 98 percent reduction in manual work for analysts and 70 percent cost reductions,” FedScoop reported. “We had agents and analysts, eight per shift, working 24/7 for three weeks going through the video footage of everywhere Stephen Paddock was the month leading up to him coming and doing the shooting,” said Halvorsen, who today was named “Person of the Year” by Homeland Security Today. “If we had loaded that up into the cloud, the estimate is it would’ve taken us a day using Amazon Rekognition to recognize where he was in the videos. That’s all we were trying to do: narrow down where in the videos he was and who he was meeting with to make sure there wasn’t anybody else part of the conspiracy,” she added.
However, a study by the ACLU found performance flaws in Rekognition leading to its incorrectly matching 28 members of Congress, identifying them as other people who have been arrested for a crime. The members of Congress who were falsely matched with the mugshot database used in the test included Republicans and Democrats, men and women, and legislators of all ages, from all across the country. Nearly 40% of Rekognition’s false matches in the test were of people of color, even though they made up only 20% of Congress.
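The disparity in these figures can be made concrete with a little arithmetic. The sketch below uses only the numbers quoted above, plus the assumption that the test covered all 535 voting members of Congress; the function and constant names are illustrative, and the 40% figure is the approximate value given in the text.

```python
def overrepresentation(share_of_errors: float, share_of_group: float) -> float:
    """Ratio of a group's share of false matches to its share of the people tested."""
    return share_of_errors / share_of_group


FALSE_MATCHES = 28    # members of Congress falsely matched in the ACLU test
CONGRESS_SIZE = 535   # assumption: 100 senators + 435 representatives

# Overall false-match rate across everyone tested.
false_match_rate = FALSE_MATCHES / CONGRESS_SIZE

# "Nearly 40%" of false matches vs. 20% of Congress, as quoted in the text.
ratio = overrepresentation(0.40, 0.20)

print(f"False-match rate across Congress: {false_match_rate:.1%}")
print(f"Over-representation among false matches: about {ratio:.1f}x")
```

Under these assumptions, roughly one member in twenty was falsely matched, and people of color appeared among the false matches at about twice their share of Congress.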
China’s city surveillance programme has been driven by government policies and initiatives. These include the 2005 Skynet Program, which set out to strengthen public security by installing cameras in key public areas, along with goals to complete the installation of cameras in all key public places by 2020, upgrade existing cameras to HD resolution and ensure that all video footage from these areas is accessible to the authorities. The Xue Liang program, launched in 2016, aims to connect all cameras installed in villages, towns and districts to a central surveillance platform from county to national level, and to share video across police forces, emergency services and other government agencies. The scale of these surveillance operations means that users need to find ways of interpreting and processing the vast amounts of data produced, hence the drive for deep learning technologies on the part of Chinese manufacturers. Video analytics based on deep learning use a set of algorithms that enable systems to ‘learn’ from examples in an unsupervised or semi-supervised way, and then apply that learning to future scenarios.
Chinese startup SenseTime, which makes AI-powered surveillance software for the country’s police, received a new round of funding worth $600 million in April 2018. This funding, led by retailing giant Alibaba, reportedly gives SenseTime a total valuation of more than $4.5 billion, making it the most valuable AI startup in the world, according to analyst firm CB Insights. Most notably, SenseTime also outfits Chinese law enforcement with facial recognition and tracking services. For example, the company says that software it provides for the security bureau of Guangzhou (one of China’s three biggest cities with a metropolitan population of around 25 million) is used to match surveillance footage from crime scenes to photos from a criminal database, and has identified more than 2,000 suspects and solved “nearly 100 cases.”
Intelligent Video Analytics
The amount of video data generated is far too large for human operators to review and process in a short span of time, so video analytics serves as a useful way to make that data more valuable. Automated solutions based on deep learning and artificial intelligence can analyze the huge amount of data that videos generate efficiently and return results far faster than manual review.
Intelligent video analytics also use deep learning for facial recognition. A well-trained deep learning solution allows video analytics to analyze facial data more quickly by providing more accurate face detection with faster response time, thus creating a powerful method for facial recognition. Deep learning technologies also help analyze and process vast streams of footage.
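Deep-learning face recognition is commonly built by mapping each face image to a numeric embedding vector and then comparing vectors. The following is a minimal sketch of that comparison step only, assuming the embeddings have already been produced by some trained network; the three-element vectors, the names in the gallery and the 0.8 threshold are all made up for illustration and do not reflect any particular product.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_match(probe, gallery, threshold=0.8):
    """Return (name, score) for the most similar gallery entry.

    The name is None when even the best score falls below the threshold,
    i.e. the probe face is treated as unknown.
    """
    scored = ((name, cosine_similarity(probe, emb)) for name, emb in gallery.items())
    name, score = max(scored, key=lambda pair: pair[1])
    return (name, score) if score >= threshold else (None, score)


# Hypothetical embeddings; real ones typically have hundreds of dimensions.
gallery = {
    "person_a": [0.9, 0.1, 0.0],
    "person_b": [0.1, 0.8, 0.3],
}
probe = [0.85, 0.15, 0.05]

print(best_match(probe, gallery))
```

The threshold is the key design choice: raising it reduces false matches at the cost of missing more true ones.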
Using AI in video analytics, a number of systems will be able to communicate with each other, helping to make decisions and to catch suspicious activities quickly or predict them before they happen. In some situations, however, a camera cannot act because of a visual obstacle that camera-tampering algorithms do not account for, which means video analytics will not work. The event may simply be beyond the camera's line of sight.
Combining video analytics with other technologies, such as real-time location systems (RTLS) or radio-frequency identification (RFID) systems, can provide the missing data or the exact location in such cases.
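One simple way to combine the two data sources is to pair each camera detection with tag reads from the same area at about the same time. The sketch below assumes hypothetical record types (`CameraDetection`, `TagRead`), zone names and a two-second matching window; a real deployment would define these according to its own hardware.

```python
from dataclasses import dataclass


@dataclass
class CameraDetection:
    time_s: float   # seconds since the start of the recording
    zone: str       # named area covered by the camera


@dataclass
class TagRead:
    time_s: float
    zone: str       # area covered by the RFID/RTLS reader
    tag_id: str


def tags_near(detection, reads, window_s=2.0):
    """Tag IDs read in the detection's zone within window_s seconds of it."""
    return sorted({
        r.tag_id
        for r in reads
        if r.zone == detection.zone and abs(r.time_s - detection.time_s) <= window_s
    })


# Illustrative data only.
reads = [
    TagRead(10.2, "dock", "TAG-17"),
    TagRead(10.9, "gate", "TAG-04"),
    TagRead(30.0, "dock", "TAG-22"),
]
detection = CameraDetection(11.0, "dock")

print(tags_near(detection, reads))
```

Here only `TAG-17` matches, because it is the only read in the same zone and inside the time window.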
Facial expressions reveal not only emotions but also related actions and behavioral patterns, and they yield useful data for fields such as law enforcement and forensics. Such analysis can draw on data curation, sentiment analysis, and other advanced techniques. Expressions like “happy”, “sad”, “angry”, “scared”, “surprised” or “neutral” form the basic categories for this kind of video analytics.
An advanced video analytics solution may contain multiple functionalities and features including:
- People management: Crowd detection, queue management, people counting, people scattering, people tracking
- Vehicle management: Vehicle classification, traffic monitoring, license plate recognition, road data gathering
- Behavior monitoring: Motion detection, vandalism detection, face detection, privacy masking, suspicious activity detection
- Device protection: Protection against camera tampering, perimeter protection, intrusion detection, theft and threat detection
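To illustrate the simplest item in the list, motion detection, one classic approach is frame differencing: compare consecutive frames pixel by pixel and raise a flag when enough pixels change. This is a minimal sketch under stated assumptions, not how any particular product works; frames are plain nested lists of grayscale values, and both thresholds are arbitrary illustrative choices.

```python
def motion_fraction(prev_frame, curr_frame, pixel_threshold=25):
    """Fraction of pixels whose brightness changed by more than pixel_threshold."""
    changed = 0
    total = 0
    for prev_row, curr_row in zip(prev_frame, curr_frame):
        for prev_px, curr_px in zip(prev_row, curr_row):
            total += 1
            if abs(prev_px - curr_px) > pixel_threshold:
                changed += 1
    return changed / total


def motion_detected(prev_frame, curr_frame, pixel_threshold=25, area_threshold=0.05):
    """True when at least area_threshold of the frame has changed."""
    return motion_fraction(prev_frame, curr_frame, pixel_threshold) >= area_threshold


# Two tiny 4x4 grayscale frames; the second has two brightened pixels.
frame_a = [[10] * 4 for _ in range(4)]
frame_b = [row[:] for row in frame_a]
frame_b[1][1] = 200
frame_b[1][2] = 200

print(motion_fraction(frame_a, frame_b))
print(motion_detected(frame_a, frame_b))
```

Deep-learning systems go further by classifying what moved, but a cheap filter like this is often useful for deciding which frames are worth analyzing at all.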
Google’s artificial intelligence technologies are being used by the US military in Project Maven
Project Maven is a fast-moving effort launched in April 2017 by then-Deputy Defense Secretary Bob Work to accelerate the department’s integration of big data, artificial intelligence and machine learning into DoD programs. “As numerous studies have made clear, the Department of Defense must integrate artificial intelligence and machine learning more effectively across operations to maintain advantages over increasingly capable adversaries and competitors,” Work wrote.
The project’s first task involves developing and integrating computer-vision algorithms needed to help military and civilian analysts encumbered by the sheer volume of full-motion video data that DoD collects every day in support of counterinsurgency and counterterrorism operations. Project Maven focuses on computer vision, an aspect of machine learning and deep learning that autonomously extracts objects of interest from moving or still imagery, said Drew Cukor, chief of the DoD’s Algorithmic Warfare Cross-Function Team (AWCFT). Biologically inspired neural networks are used in this process, and deep learning is defined as applying such neural networks to learning tasks.
“People and computers will work symbiotically to increase the ability of weapon systems to detect objects,” Cukor added. “Eventually we hope that one analyst will be able to do twice as much work, potentially three times as much, as they’re doing now. That’s our goal.”
Another reason Project Maven is “disruptive” is that it shows that analysts are beginning to trust new sources of intelligence and nontraditional methods, Manzo said. “What’s encouraging is that the outputs of these systems are being trusted by the users,” he said. “A machine comes up with an answer and the human gives the thumbs up or down,” he said. “If DoD is trusting this, it’s a tremendous step.” Even though a human is supervising, the focus doesn’t have to be on “making sure the machine is doing the things I asked the machine to do.”
Cukor said the immediate focus is 38 classes of objects that represent the kinds of things the department needs to detect, especially in the fight against the Islamic State of Iraq and Syria.
The AWCFT delivered its first algorithms to warfighting systems by the end of 2017. The team is already developing proposals to take on the next set of challenging intelligence projects. In Phase II, Project Maven will expand its scope, turning the enormous volume of data available to DoD into actionable intelligence and decision-quality insights at speed.
Defense procurement chief Ellen Lord said the Pentagon will start bringing together AI projects that already exist but do not necessarily share information or resources. “We have talked about taking over 50 programs and loosely associating those,” Lord told reporters. “We have many silos of excellence.” Undersecretary of Defense for Research and Engineering Michael Griffin will oversee a new AI office that will bring in “elements of the intelligence community,” he said. But many details remain to be worked out.
Google’s TensorFlow AI systems are being used by the US Department of Defense’s (DoD) Project Maven, which was established in July last year to use machine learning and artificial intelligence to analyse the vast amount of footage shot by US drones. The initial intention is to have AI analyse the video, detect objects of interest and flag them for a human analyst to review.
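The detect-then-flag workflow described here can be sketched in a few lines. This is an illustration only, not Project Maven's actual code: the `ObjectDetection` record, the class labels, the confidence values and the 0.5 threshold are all hypothetical, and the detections are assumed to come from some upstream object-recognition model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ObjectDetection:
    frame: int
    label: str
    confidence: float
    analyst_verdict: Optional[bool] = None  # stays None until a human reviews it


def flag_for_review(detections, classes_of_interest, min_confidence=0.5):
    """Keep detections of interesting classes that the model is reasonably sure about."""
    return [
        d for d in detections
        if d.label in classes_of_interest and d.confidence >= min_confidence
    ]


# Hypothetical model output for a short clip.
detections = [
    ObjectDetection(12, "vehicle", 0.91),
    ObjectDetection(12, "tree", 0.88),
    ObjectDetection(40, "vehicle", 0.32),
    ObjectDetection(57, "building", 0.77),
]

review_queue = flag_for_review(detections, {"vehicle", "building"})
for d in review_queue:
    print(f"frame {d.frame}: {d.label} ({d.confidence:.0%}) awaiting analyst review")
```

The software only narrows down what the analyst looks at; the final judgment on each flagged item remains with a person.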
Drew Cukor, chief of the DoD’s Algorithmic Warfare Cross-Function Team, said in July: “People and computers will work symbiotically to increase the ability of weapon systems to detect objects. Eventually we hope that one analyst will be able to do twice as much work, potentially three times as much, as they’re doing now. That’s our goal.”
Project Maven forms part of the $7.4bn spent on AI and data processing by the DoD, and has seen the Pentagon partner with various academics and experts in the field of AI and data processing. It has reportedly already been put into use against Islamic State.
A Google spokesperson said: “This specific project is a pilot with the Department of Defense, to provide open source TensorFlow APIs that can assist in object recognition on unclassified data. The technology flags images for human review, and is for non-offensive uses only.”
China dominates Global Intelligent Video Analytics Market growth
Current law enforcement systems are increasingly unable to cope with the sheer volume of surveillance material captured and stored every day. This is only set to rise, with the number of video cameras increasing by at least 12% per year. These video streams will only ever be useful if processes to search and analyze the mountain of data keep pace. As it stands today, vital information is missed because the vast majority of the video is simply never viewed.
The researchers have defined AI Video Analytics as a solution that is running deep learning algorithms on a platform that is most likely to be built on a GPU chip architecture. These solutions are very much in their embryonic stage, and the researchers’ best estimate is that global sales of AI Video Analytics solutions in 2017 were only around $115 million, with much of this installed in China on Safe City projects.
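To get a feel for what growth of 12% per year means, the compound-growth arithmetic can be worked through directly. The sketch below treats the 12% figure from the text as a constant annual rate, which is an assumption (the text gives it only as a lower bound), and the function names are illustrative.

```python
import math

ANNUAL_GROWTH = 0.12  # "at least 12% per year", used here as a constant rate


def growth_factor(years: int, rate: float = ANNUAL_GROWTH) -> float:
    """How many times larger the camera count is after the given number of years."""
    return (1 + rate) ** years


def doubling_time(rate: float = ANNUAL_GROWTH) -> float:
    """Years needed for the camera count to double at a constant annual rate."""
    return math.log(2) / math.log(1 + rate)


print(f"After 5 years: {growth_factor(5):.2f}x as many cameras")
print(f"Doubling time: {doubling_time():.1f} years")
```

At this rate the number of cameras, and with it the volume of footage, roughly doubles every six years, so review capacity has to grow at least as fast just to stand still.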
China is the largest and fastest growing market for video surveillance – the domestic market is forecast to be worth up to $20bn in 2018. The adoption of video surveillance systems in China is booming, driven by the country’s safe cities initiative (mostly by the public sector) and a burgeoning private sector. There are huge surveillance installations in towns and cities – some complete with face recognition, behaviour analysis and ANPR. In the southwestern city of Guiyang, for example, images of all its 3.5 million residents are held by the authorities, and can be captured and detected on cameras equipped with face recognition. Cameras can also be used to estimate the age, gender and ethnicity of subjects.
The first installations in China of video surveillance equipment based on deep learning took place in 2016.
This growth is due in large part to major advances in semiconductor architecture, which enable much faster processing and allow deep learning and machine learning algorithms to analyze data many times faster than was previously possible. Venture capitalists are now pouring billions of dollars into financing Artificial Intelligence (AI) chip and analytic software companies. Indeed, while researching this report, the researchers identified 128 companies across the world that are now in some way helping, through hardware or software, to deliver AI video analytic solutions.
Nvidia has emerged as the early leader in AI chips and is particularly strong in video analytics. Nvidia’s edge is that its PC gaming processors (GPUs) can be scaled up to handle AI software, thanks to their parallel processing circuitry, which can handle many complex tasks at once.
There is still much to be done in perfecting the technology and getting it to market, but these new tools have opened up the opportunity to bring AI products to the video analytics market, potentially revolutionizing its performance and capability. If the technology can deliver, it will further drive demand for intelligent video surveillance, not just in new projects but also by opening up a vast latent potential for retrofitting millions of existing camera installations.