Supercomputers have become essential for national security: decoding encrypted messages, simulating complex ballistics models, modeling nuclear weapon detonations and other weapons of mass destruction, developing new kinds of stealth technology, and running cyber defence and attack simulations. Because of the expense, supercomputers are typically reserved for the most intensive calculations, such as predicting climate change or modeling airflow around new aircraft.
Identifying cybersecurity threats from raw internet data can be like locating a needle in a haystack. The amount of internet traffic data generated in a 48-hour period, for example, is too massive for one or even 100 laptops to process into something digestible for human analysts.
Oak Ridge National Laboratory’s Summit supercomputer can process more than 122 petaflops – that’s 122 thousand trillion floating-point operations per second. China’s Sunway TaihuLight, which had held the top spot for the previous two years, manages 93 petaflops.
Supercomputers are also essential for cybersecurity. “Being able to process network data in near real time to see where threats are coming from, to see what kinds of connections are being made by malicious nodes on the network, to see the spread of software or malware on those networks, and being able to model and interdict and track the dynamics on the network regarding things that national security agencies are interested in,” says Tim Stevens, a teaching fellow in the war studies department at King’s College London, “those are the realms in which supercomputing has a real future.”
One advantage that supercomputers offer over traditional approaches is that a supercomputer can look at a large volume of data all at once. “It can find those nuanced relationships across systems, across users, across geolocations, that could indicate early warning of a potential breach,” said Anthony Di Bello, senior director of security, discovery, and analytics at OpenText.
Officials at DARPA, the U.S. defense research agency, sponsored a contest in 2016 in which giant, refrigerator-sized supercomputers battled each other virtually to show that machines can find software vulnerabilities, offering a possible glimpse of the future of cybersecurity. The result: the supercomputers detected simulated flaws in software time and time again.
The contest represented a technological achievement in vulnerability detection, at a time when it can take human researchers an average of a year to find software flaws. The hope is that computers can do a better job, detecting and patching flaws within months, weeks, or even days.
A global enterprise with 200,000 machines could be processing petabytes of data every day, he said. “I’m looking for a needle in a haystack of needles. I need faster computing.” That’s why analysts rely on sampling to search for potential threats, selecting small segments of data to look at in depth, hoping to find suspicious behavior. While this type of sampling may work for some tasks, such as identifying popular IP addresses, it is inadequate for finding subtler threatening trends.
“If you’re trying to detect anomalous behavior, by definition that behavior is rare and unlikely,” says Vijay Gadepally, a senior staff member at the Lincoln Laboratory Supercomputing Center (LLSC). “If you’re sampling, it makes an already rare thing nearly impossible to find.”
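A quick back-of-the-envelope simulation (toy numbers, not figures from the laboratory's work) shows why sampling makes rare behavior nearly impossible to find:

```python
import random

random.seed(0)

# Toy traffic: 1,000,000 packets, of which only 50 are anomalous.
N, ANOMALIES = 1_000_000, 50
anomalous = set(random.sample(range(N), ANOMALIES))

# An analyst who samples 0.01 percent of the traffic sees 100 packets.
sample = random.sample(range(N), N // 10_000)
hits = sum(1 for pkt in sample if pkt in anomalous)

# The expected number of anomalies in the sample is 100 * 50 / 1,000,000
# = 0.005, so the sample almost always contains none at all.
print("anomalies caught in sample:", hits)
```

Keeping all of the data, rather than a window of it, is precisely what requires supercomputing-scale resources.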
It could be at least two or three years before we start seeing real-world uses of supercomputers for cybersecurity, he said. “The big tech giants are more focused on other use cases at this point.”
Supercomputer Analyzes Web Traffic Across Entire Internet
Using a supercomputing system, MIT researchers have developed a model that captures what web traffic looks like around the world on a given day, which can be used as a measurement tool for internet research and many other applications. Understanding web traffic patterns at such a large scale, the researchers say, is useful for informing internet policy, identifying and preventing outages, defending against cyberattacks, and designing more efficient computing infrastructure.
For their work, the researchers gathered the largest publicly available internet traffic dataset, comprising 50 billion data packets exchanged in different locations across the globe over a period of several years. They ran the data through a novel “neural network” pipeline operating across 10,000 processors of the MIT SuperCloud, a system that combines computing resources from the MIT Lincoln Laboratory and across the Institute. That pipeline automatically trained a model that captures the relationship for all links in the dataset — from common pings to giants like Google and Facebook, to rare links that only briefly connect yet seem to have some impact on web traffic.
The model can take any massive network dataset and generate some statistical measurements about how all connections in the network affect each other. That can be used to reveal insights about peer-to-peer filesharing, nefarious IP addresses and spamming behavior, the distribution of attacks in critical sectors, and traffic bottlenecks to better allocate computing resources and keep data flowing.
Networks are usually studied in the form of graphs, with actors represented by nodes, and links representing connections between the nodes. With internet traffic, the nodes vary in size and location. Large supernodes are popular hubs, such as Google or Facebook. Leaf nodes spread out from that supernode and have multiple connections to each other and the supernode. Located outside that “core” of supernodes and leaf nodes are isolated nodes and links, which connect to each other only rarely.
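That core-and-periphery structure is easy to sketch with a toy graph (the node names below are illustrative, not taken from the dataset): supernodes are simply the hubs with the highest degree.

```python
from collections import Counter

# Toy traffic log: (source, destination) link pairs.
links = [
    ("leaf1", "google"), ("leaf2", "google"), ("leaf3", "google"),
    ("leaf1", "leaf2"),
    ("leaf4", "facebook"), ("leaf5", "facebook"),
    ("isolated1", "isolated2"),  # a rare link outside the core
]

# Degree of each node = number of links it participates in.
degree = Counter()
for src, dst in links:
    degree[src] += 1
    degree[dst] += 1

# Supernodes are the highest-degree hubs (threshold chosen for this toy data).
supernodes = [node for node, d in degree.items() if d >= 3]
print(supernodes)  # ['google']
```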
In internet research, experts study anomalies in web traffic that may indicate, for instance, cyber threats. To do so, it helps to first understand what normal traffic looks like. But capturing that has remained challenging. Traditional “traffic-analysis” models can only analyze small samples of data packets exchanged between sources and destinations limited by location. That reduces the model’s accuracy. Capturing the full extent of those graphs is infeasible for traditional models. “You can’t touch that data without access to a supercomputer,” Kepner says.
In partnership with the Widely Integrated Distributed Environment (WIDE) project, founded by several Japanese universities, and the Center for Applied Internet Data Analysis (CAIDA) in California, the MIT researchers captured the world’s largest packet-capture dataset for internet traffic. The anonymized dataset contains nearly 50 billion unique source and destination data points between consumers and various apps and services, collected on random days at various locations across Japan and the U.S., dating back to 2015.
Before they could train any model on that data, they needed to do some extensive preprocessing. To do so, they utilized software they created previously, called Dynamic Distributed Dimensional Data Model (D4M), which uses some averaging techniques to efficiently compute and sort “hypersparse data” that contains far more empty space than data points. The researchers broke the data into units of about 100,000 packets across 10,000 MIT SuperCloud processors. This generated more compact matrices of billions of rows and columns of interactions between sources and destinations.
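The underlying idea of a hypersparse representation can be sketched without D4M itself: store only the (source, destination) cells that actually occur, rather than a dense matrix over every possible IP pair. The addresses below are purely illustrative.

```python
from collections import Counter

# Hypothetical packet stream: each entry is a (source_ip, dest_ip) pair.
packets = [
    ("10.0.0.1", "93.184.216.34"),
    ("10.0.0.1", "93.184.216.34"),
    ("10.0.0.2", "93.184.216.34"),
    ("10.0.0.3", "198.51.100.7"),
]

# Hypersparse matrix: only occupied (row, col) cells are stored, each
# mapping to a packet count, instead of a dense |IPs| x |IPs| array.
matrix = Counter(packets)

print(matrix[("10.0.0.1", "93.184.216.34")])  # 2
print(len(matrix))  # 3 occupied cells out of a vast possible space
```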
But the vast majority of cells in this hypersparse dataset were still empty. To process the matrices, the team ran a neural network on the same 10,000 cores. Behind the scenes, a trial-and-error technique started fitting models to the entirety of the data, creating a probability distribution of potentially accurate models.
Then, it used a modified error-correction technique to further refine the parameters of each model to capture as much data as possible. Traditionally, error-correcting techniques in machine learning will try to reduce the significance of any outlying data in order to make the model fit a normal probability distribution, which makes it more accurate overall. But the researchers used some math tricks to ensure the model still saw all outlying data — such as isolated links — as significant to the overall measurements.
In the end, the neural network essentially generates a simple model, with only two parameters, that describes the internet traffic dataset, “from really popular nodes to isolated nodes, and the complete spectrum of everything in between,” Kepner says.
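Kepner's group has described the fitted form as a modified Zipf-Mandelbrot distribution, p(d) proportional to 1/(d + δ)^α. As a rough, hypothetical sketch of what fitting such a two-parameter degree model looks like, one can grid-search the exponent α and offset δ against a toy degree histogram (the counts below are invented):

```python
import math

# Toy degree histogram: counts[i] nodes have degree degrees[i].
degrees = [1, 2, 3, 5, 10, 50, 100]
counts = [1000, 480, 300, 170, 80, 14, 7]

def sse(alpha, delta):
    # Sum of squared log-errors for p(d) = C / (d + delta)^alpha,
    # with C chosen so the model matches the first bin exactly.
    c = counts[0] * (degrees[0] + delta) ** alpha
    return sum(
        (math.log(n) - math.log(c / (d + delta) ** alpha)) ** 2
        for d, n in zip(degrees, counts)
    )

# Crude grid search over the two parameters.
best = min(
    ((a / 10, dl / 10) for a in range(5, 25) for dl in range(0, 20)),
    key=lambda params: sse(*params),
)
print("alpha, delta =", best)
```

A real fit would use maximum likelihood rather than a grid search, but the point stands: two numbers summarize the entire spectrum from supernodes to isolated links.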
Beyond the internet, the neural network pipeline can be used to analyze any hypersparse network, such as biological and social networks. “We’ve now given the scientific community a fantastic tool for people who want to build more robust networks or detect anomalies of networks,” Kepner says. “Those anomalies can be just normal behaviors of what users do, or it could be people doing things you don’t want.”
Supercomputers can spot cyber threats
Lincoln Laboratory researchers have developed a technique to compress hours of internet traffic into a bundle that can be analyzed for suspicious behavior.
Gadepally is part of a research team at the laboratory that believes supercomputing can offer a better method — one that grants analysts access to all pertinent data at once — for identifying these subtle trends. In a recently published paper, the team successfully condensed 96 hours of raw, 1-gigabit network link internet traffic data into a query-ready bundle.
They created the bundle by running 30,000 cores of processing (equal to about 1,000 laptops) at the LLSC located in Holyoke, Massachusetts, and it is stored in the MIT SuperCloud, where it can be accessed by anyone with an account.
“[Our research] showed that we could leverage supercomputing resources to bring in a massive quantity of data and put it in a position where a cybersecurity researcher can make use of it,” Gadepally explains.
An example of the type of threatening activity that requires analysts to dig into such a massive amount of data is instructions from command-and-control (C&C) servers. These servers issue commands to devices infected with malware in order to steal or manipulate data.
Gadepally likens their pattern of behavior to that of spam phone callers: While a normal caller might make and receive an equal number of calls, a spammer would make millions more calls than they receive. It’s the same idea for a C&C server, and this pattern can be found only by looking at lots of data over a long period of time.
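That asymmetry is straightforward to compute from a connection log once all the data is in one place. This toy sketch (invented actors, arbitrary threshold) flags nodes whose outbound volume dwarfs their inbound volume:

```python
from collections import Counter

# Hypothetical call records: (caller, callee). The same logic applies
# to (source_ip, dest_ip) pairs in network traffic.
events = [("spammer", f"victim{i}") for i in range(1000)] + [
    ("alice", "bob"), ("bob", "alice"),
    ("alice", "carol"), ("carol", "alice"),
]

out_deg = Counter(src for src, _ in events)
in_deg = Counter(dst for _, dst in events)

# Flag actors who initiate vastly more contacts than they receive.
suspects = [
    node for node in out_deg
    if out_deg[node] > 100 * (in_deg[node] + 1)
]
print(suspects)  # ['spammer']
```

The catch, as noted above, is that the pattern only emerges over a long window: sample a few minutes of this log and the spammer may look like any other caller.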
“The current industry standard is to use small windows of data, where you toss out 99.99 percent,” Gadepally says. “We were able to keep 100 percent of the data for this analysis.”
The team plans to spread the word about their ability to compress such a large quantity of data and they hope analysts will take advantage of this resource to take the next step in cracking down on threats that have so far been elusive. They are also working on ways to better understand what “normal” internet behavior looks like as a whole, so that threats can be more easily identified.
“Detecting cyber threats can be greatly enhanced by having an accurate model of normal background network traffic,” says Jeremy Kepner, a Lincoln Laboratory fellow at the LLSC who is spearheading this new research. Analysts could compare the internet traffic data they are investigating with these models to bring anomalous behavior to the surface more readily.
“Using our processing pipeline, we are able to develop new techniques for computing these background models,” he says.
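In outline, anomaly detection against a background model reduces to comparing observed link counts with the counts the model expects. This is a hypothetical sketch of that comparison, not the laboratory's actual pipeline:

```python
# Background model: expected connection count per (src, dst) link.
background = {("a", "b"): 100.0, ("a", "c"): 50.0}

# Observed traffic over the same window.
observed = {("a", "b"): 103, ("a", "c"): 49, ("x", "y"): 40}

anomalies = []
for link, count in observed.items():
    expected = background.get(link, 1.0)  # unseen links get a tiny prior
    if count > 5 * expected:              # crude ratio threshold
        anomalies.append(link)

print(anomalies)  # [('x', 'y')] -- a link the background model never saw
```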
As government, business, and personal users increasingly rely on the internet for their daily operations, maintaining cybersecurity will remain an essential task for researchers, and the team says supercomputing is an untapped resource that can help.
IBM Watson Supercomputer to be used for Cyber Security
IBM has announced that its Watson supercomputer is available for cybersecurity work, making Watson the first supercomputer to combine artificial intelligence with sophisticated analytical software to deter cyber threats. Watson is designed to power Cognitive Security Operations Centers (SOCs) and has been trained on the language of cybersecurity.
Researchers have spent the past year preparing it for the field of cybersecurity by feeding it more than 1 million security documents. As a result, Watson can help security analysts parse thousands of natural-language research reports that have never before been accessible to today’s security tools.
Security analysts typically sift through an average of 200,000 security events per day, wasting more than 20,000 hours per year chasing false positives. In the coming years, the number of events is expected to double, or even triple, globally.
IBM researchers want to make analysts’ work easier with Watson’s help by integrating it into IBM’s new Cognitive SOC platform, which brings advanced cognitive technologies into security operations and gives teams wider scope to respond to threats across endpoints, networks, users, and the cloud.
The highlight of the platform is IBM QRadar Advisor, the first tool to tap into Watson’s corpus of cybersecurity insights. The new app is already being beta tested by researchers at the University of New Brunswick.
David Shipley, director of strategic initiatives for information technology services at the University of New Brunswick, said his team seeks advice from Watson on 10 to 15 cyber threats each day.
California State Polytechnic University, Sun Life Financial, and the University of Rochester Medical Center are also testing the application.
To extend the Cognitive SOC’s reach to endpoints, IBM Security is also announcing a new Endpoint Detection and Response (EDR) solution called IBM BigFix Detect.
BigFix Detect gives organizations full visibility into the evolving threat landscape and helps bridge the gap between detecting malicious behavior and remediating it.