The Defense Advanced Research Projects Agency (DARPA) is known for pushing the boundaries of technology, and its latest initiative, the MOCHA (Machine-learning Optimized Compiler for Heterogeneous Architectures) program, is no exception. Aimed at revolutionizing the way compilers are designed and used, MOCHA seeks to address the growing complexity of modern computing environments, which increasingly rely on a diverse array of specialized processors and accelerators. This article delves into the technical aspects of the MOCHA program, exploring its goals, structure, and the groundbreaking research it aims to foster.
Introduction
For decades, advancements in microprocessor technology were synonymous with increases in clock speeds. However, as this scaling reached its limits in the mid-2000s, the industry shifted focus to architectural innovations such as multicore processors and multithreading. While these approaches have extended performance gains, they have also introduced new challenges, particularly as modern systems incorporate a variety of specialized co-processors and accelerators designed for specific tasks.
Traditional compilers are not designed to generate efficient machine code for such heterogeneous ensembles of Central Processing Units (CPUs), Graphics Processing Units (GPUs), and other accelerators. Instead, software developers write bespoke code and libraries to exploit specialized hardware, which reduces productivity and forfeits much of the potential benefit of these hardware components. Extending compilers to handle this heterogeneity is currently a manual task performed by compiler experts. Adapting current compilers is time-consuming and error-prone, does not solve the underlying problem of exploiting hardware accelerators, and does not improve our ability to upgrade mission-critical systems in a timely manner.
The MOCHA program is designed to overcome these limitations by developing a new generation of compiler technology that leverages machine learning (ML) and advanced optimization techniques. The goal is to create compilers capable of automatically adapting to new hardware configurations, maximizing the performance of systems comprising multiple heterogeneous computational elements.
Background
Traditional compilers operate on a well-established “hourglass” architecture. At the top, language-specific front ends transform source code into a single intermediate representation (IR). This IR is then optimized before being translated into machine code by various hardware-specific backends. While this model has served the industry well, it struggles with the complexity of today’s computing environments, where a program may need to be executed across a mix of CPUs, GPUs, digital signal processors (DSPs), and other specialized hardware.
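To make the hourglass model concrete, the following Python sketch mimics its three stages with toy functions. Every name here is illustrative and stands in for machinery that real compilers implement in far greater depth:

```python
# Toy sketch of the "hourglass" compiler architecture: many front ends narrow
# to one shared intermediate representation (IR), which fans out to many
# hardware-specific backends. All names are illustrative.

def c_frontend(source: str) -> list[str]:
    """Toy front end: lower each statement to a generic IR instruction."""
    return [f"ir.eval({stmt.strip()})" for stmt in source.split(";") if stmt.strip()]

def optimize(ir: list[str]) -> list[str]:
    """Toy middle end: drop obvious no-ops from the IR."""
    return [inst for inst in ir if "noop" not in inst]

def cpu_backend(ir: list[str]) -> list[str]:
    """Toy backend: translate each IR instruction to a pseudo machine op."""
    return [inst.replace("ir.eval", "cpu.exec") for inst in ir]

ir = optimize(c_frontend("x = 1; noop; y = x + 2"))
machine_code = cpu_backend(ir)
```

The narrow waist is the point: adding a language only requires a new front end, and adding a hardware target only a new backend. The difficulty MOCHA targets is that each new backend is still hand-built.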
Today’s alphabet soup of CPUs, GPUs, Tensor Processing Units (TPUs), Field-Programmable Gate Arrays (FPGAs), Application-Specific Integrated Circuits (ASICs), and Systems-on-a-Chip (SoCs) will grow even more complex as new packaging technologies, such as chiplets and 3D integration, enable rapid architectural revisions: one hardware component can be swapped for another, and wholly new components incorporated with ease.
The traditional approach to compiler development is becoming increasingly unsustainable. Each component of a compiler is meticulously hand-crafted and fine-tuned, a process that is both time-consuming and prone to error. Moreover, as the variety of hardware accelerators and their complexity increases, maintaining and extending compilers to support new technologies is becoming prohibitively expensive.
The MOCHA program will address these limitations by 1) using data-driven methods, ML, and advanced optimization techniques to rapidly adapt compilers to new hardware components with little human effort, and 2) developing new internal representations and programming languages that enable compilers to determine how to make optimal use of available hardware, rather than depending on humans to do so.
Without this capability, the Department of Defense (DoD) and the commercial world remain constrained by current compiler technologies and lack the ability to fully and rapidly capitalize on emerging specialized hardware.
Technical Area: Compiler Technology
The MOCHA program focuses on a single technical area: Compiler Technology. Within this broad domain, proposals are encouraged to address specific components of the compilation process or the entire toolchain. The program is driven by the need to automate the generation and optimization of performance models for different computational elements (CEs), leveraging ML and data-driven approaches to minimize human intervention.
Key areas of interest within the MOCHA program include:
Front-End Development
- Hardware-Agnostic Domain-Specific Languages: The front-end development focuses on creating domain-specific languages (DSLs) that are not tied to any specific hardware architecture. This hardware-agnostic approach allows for greater flexibility and portability, enabling the same code to be efficiently executed across different types of hardware.
- Automatic Partitioning and Mapping: These DSLs facilitate the automatic partitioning of computations into discrete modules or tasks. The compiler can then automatically map these tasks to the most suitable hardware components, whether they be CPUs, GPUs, or specialized accelerators. This automation reduces the need for manual intervention and ensures that the hardware is utilized optimally.
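A minimal sketch of such automatic mapping, assuming a hypothetical static affinity table where a real system would use a learned placement policy (the task kinds and device names are illustrative, not from the program description):

```python
# Hypothetical sketch of automatic task-to-hardware mapping: each task is
# classified by kind, and an affinity table stands in for a learned policy.

TASKS = [
    {"name": "parse_input", "kind": "control"},
    {"name": "matmul", "kind": "dense_linear_algebra"},
    {"name": "fft", "kind": "signal_processing"},
]

# Illustrative affinity table: which computational element suits each kind.
AFFINITY = {
    "control": "CPU",
    "dense_linear_algebra": "GPU",
    "signal_processing": "DSP",
}

def map_tasks(tasks, affinity, fallback="CPU"):
    """Assign each task to the computational element best suited to its kind."""
    return {t["name"]: affinity.get(t["kind"], fallback) for t in tasks}

placement = map_tasks(TASKS, AFFINITY)
```

Because the source program is written in a hardware-agnostic DSL, the same task list could be re-mapped unchanged when, say, the DSP is replaced by an FPGA.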
Middle-End Optimization
- Expanding Intermediate Representation (IR): Traditional compilers use a single IR to optimize code before generating machine code. However, this approach is insufficient for complex, heterogeneous systems. MOCHA aims to expand this by introducing multiple intermediate forms, each tailored to specific types of computational elements (CEs), such as GPUs, CPUs, or specialized accelerators.
- Machine Learning for Optimization: This stage will also apply ML techniques to optimize the selection and ordering of these intermediate forms. By learning from performance data, the compiler can make more informed decisions about which optimizations to apply and in what sequence, improving the overall efficiency and performance of the compiled code.
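One way to picture ML-guided ordering is as a search over pass sequences ranked by a learned cost model. In this toy sketch the pass names and the "learned" model are illustrative assumptions; a real system would train the model on measured performance data rather than hard-code it:

```python
# Toy sketch of ML-guided pass ordering in the middle end. The cost model
# below is a hand-written stand-in for a model trained on performance data.
import itertools

PASSES = ["inline", "vectorize", "unroll"]

def predicted_cost(sequence):
    """Stand-in for a learned cost model over pass orderings (lower is better)."""
    cost = 100.0
    for i, p in enumerate(sequence):
        if p == "inline":
            cost -= 20
        elif p == "vectorize":
            # The model has "learned" that vectorization pays off more
            # once inlining has already exposed loop bodies.
            cost -= 25 if "inline" in sequence[:i] else 10
    return cost

# Exhaustively rank all orderings; a real compiler would search far larger
# spaces with learned guidance rather than enumeration.
best = min(itertools.permutations(PASSES), key=predicted_cost)
```

Even this three-pass example shows why ordering matters: the same passes in a different sequence yield a worse predicted result, and the number of orderings grows factorially with the number of passes.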
Back-End Code Generation
- Learning-Based Performance Models: The back-end of the compiler is responsible for generating the final machine code that runs on the hardware. MOCHA proposes using learning-based performance models to guide this process. These models will be trained on various criteria, such as throughput, power consumption, and memory footprint, to ensure that the generated code meets the desired performance objectives.
- Multi-Criteria Optimization: By considering multiple performance metrics, the compiler can balance trade-offs between them, such as optimizing for speed while minimizing power usage. This holistic approach ensures that the compiled code is not only efficient but also well-suited to the specific constraints of the hardware on which it will run.
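As a sketch of what multi-criteria selection might look like, the snippet below scores candidate code-generation strategies on several predicted metrics and picks the best weighted trade-off. The metric values and weights are illustrative assumptions standing in for trained performance models:

```python
# Hypothetical multi-criteria cost function for back-end code selection.
# Each candidate variant is scored on predicted throughput, power, and
# memory footprint; the weights encode a mission-specific trade-off.

CANDIDATES = {
    "scalar":      {"throughput": 1.0,  "power_w": 5.0,  "memory_mb": 10.0},
    "vectorized":  {"throughput": 4.0,  "power_w": 9.0,  "memory_mb": 12.0},
    "gpu_offload": {"throughput": 20.0, "power_w": 60.0, "memory_mb": 200.0},
}

# Illustrative weights: reward throughput, penalize power and memory,
# as a power-constrained embedded deployment might.
WEIGHTS = {"throughput": 1.0, "power_w": -0.3, "memory_mb": -0.01}

def score(metrics, weights):
    """Weighted sum of predicted metrics (higher is better)."""
    return sum(weights[k] * metrics[k] for k in weights)

best = max(CANDIDATES, key=lambda name: score(CANDIDATES[name], WEIGHTS))
```

With these weights the fastest option (GPU offload) loses to the vectorized CPU variant because of its power and memory cost, which is exactly the kind of trade-off a single-metric compiler cannot express.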
End-to-End Optimization
- Frameworks and Representations: End-to-end optimization involves developing new frameworks and representations that can manage the vast search space of potential implementations for a given program. The goal is to find the best possible configuration that maximizes performance across all relevant metrics.
- Navigating Complex Search Spaces: The complexity of modern hardware environments means that there are many possible ways to implement a program. MOCHA’s end-to-end optimization techniques will help navigate this complexity, ensuring that the compiler can identify the most efficient way to execute a given piece of code, taking into account all the available hardware resources.
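The search-space problem can be sketched by enumerating implementation choices and ranking them with a predicted-performance model. Here the devices, tile sizes, and runtime model are all illustrative assumptions:

```python
# Sketch of end-to-end search over implementation choices: every combination
# of device placement and tile size is one point in the search space, and a
# stubbed performance model ranks them.
from itertools import product

DEVICES = ["CPU", "GPU"]
TILE_SIZES = [8, 16, 32, 64]

def predicted_runtime_ms(device, tile):
    """Stand-in for a learned end-to-end performance model."""
    base = 50.0 if device == "CPU" else 15.0
    # In this toy model, mid-sized tiles best fit cache/shared memory.
    return base + abs(tile - 32) * 0.5

space = list(product(DEVICES, TILE_SIZES))
best = min(space, key=lambda cfg: predicted_runtime_ms(*cfg))
```

Two devices and four tile sizes give only eight points, but each added choice multiplies the space, so realistic programs cannot be searched exhaustively; this combinatorial growth is why MOCHA pairs new representations with learned guidance.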
Program Structure
The MOCHA program is structured as a 36-month effort, divided into three performance years, each with specific goals and milestones. Proposals should assume an April 2025 start date, with regular assessments to ensure alignment with program objectives.
- Performance Year 1: Focus on back-end technology, specifically the rapid construction of performance models that guide code generation.
- Performance Year 2: Integration of data-driven and ML techniques to inform optimization selection and the construction of relevant intermediate forms.
- Performance Year 3: Development of front-end techniques for partitioning source code and mapping it to the most appropriate CEs.
Throughout the program, the experimental setup for assessment will scale up along several axes: the number of CEs, the number of distinct CE types, the number of optimization criteria, and the size of the source code being compiled.
Conclusion
The MOCHA program represents a bold step forward in compiler technology, addressing the critical need for tools that can efficiently handle the complexity of modern heterogeneous computing environments. By leveraging machine learning and advanced optimization techniques, DARPA aims to develop compilers that can automatically adapt to new hardware, unlocking the full potential of emerging technologies. As the MOCHA program progresses, it promises to deliver groundbreaking advancements that will benefit both the Department of Defense and the broader technology industry.