Hardware-software co-design (HSCD) of Electronic embedded systems

In the field of electronics, we see continuous advancements and changes in technology. These changes are driven not only by innovation but by demand as well. The continual integration of technology into every device in our personal and professional lives creates the need for smarter electronics. We expect more functionality from our devices as we place more and more demands on them.


Most electronic systems, whether self-contained or embedded, have a predominant digital component consisting of a hardware platform that executes software application programs. In the conventional design process, the hardware and software split of components is decided early, usually on an ad hoc basis, which leads to sub-optimal designs.


This often leads to difficulties at the end of the process, when integrating the entire system exposes incompatibilities across the hardware-software boundary. As a consequence, it can directly delay the product’s deployment and time-to-market. Most of all, this design process restricts the ability to explore hardware and software trade-offs, such as moving functionality from hardware to software and vice versa, and choosing how each function is implemented in its domain.


Embedded systems

An embedded system has three components:

It has embedded hardware.
It has embedded software.
It has a real-time operating system (RTOS) that supervises the application software and provides a mechanism to let the processor run processes according to a schedule, following a plan to control latencies. The RTOS defines the way the system works and sets the rules during the execution of the application software. A small-scale embedded device may not have an RTOS.

Powerful on-chip features, like data and instruction caches, programmable bus interfaces and higher clock frequencies, speed up performance significantly and simplify system design. These hardware fundamentals allow Real-time Operating Systems (RTOS) to be implemented, which leads to the rapid increase of total system performance and functional complexity.

Embedded hardware is based around microprocessors and microcontrollers and also includes memory, buses, input/output, and controllers, whereas embedded software includes embedded operating systems, applications, and device drivers.

The architecture of an embedded system includes sensors, analog-to-digital converters, memory, a processor, digital-to-analog converters, actuators, and so on. Two types of processor architecture are used in embedded systems: the Harvard architecture and the Von Neumann architecture.


Embedded Design

The process of embedded system design generally starts with a set of requirements for what the product must do and ends with a working product that meets all of the requirements. The requirements and product specification phase documents and defines the required features and functionality of the product. Marketing, sales, engineering, or any other individuals who are experts in the field and understand what customers need and will buy to solve a specific problem, can document product requirements.


Capturing the correct requirements gets the project off to a good start, minimizes the chances of future product modifications, and ensures there is a market for the product if it is designed and built. Good products solve real needs, have tangible benefits, and are easy to use.


Design Goals

The design of embedded systems can be subject to many different types of constraints or design goals. These include performance (overall speed and deadlines), functionality and user interface, timing, size, weight, power consumption, manufacturing cost, and reliability. The design process optimizes performance under constraints such as the size, weight, and power (SWaP) of the final product.


Hardware Software Tradeoff

Certain subsystems that are available in hardware on a microcontroller, such as the real-time clock, system clock, pulse width modulation, timers, and serial communication, can also be implemented in software. Hardware implementations increase the operation speed but may also increase power requirements.

A microcontroller featuring serial communication, a real-time clock, and timers may cost more than a microprocessor with external memory and a software implementation. However, the hardware implementation keeps the device driver code simple.
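To make the trade-off concrete, here is a minimal sketch of pulse width modulation done in software rather than by a hardware PWM block. It is a behavioral model only: the function name `software_pwm` and the idea of one list entry per timer tick are assumptions for illustration, not part of any real microcontroller API.

```python
def software_pwm(duty_percent, period_ticks, cycles=1):
    """Return the pin level (1 or 0) for each timer tick.

    Hypothetical model: on real hardware the CPU would set a GPIO pin
    on every timer tick, which is the work a hardware PWM block saves.
    """
    if not 0 <= duty_percent <= 100:
        raise ValueError("duty_percent must be between 0 and 100")
    # Number of ticks per period during which the pin is driven high.
    high_ticks = period_ticks * duty_percent // 100
    levels = []
    for _ in range(cycles):
        for tick in range(period_ticks):
            levels.append(1 if tick < high_ticks else 0)
    return levels


if __name__ == "__main__":
    # 25% duty cycle over an 8-tick period: high for 2 ticks, low for 6.
    print(software_pwm(25, 8))
```

In this model the processor is busy on every tick, which illustrates why a software implementation is cheap in parts but slower than dedicated hardware.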

Software implementation advantages
(i) Easier to change when new hardware versions become available
(ii) Programmability for complex operations
(iii) Faster development time
(iv) Modularity and portability
(v) Use of standard software engineering, modeling and RTOS tools.
(vi) Faster speed of operation of complex functions with high-speed microprocessors.
(vii) Less cost for simple systems

Hardware implementation advantages
(i) Reduced memory for the program
(ii) Reduced number of chips but at an increased cost
(iii) Internally embedded code, which is more secure than code in external ROM


System Architecture

System architecture defines the major blocks and functions of the system. Interfaces, bus structure, hardware functionality, and software functionality are determined. System designers use simulation tools, software models, and spreadsheets to determine the architecture that best meets the system requirements. System architects provide answers to questions such as, “How many packets/sec can this router design handle?” or “What is the memory bandwidth required to support two simultaneous MPEG streams?”


Hardware design can be based on microprocessors, field-programmable gate arrays (FPGAs), custom logic, etc. Working with microcontrollers (and microprocessors) is all about software-based embedded design. Microprocessors are often very efficient because they can use the same logic to perform many different functions, which simplifies the design of products. A microcontroller has its own instruction set, which remains fixed in size and operation. While working on microcontrollers, an engineer uses that instruction set, through either assembly language or embedded C, to solve computing tasks in a real-world application.

But there is another approach to embedded development as well: hardware-based embedded design. The field-programmable gate array (FPGA) was introduced in the mid-1980s by Xilinx, a company founded in 1984. FPGAs are integrated circuits that contain millions of logic gates that can be electrically configured (i.e. the gates are field programmable) to perform certain tasks.

Any computer, like a microcontroller, microprocessor, graphics processor or Application Specific Integrated Circuit (ASIC), is basically a digital electronic circuit that can perform certain tasks based on an instruction set. The instruction set contains the machine codes that the digital circuitry of the computer can execute on data stored and manipulated in registers or memory chips. The FPGA takes the design to the hardware level, where an engineer can design a (simple) computing device from the architecture level up and configure it to perform a specific application.

Though an FPGA can be used to design an ALU and other digital circuitry to perform simple computational tasks, it is no match for a microcontroller or microprocessor in general-purpose computing terms. A microprocessor or microcontroller is a true computing device with a complex architecture. However, an FPGA is quite comparable to an Application Specific Integrated Circuit, in that many ASIC functions can be custom designed and implemented on an FPGA.

Just as microcontrollers are programmed using assembly language or a high-level language (like C), FPGA chips are programmed using Verilog or VHDL. Just as C or assembly code is converted to machine code for execution on the respective CPU, VHDL code is synthesized into digital logic blocks that are then configured on the FPGA chip to build a custom computer for a specific application. Using VHDL or Verilog, an engineer designs the data path and ALU hardware from the ground up. Even a microprocessor or microcontroller can be designed on an FPGA, provided it has sufficient logic blocks to support such a design.


Traditional design

The first step (milestone 1) of architecture design is the specification of the embedded system with regard to functionality, power consumption, costs, etc. After completing this specification, a step called “partitioning” follows. The design is separated into two parts:
• A hardware part, that deals with the functionality implemented in hardware add-on components like ASICs or IP cores.
• A software part, that deals with code running on a microcontroller, running alone or together with a real-time-operating system (RTOS)

Microprocessor Selection. One of the most difficult steps in embedded system design can be the choice of the microprocessor. There are an endless number of ways to compare microprocessors, both technical and nontechnical. Important factors include performance, cost, power, software development tools, legacy software, RTOS choices, and available simulation models.

The second step, partitioning, is mostly based on the experience and intuition of the system designer. After completing this step, the complete hardware architecture is designed and implemented (milestones 3 and 4). Once the target hardware is available, the software part can be implemented.

The last step of this sequential methodology is the testing of the complete system, which means the evaluation of the behavior of all the hardware and software components. Unfortunately developers can only verify the correctness of their hardware/software partitioning in this late development phase. If there are any uncorrectable errors, the design flow must restart from the beginning, which can result in enormous costs.

At this time, our world is growing in complexity, and there is an emphasis on architectural improvements that cannot be achieved without hardware-software co-design. There is also an increasing need for our devices to be scalable to stay on par with both advancements and demand. Hardware-software co-design, with the assistance of machine learning, can help to optimize hardware and software in everything from IP to complex systems, based upon a knowledge base of what works best for which conditions.


Hardware/software co-design

The complexity of designing electronic systems and products is constantly increasing. The increasing complexity is due to factors such as portability, the increased complexity of software and hardware, and low-power and high-speed applications. Due to all these factors, electronic system design is moving towards System on Chip (SoC) with heterogeneous components like DSP, FPGA etc. This integration of hardware and software components is driving the move towards hardware/software co-design (HSCD).

Hardware/software co-design aims for the cooperation and unification of hardware and software components. It means meeting system-level objectives by exploiting the synergism of hardware and software through their concurrent design.

Most examples of systems today are either electronic in nature (e.g., information processing systems) or contain an electronic subsystem for monitoring and control (e.g., plant control). Many systems can be partitioned into a data unit and a control unit. The data unit performs operations on data elements, like addition and subtraction. The control unit controls the operations of the data unit by using control signals.
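The split between a data unit and a control unit can be sketched in a few lines. This is a toy model under stated assumptions: the operation names `ADD` and `SUB`, the accumulator, and the function names are all hypothetical and stand in for control signals and a data path.

```python
def data_unit(control_signal, a, b):
    """Data unit: performs the operation selected by the control signal."""
    if control_signal == "ADD":
        return a + b
    if control_signal == "SUB":
        return a - b
    raise ValueError(f"unknown control signal: {control_signal}")


def control_unit(program, accumulator=0):
    """Control unit: steps through a program and drives the data unit.

    Each program entry is a (control_signal, operand) pair.
    """
    for control_signal, operand in program:
        accumulator = data_unit(control_signal, accumulator, operand)
    return accumulator


if __name__ == "__main__":
    # 0 + 5 - 2
    print(control_unit([("ADD", 5), ("SUB", 2)]))
```

Either function could in principle be realized in hardware or in software, which is exactly the choice that the design methodologies discussed here have to make.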

The data and control units can be designed using software-only, hardware-only, or hardware/software co-design methodologies. The design methodology is selected based on non-functional constraints like area, speed, power, cost etc. The software design methodology suits systems with relaxed timing requirements and a limited area budget; it yields systems that use little area but run at low speed. To design a high-speed system, timing issues need to be considered. The hardware design methodology is one way to design high-speed systems, at the cost of more area than software designs.

Present SoC designs, which demand high speed, small area, portability, and low power, have created the need to combine the hardware and software design methodologies into what is called hardware/software co-design. Co-design can be defined as the process of designing and integrating different components onto a single IC or system. The components can be hardware components like ASICs, software components running on a microprocessor or microcontroller, electrical components, or mechanical components.

Hardware-software co-design has many benefits that will pay dividends now and in the future. For the PCB industry, it will increase manufacturing efficiency, the innovation of designs, lower cost, and shorten the time of prototypes to market. In terms of the use of machine learning, it also reduces input variation analysis by removing those variables that are already calculated to fail. This will shorten the development time of designs and improve those designs with the same amount of accuracy but at a lower cost.

Depending on your design parameters, you can reduce simulation times and still maintain the accuracy of your designs. As a by-product, hardware-software co-design optimizes designs, simulations, and overall analysis, reducing total production time to hours or days instead of weeks. These concepts are already in practice in automated production systems, the power grid, the automotive industry, and aviation, to name a few.

Sometimes, it is not technology that gets in the way. “It requires an organizational change,” says Saha. “You can’t have separate software and hardware teams that never talk to each other. That boundary must be removed. What we are seeing is that while many are still different teams, they report through the same hierarchy or they have much closer collaboration. I have seen cases where the hardware group has an algorithm person reporting to the same manager. This helps in identifying the implementability of the algorithm and allows them to make rapid iterations of the software.”


Hardware-software co-design process: Co-specification, Co-synthesis, and Co-simulation/Co-verification

Co-design focuses on the areas of system specification, architectural design, hardware-software partitioning, and iteration between hardware and software as the design progresses. Finally, co-design is complemented by hardware-software integration and testing.

Co-Specification: Developing system specification that describes hardware, software modules and relationship between the hardware and software

Co-Synthesis: Automatic and semi-automatic design of hardware and software modules to meet the specification

Co-Simulation and Co-verification: Simultaneous simulation of hardware and software


HW/SW Co-Specification

The first step in this approach focuses on a formal specification of the system design. This specification does not focus on concrete hardware or software architectures, like special microcontrollers or IP cores. Using methods from mathematics and computer science, like Petri nets, data flow graphs, state machines and parallel programming languages, this methodology tries to build a complete description of the system’s behavior.


The result is a decomposition of the system’s functional behavior. It takes the form of a set of components, each of which implements part of the global functionality. Due to the use of formal description methods, it is possible to find different alternatives for the implementation of these components.

The co-design of HW/SW systems may be viewed as composed of four main phases:

  1. Modeling
  2. Partitioning
  3. Co-Synthesis
  4. Co-Simulation


Modeling involves specifying the concepts and the constraints of the system to obtain a refined specification. This phase of the design also specifies a software and hardware model. The first problem is to find a suitable specification methodology for a target system. Some researchers favour a formal language that can yield provably-correct code.

There are three different paths the modeling process can take, considering its starting point:

  • An existing software implementation of the problem.
  • Existing hardware, for example a chip.
  • None of the above is given, only specifications leaving an open choice for a model.

Hierarchical Modelling methodology

Hierarchical modeling methodology calls for precisely specifying the system’s functionality and exploring system-level implementations.

To create a system-level design, the following steps should be taken:

  1. Specification capture: Decomposing functionality into pieces by creating a conceptual model of the system. The result is a functional specification, which lacks any implementation detail.
  2. Exploration: Exploration of design alternatives and estimating their quality to find the best suitable one.
  3. Specification: The specification as noted in 1. is now refined into a new description reflecting the decisions made during exploration as noted in 2.
  4. Software and hardware: For each of the components an implementation is created, using software and hardware design techniques.
  5. Physical design: Manufacturing data is generated for each component


There are many models for describing a system’s functionality:

  1. Dataflow graph. A dataflow graph decomposes functionality into data-transforming activities and the dataflow between these activities.
  2. Finite-State Machine (FSM). By this model the system is represented as a set of states and a set of arcs that indicate transition of the system from one state to another as a result of certain occurring events.
  3. Communicating Sequential Processes (CSP). This model decomposes the system into a set of concurrently executing processes, each of which executes its program instructions sequentially.
  4. Program-State Machine (PSM). This model combines FSM and CSP by permitting each state of a concurrent FSM to contain actions, described by program instructions.

Each model has its own advantages and disadvantages. No model is perfect for all classes of systems, so designers should choose the one whose characteristics match those of the system as closely as possible.
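As an illustration of the finite-state machine model listed above, the following minimal sketch represents states and event-driven transitions as a lookup table. The states, events, and the name `run_fsm` are hypothetical and chosen only for the example.

```python
# Transition table: (current state, event) -> next state.
# States and events are hypothetical, chosen only for illustration.
TRANSITIONS = {
    ("IDLE", "start"): "RUNNING",
    ("RUNNING", "pause"): "PAUSED",
    ("PAUSED", "start"): "RUNNING",
    ("RUNNING", "stop"): "IDLE",
    ("PAUSED", "stop"): "IDLE",
}


def run_fsm(events, state="IDLE"):
    """Feed a sequence of events to the FSM and return the final state.

    Events with no matching arc leave the state unchanged.
    """
    for event in events:
        state = TRANSITIONS.get((state, event), state)
    return state


if __name__ == "__main__":
    print(run_fsm(["start", "pause", "start", "stop"]))
```

Because the model is just a table, the same description could later be turned into either a software switch statement or a hardware state register, which is the point of specifying behavior before choosing an implementation.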


To specify functionality, several languages are commonly used by designers. VHDL and Verilog are very popular standards because their process and sequential-statement constructs make it easy to describe a CSP model. In general, designers choose a hardware-type language (e.g., VHDL, Verilog), a software-type language (C, C++, Handel-C, SystemC), or another formalism lacking a hardware or software bias (such as Codesign Finite State Machines).


Partitioning: how to divide specified functions between hardware, software and Interface

The next step is a process called hardware/software partitioning. The functional components found in step one can be implemented either in hardware or in software. The goal of the partitioning process is an evaluation of these hardware/software alternatives, given constraints such as time, size, cost and power. Depending on the properties of the functional parts, like the time complexity of algorithms, the partitioning process tries to find the best of these alternatives. This evaluation is based on metric functions, such as complexity or the cost of implementation.
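The evaluation of alternatives can be sketched as a small search problem. The example below is a deliberately naive exhaustive search, assuming functions run one after another and that each has a hypothetical estimate of software time, hardware time, and hardware area; real partitioning tools use far richer models and heuristics.

```python
from itertools import product

# Hypothetical estimates per function:
# (software time, hardware time, hardware area), in arbitrary units.
FUNCTIONS = {
    "filter": (40, 5, 30),
    "fft": (60, 8, 50),
    "control": (10, 4, 20),
}


def best_partition(functions, deadline):
    """Try every HW/SW assignment and keep the one with the least
    hardware area that still meets the deadline.

    Returns (area, time, assignment) or None if no assignment is feasible.
    Assumes functions execute sequentially, so their times add up.
    """
    names = list(functions)
    best = None
    for choice in product(("SW", "HW"), repeat=len(names)):
        total_time = 0
        total_area = 0
        for name, impl in zip(names, choice):
            sw_time, hw_time, hw_area = functions[name]
            if impl == "HW":
                total_time += hw_time
                total_area += hw_area
            else:
                total_time += sw_time
        if total_time <= deadline and (best is None or total_area < best[0]):
            best = (total_area, total_time, dict(zip(names, choice)))
    return best


if __name__ == "__main__":
    print(best_partition(FUNCTIONS, deadline=60))
```

With a deadline of 60 the search moves only `fft` to hardware. Exhaustive search grows as 2^n with the number of functions, which is one reason practical flows rely on heuristics or user-directed exploration.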


Recent reports indicate that automatic partitioning is currently making little headway, and that researchers are turning to semiautomatic “design space exploration,” relying on tools for fast evaluation of user-directed partitions.


In general, FPGA or ASIC-based systems consist of:

  • In-house HDL code
  • IP blocks from the FPGA or ASIC manufacturer
  • Purchased IP blocks

In addition, there are various software components, such as:

  • Low level device drivers
  • Possibly an operating system
  • Possibly a high-level API (Application Programming Interface)
  • The application software

Another important aspect is the central “Interface” submodule, which in normal system design is often left on the sideline, causing disastrous effects at the time of integration. Given that many embedded systems which use codesign methodologies are often implemented at a very low level of programming and details (e.g. assembly code), the proper development of an effective interface becomes extremely important, even more so from the view that any reconfiguration of the design will change the critical interface modules!


When designing System On Chip components, the definition of the hardware-software interface plays a major role. Especially for larger teams working on complex SoCs, it must be ensured that addresses are not assigned more than once and that the address assignment in the hardware matches the implementation on the software side.
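A simple consistency check along these lines can be automated. The sketch below assumes a register map given as hypothetical `(name, base, size)` tuples and reports address ranges assigned more than once, plus names whose base address differs between the hardware and software views.

```python
def find_overlaps(regions):
    """Return pairs of region names whose address ranges overlap.

    regions: list of (name, base_address, size_in_bytes) tuples.
    """
    ordered = sorted(regions, key=lambda region: region[1])
    overlaps = []
    for i, (name_a, base_a, size_a) in enumerate(ordered):
        end_a = base_a + size_a  # first address past region A
        for name_b, base_b, _ in ordered[i + 1:]:
            if base_b >= end_a:
                break  # sorted by base, so no later region can overlap A
            overlaps.append((name_a, name_b))
    return overlaps


def find_mismatches(hw_bases, sw_bases):
    """Return names whose base address differs between hardware and software."""
    names = hw_bases.keys() | sw_bases.keys()
    return sorted(n for n in names if hw_bases.get(n) != sw_bases.get(n))


if __name__ == "__main__":
    # Hypothetical peripheral map; DMA wrongly overlaps TIMER.
    regions = [
        ("UART", 0x4000_0000, 0x100),
        ("TIMER", 0x4000_0100, 0x100),
        ("DMA", 0x4000_0180, 0x80),
    ]
    print(find_overlaps(regions))
    print(find_mismatches({"UART": 0x4000_0000}, {"UART": 0x4000_0400}))
```

Generating both the HDL address decoder and the software header from one such description is a common way to keep the two sides in step.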


Cosynthesis: generating the hardware and software components

After a set of best alternatives is found, the next step is the implementation of the components. This includes hardware synthesis, software synthesis and interface synthesis.

Co-synthesis uses the available tools to synthesize the software, hardware and interface implementation. This is done concurrently with as much interaction as possible between the three implementations.

An essential goal of today’s research is to find and optimize algorithms for the evaluation of partitioning. Using these algorithms, it is theoretically possible to implement hardware/software co-design as an automated process.

Hardware components can be implemented in languages like VHDL, while software is coded using programming languages like Java, C or C++. Hardware synthesis is built on existing CAD tools, typically via VHDL or Verilog.

Software synthesis usually targets a high-level language. Co-design tools should generate hardware/software interprocess communication automatically and schedule software processes to meet timing constraints.
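The text does not name a scheduling method, so the following is only an illustration using a well-known technique swapped in for the purpose: the rate-monotonic utilization bound of Liu and Layland. For n periodic tasks with deadlines equal to their periods, fixed-priority rate-monotonic scheduling is guaranteed to meet all deadlines if the total utilization does not exceed n(2^(1/n) − 1). The task values are hypothetical.

```python
def rm_utilization_test(tasks):
    """Sufficient schedulability test for rate-monotonic scheduling.

    tasks: list of (worst_case_execution_time, period) pairs.
    Returns (utilization, bound, passes). A failing test is inconclusive:
    the task set may still be schedulable, but this bound cannot prove it.
    """
    if not tasks:
        raise ValueError("at least one task is required")
    n = len(tasks)
    utilization = sum(wcet / period for wcet, period in tasks)
    bound = n * (2 ** (1 / n) - 1)  # Liu and Layland bound
    return utilization, bound, utilization <= bound


if __name__ == "__main__":
    # Hypothetical task set: (execution time, period) in milliseconds.
    print(rm_utilization_test([(1, 4), (1, 5), (2, 10)]))
```

A co-design tool would run a check of this kind on each candidate partition, since moving a function into hardware removes its execution time from the processor's load.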

All potentially available components can be analyzed using criteria like functionality, technological complexity, or testability. The source of the criteria used can be data sheets, manuals, etc. The result of this stage is a set of components for potential use, together with a ranking of them.

DSP software is a particular challenge, since few good compilers exist for these idiosyncratic architectures. Retargetable compilers for DSPs, ASIPs (application-specific instruction-set processors), and processor cores are a special research problem.

The automatic generation of hardware from software has been a goal of academia and industry for several decades, and this led to the development of high-level synthesis (HLS).


The last step is system integration. System integration puts all hardware and software components together and evaluates if this composition complies with the system specification, done in step one. If not, the hardware/software partitioning process starts again.

Due to the algorithm-based concept of hardware/software co-design there are many advantages to this approach. The system design can be verified and modified at an early stage in the design flow process. Nevertheless, there are some basic restrictions which apply to the use of this methodology:

• Insufficient knowledge: Hardware/software codesign is based on the formal description of the system and a decomposition of its functionality. In order to commit to real applications, the system developer has to use available components, like IP-cores. Using this approach, it is necessary to describe the behavior and the attributes of these components completely. Due to the black-box nature of IP-cores, this is not possible in all cases.

• Degrees of freedom: Another building block of hardware/software codesign is the unrestricted substitution of hardware components by software components and vice versa. For real applications, there are only a few degrees of freedom with regard to the microcontroller, but for ASIC or IP-core components there is a much greater degree of freedom. This is because many more IP cores than microcontrollers are available for dedicated applications.


Co-simulation: evaluating the synthesized design

With recent aircraft crashes, there is an increasing need for better testing and diagnosis of faults before they become a problem, which leads to the need for better designs and design decision-making. One of the most effective ways to refine a design is through simulation. It saves time, lowers cost, increases safety, and improves the overall design.


Co-simulation executes all three components (hardware, software, and interface) together in real time. This phase helps verify the original design and its implied constraints by checking whether input and output data appear as expected.
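The idea can be sketched with a toy lockstep simulation in which a behavioral hardware model and a software model advance one cycle at a time and exchange data through a shared flag that stands in for the interface. Everything here is an assumption for illustration: the timer peripheral, the polling driver, and the names are hypothetical, and the loop runs in simulated time rather than real time.

```python
class TimerHardware:
    """Behavioral model of a hypothetical down-counting timer peripheral."""

    def __init__(self, reload):
        self.reload = reload
        self.count = reload
        self.irq_flag = 0  # the "interface": a status bit software can read

    def tick(self):
        """Advance the hardware model by one clock cycle."""
        if self.count == 0:
            self.irq_flag = 1
            self.count = self.reload
        else:
            self.count -= 1


class DriverSoftware:
    """Software model: polls the status bit and counts timer events."""

    def __init__(self):
        self.events = 0

    def step(self, hw):
        if hw.irq_flag:
            self.events += 1
            hw.irq_flag = 0  # acknowledge the event


def cosimulate(cycles, reload):
    """Run hardware and software models in lockstep; return events seen."""
    hw = TimerHardware(reload)
    sw = DriverSoftware()
    for _ in range(cycles):
        hw.tick()
        sw.step(hw)
    return sw.events


if __name__ == "__main__":
    # With reload=3 the timer fires every 4 cycles.
    print(cosimulate(cycles=12, reload=3))
```

Comparing the observed event count against the value expected from the specification is a small instance of checking that input and output data appear as expected.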


Verification: Does It Work?

Embedded system verification refers to the tools and techniques used to verify that a system does not have hardware or software bugs. Software verification aims to execute the software and observe its behavior, while hardware verification involves making sure the hardware performs correctly in response to outside stimuli and the executing software.


Validation: Did We Build the Right Thing?

Embedded system validation refers to the tools and techniques used to validate that the system meets or exceeds the requirements. Validation aims to confirm that the requirements in areas such as functionality, performance, and power are satisfied. It answers the question, “Did we build the right thing?” Validation confirms that the architecture is correct and the system is performing optimally.


Impact of AI and Machine Learning (ML)

Artificial Intelligence (AI) and Machine Learning (ML) technologies are changing the way we look at technology and our possible future.


The rapid development of AI has flipped the focus from a hardware-first to a software-first flow. “Understanding AI and ML software workloads is the critical first step to beginning to devise a hardware architecture,” says Lee Flanagan, CBO for Esperanto Technologies.


“Workloads in AI are abstractly described in models, and there are many different types of models across AI applications. These models are used to drive AI chip architectures. For example, ResNet-50 (Residual Networks) is a convolutional neural network, which drives the needs for dense matrix computations for image classification. Recommendation systems for ML, however, require an architecture that supports sparse matrices across large models in a deep memory system.”


Specialized hardware is required to deploy the software when it has to meet latency requirements. “Many AI frameworks were designed to run in the cloud because that was the only way you could get 100 processors or 1,000 processors,” says Imperas’ Davidmann. “What’s happening nowadays is that people want all this data processing in the devices at the endpoint, and near the edge in the IoT.


This is software/hardware co-design, where people are building the hardware to enable the software. They do not build a piece of hardware and see what software runs on it, which is what happened 20 years ago. Now they are driven by the needs of the software.”  “In AI, optimizing the hardware, AI algorithm, and AI compiler is a phase-coupled problem. They need to be designed, analyzed, and optimized together to arrive at an optimized solution. As a simple example, the size of the local memory in an AI accelerator determines the optimal loop tiling in the AI compiler,” says Tim Kogel, principal applications engineer at Synopsys.
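The link between local memory size and loop tiling can be made concrete with a small sketch. It assumes a matrix multiplication in which three square tiles (one each from the two inputs and the output) must fit in local memory at once; that residency assumption, the function names, and the sizes are illustrative and not taken from any particular accelerator or compiler.

```python
import math


def max_square_tile(local_mem_bytes, bytes_per_element, tiles_resident=3):
    """Largest tile edge T such that tiles_resident T-by-T tiles fit in memory."""
    return math.isqrt(local_mem_bytes // (tiles_resident * bytes_per_element))


def tiled_matmul(a, b, tile):
    """Multiply matrices a (n x k) and b (k x m) one tile at a time."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for k0 in range(0, k, tile):
                # Only these three tiles need to be in local memory here.
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        acc = c[i][j]
                        for kk in range(k0, min(k0 + tile, k)):
                            acc += a[i][kk] * b[kk][j]
                        c[i][j] = acc
    return c


if __name__ == "__main__":
    # Hypothetical accelerator: 64 KiB local memory, 4-byte elements.
    print(max_square_tile(64 * 1024, 4))
    print(tiled_matmul([[1, 2, 3], [4, 5, 6]], [[7, 8], [9, 10], [11, 12]], tile=2))
```

Changing the memory size changes the best tile size, so the hardware parameter and the compiler decision cannot be fixed independently.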


While AI is the obvious application, the trend is much more general than that. “As stated by Hennessy/Patterson, AI is clearly driving a new golden age of computer architecture,” says Synopsys’ Kogel. “Moore’s Law is running out of steam, and with a projected 1,000X growth of design complexity in the next 10 years, AI is asking for more than Moore can deliver. The only way forward is to innovate the computer architecture by tailoring hardware resources for compute, storage, and communication to the specific needs of the target AI application.”


Economics is still important, and that means that while hardware may be optimized for one task, it often has to remain flexible enough to perform others. “AI devices need to be versatile and morph to do different things,” says Cadence’s Young. “For example, surveillance systems can also monitor traffic. You can count how many cars are lined up behind a red light. But it only needs to recognize a cube, and the cube behind that, and aggregate that information. It does not need the resolution of a facial recognition. You can train different parts of the design to run at different resolution or different sizes. When you write a program for a 32-bit CPU, that’s it. Even if I was only using 8-bit data, it still occupies the entire 32-bit, pathway. You’re wasting the other bits. AI is influencing how the designs are being done.”


“AI applications demand a holistic approach,” says Esperanto’s Flanagan. “This spans everyone from low-power circuit designers to hardware designers, to architects, to software developers, to data scientists, and extending to customers, who best understand their important applications.”


Outside of AI, the same trend is happening in other domains, where the processing and communication requirements outpace the evolution of general-purpose compute. “In datacenters, a new class of processing units for infrastructure and data-processing task (IPU, DPU) have emerged,” adds Kogel. “These are optimized for housekeeping and communication tasks, which otherwise consume a significant portion of the CPU cycles. Also, the hardware of extreme low-power IoT devices is tailored for the software to reduce overhead power and maximize computational efficiency.”


As processing platforms become more heterogeneous, that makes the problem a lot more difficult. “You no longer have a simple ISA layer on which the software sits,” says Anoop Saha, senior manager for strategy and business development at Siemens EDA. “The boundaries have changed. Software algorithms should be easily directed toward a hardware endpoint. Algorithm guys should be able to write accelerator models. For example, they can use hardware datatypes to quantize their algorithms, and they should do this before they finalize their algorithms. They should be able to see if something is synthesizable or not. The implementability of an algorithm should inherently be a native concept to the software developer. We have seen some change in this area. Our algorithmic datatypes are open source, and we have seen around two orders of magnitude more downloads of that than the number of customers.”
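What quantizing an algorithm with a hardware datatype means can be sketched with a signed fixed-point format. This is a generic illustration and not the open-source datatype library mentioned in the quote: the function name, the 8-bit width, and the 4 fractional bits are assumptions.

```python
def quantize_fixed(value, total_bits=8, frac_bits=4):
    """Round a real value to a signed fixed-point format with saturation.

    The value is scaled by 2**frac_bits, rounded to an integer (Python's
    round() sends exact ties to the nearest even integer), clamped to the
    signed range of total_bits, and scaled back to a float.
    """
    scale = 1 << frac_bits
    lowest = -(1 << (total_bits - 1))
    highest = (1 << (total_bits - 1)) - 1
    raw = round(value * scale)
    raw = max(lowest, min(highest, raw))  # saturate instead of wrapping
    return raw / scale


if __name__ == "__main__":
    for x in (1.3, 100.0, -100.0):
        print(x, "->", quantize_fixed(x))
```

Running an algorithm on values passed through such a function shows early on how much accuracy is lost at a given bit width, before any hardware is committed.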



About Rajesh Uppal
