A graphics processing unit (GPU) is a specialized electronic circuit designed to rapidly manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display device. Modern GPUs are highly efficient at computer graphics and image processing. GPUs were traditionally tasked with compute-intensive, floating-point graphics functions such as 3D rendering and texture mapping. Their highly parallel structure makes them more efficient than general-purpose central processing units (CPUs) for algorithms that process large blocks of data in parallel.
In a personal computer, a GPU can be present on a video card or embedded on the motherboard; in some systems, the GPU is integrated directly on the CPU die. GPUs are used in embedded systems, mobile phones, personal computers, workstations, and game consoles.
Deep learning is a subset of machine learning that can build accurate predictive models directly from raw, unstructured data. This method uses layered networks of algorithms, loosely modeled on the neural networks of the brain, to distill and correlate large amounts of data. Generally, the more data you feed your network, the more accurate the model becomes.
You can, in principle, train deep learning models using sequential processing. However, the amount of data needed and the length of processing make it impractical, if not impossible, to train models without parallel processing. Parallel processing enables multiple data objects to be processed at the same time, drastically reducing training time, and it is typically accomplished with GPUs.
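The idea above can be illustrated with a minimal data-parallel sketch. The function names and the chunking scheme here are purely illustrative (not from any specific framework): the same independent operation is applied to separate chunks of a dataset concurrently, which is what a GPU does at the scale of thousands of hardware threads.

```python
# Illustrative data parallelism: apply the same operation to
# independent chunks of a dataset at the same time. A GPU would run
# thousands of such workers in hardware; a thread pool sketches the idea.
from concurrent.futures import ThreadPoolExecutor

def normalize_chunk(chunk):
    """Scale a chunk of values into [0, 1] -- a typical per-element step."""
    lo, hi = min(chunk), max(chunk)
    return [(x - lo) / (hi - lo) for x in chunk]

def parallel_map(func, data, n_workers=4):
    """Split `data` into chunks and process them concurrently."""
    size = (len(data) + n_workers - 1) // n_workers
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = pool.map(func, chunks)
    # Flatten the per-chunk results back into one list.
    return [x for chunk in results for x in chunk]

out = parallel_map(normalize_chunk, list(range(16)))
```

Because every chunk is independent, adding workers (or GPU threads) shortens wall-clock time without changing the result of any individual operation.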
GPUs are specialized processors designed for parallel workloads. For suitable tasks they can provide significant advantages over traditional CPUs, including speedups of 10x or more. Typically, one or more GPUs are built into a system alongside the CPUs: the CPUs handle complex or general-purpose tasks, while the GPUs handle specific, highly repetitive processing tasks.
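The "highly repetitive" tasks mentioned above are typically element-wise kernels such as SAXPY (y = a·x + y). The sketch below shows the loop a CPU would execute sequentially; on a GPU, each independent iteration would instead be assigned to its own lightweight thread, so all elements are processed at once.

```python
# SAXPY (y = a*x + y), the archetypal data-parallel kernel.
# A CPU walks this loop element by element; a GPU launches one
# thread per element, because no iteration depends on another.
def saxpy(a, x, y):
    return [a * xi + yi for xi, yi in zip(x, y)]

result = saxpy(2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
# result == [12.0, 24.0, 36.0]
```

The independence of iterations is exactly what makes such loops trivially parallelizable, and why GPUs outperform CPUs on them.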
The rapid growth of data-intensive applications demands more powerful parallel computing systems capable of processing large amounts of data efficiently and effectively. While GPU-based systems are commonly used for such parallel processing, exponentially rising data volumes can easily saturate even the largest single GPU. One possible solution is to exploit multi-GPU systems.
A multi-GPU setup uses two similar or identical GPUs together for better graphics performance. There have been two such consumer setups: SLI for Nvidia GPUs and CrossFireX for AMD GPUs. Multi-GPU configurations are usually a last resort, because they consume considerably more power while delivering well short of twice the performance of a single GPU; a single powerful GPU is generally recommended over two slower ones.

Another problem is PCIe lanes: the lanes that connect the CPU to the GPU. When two GPUs share the same number of lanes, the bandwidth available to each is halved, reducing the communication speed between each GPU and the CPU. This can cause problems such as micro-stuttering, where frames are delivered at uneven intervals even when the average frame rate looks high. In a multi-GPU system, the main bottleneck is therefore the interconnect, which is currently based on PCIe or NVLink technologies.
Nvidia and TSMC working on multi-GPU solutions based on silicon photonics
TSMC is involved in an R&D project led by Nvidia to use TSMC's silicon photonics (SiPh) integration technology, COUPE (compact universal photonic engine), to interconnect multiple AI GPUs, according to industry sources. The interconnected GPUs would benefit from low-latency data transmission and significantly reduced signal loss.
DigiTimes reports that COUPE uses CMOS circuitry similar to that found in digital cameras. SiPh chips can be combined with CMOS processes and integrated as co-packaged optics (CPO), allowing multiple AI GPUs to be interconnected within a chip-on-wafer-on-substrate (CoWoS) 2.5D package. This approach benefits from the low latency of optical data transmission and significantly reduces signal loss across a larger group of interconnected GPUs.
The first GPUs built with TSMC's SiPh technology are not expected to launch for at least a few years. Intel was apparently also planning to develop a similar technology with Taiwanese epiwafer maker LandMark Optoelectronics. TSMC reportedly avoided patent infringement by placing the laser light transceivers externally rather than internally, which reduces transfer speeds but ensures a higher yield rate.