Heterogeneous computing
Heterogeneous computing refers to systems that use more than one kind of processor or core. These systems gain performance or energy efficiency not just by adding the same type of processors, but by adding dissimilar coprocessors, usually incorporating specialized processing capabilities to handle particular tasks.
Heterogeneity
In the context of computing, heterogeneity usually refers to different instruction-set architectures (ISAs): the main processor has one ISA and the other processors have another, usually very different, architecture, not just a different microarchitecture. In the past, heterogeneous computing meant that different ISAs had to be handled differently. In a modern example, Heterogeneous System Architecture systems eliminate that difference while using multiple processor types, usually on the same integrated circuit, to provide the best of both worlds: GPUs handle general-purpose parallel processing, while CPUs run the operating system and perform traditional serial tasks.
The level of heterogeneity in modern computing systems is gradually increasing as further scaling of fabrication technologies allows for formerly discrete components to become integrated parts of a system-on-chip, or SoC. For example, many new processors now include built-in logic for interfacing with other devices, as well as programmable functional units and hardware accelerators.
Recent findings show that a heterogeneous-ISA chip multiprocessor that exploits the diversity offered by multiple ISAs can outperform the best same-ISA homogeneous architecture by as much as 21%, with 23% energy savings and a 32% reduction in Energy Delay Product. AMD's 2014 announcement of its pin-compatible ARM and x86 SoCs, codenamed Project Skybridge, suggested a heterogeneous-ISA chip multiprocessor in the making.
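The Energy Delay Product cited above is conventionally computed as the energy consumed multiplied by the execution time, so a lower value is better. The following is a minimal sketch of that calculation. The function names and all measurements are hypothetical and are not taken from the cited findings. The "as much as" figures quoted above are separate maxima, so they need not multiply out to one another.

```python
def energy_delay_product(energy_joules, delay_seconds):
    """Return the energy-delay product (EDP); lower is better."""
    return energy_joules * delay_seconds


def relative_reduction(baseline, candidate):
    """Return the fractional reduction of candidate relative to baseline."""
    return (baseline - candidate) / baseline


# Hypothetical measurements, chosen only to illustrate the arithmetic.
baseline_edp = energy_delay_product(10.0, 2.0)    # homogeneous design
candidate_edp = energy_delay_product(8.0, 1.75)   # heterogeneous design
reduction = relative_reduction(baseline_edp, candidate_edp)
print(f"EDP reduction: {reduction:.0%}")
```

Because EDP multiplies the two quantities, a design that saves energy but runs much slower can still end up with a worse EDP than the baseline.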
Heterogeneous CPU topology
A system with heterogeneous CPU topology is a system where the same ISA is used, but the cores themselves differ in speed. The setup is more similar to a symmetric multiprocessor. There are typically two types of cores: a higher-performance core, usually known as a "big" core or P-core, and a more power-efficient core, usually known as a "small" core or E-core. The terms P-core and E-core are usually used in relation to Intel's implementation of heterogeneous computing, while the terms big and little cores are usually used in relation to the ARM architecture. Some processors have three categories of core: prime, performance, and efficiency cores, with prime cores having higher performance than performance cores. In that case a prime core is known as "big", a performance core as "medium", and an efficiency core as "small". A common use of such a topology is to provide better power efficiency, especially in mobile SoCs.
- ARM big.LITTLE is the prototypical case, where faster high-power cores are combined with slower low-power cores.
- Apple has produced Apple silicon SoCs with similar organization.
- Intel has also produced hybrid x86-64 chips codenamed Lakefield, although not without major limitations in instruction set support. The newer Alder Lake reduces the sacrifice by adding more instruction set support to the "small" core.
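As an illustration of how software might reason about such a topology, the sketch below groups cores by a relative capacity figure and picks a fast core for demanding work and a slow core otherwise. It is a toy model, not how any particular operating system scheduler works. The function names and the capacity numbers are assumptions made for the example.

```python
from collections import defaultdict


def group_cores_by_capacity(capacities):
    """Group core IDs by relative capacity, fastest group first.

    `capacities` maps a core ID to a relative performance figure
    (hypothetical numbers; higher means faster).
    """
    groups = defaultdict(list)
    for core_id, capacity in capacities.items():
        groups[capacity].append(core_id)
    # Order groups from highest to lowest capacity; sort IDs within each group.
    return [sorted(groups[cap]) for cap in sorted(groups, reverse=True)]


def pick_core(capacities, demanding):
    """Pick a core from the fastest group for demanding work, else the slowest."""
    groups = group_cores_by_capacity(capacities)
    return groups[0][0] if demanding else groups[-1][0]


# Hypothetical 2 + 4 layout: two "big" cores and four "small" cores.
example = {0: 1024, 1: 1024, 2: 400, 3: 400, 4: 400, 5: 400}
clusters = group_cores_by_capacity(example)
print(clusters)
print(pick_core(example, demanding=True), pick_core(example, demanding=False))
```

A three-tier design with prime, performance, and efficiency cores would simply produce three groups from the same function.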
Challenges
Heterogeneous computing systems present new challenges not found in typical homogeneous systems. The presence of multiple processing elements raises all of the issues involved with homogeneous parallel processing systems, while the level of heterogeneity in the system can introduce non-uniformity in system development, programming practices, and overall system capability. Areas of heterogeneity can include:
- ISA or instruction-set architecture
- ABI or application binary interface
- API or application programming interface
- Low-level implementation of language features
- Memory interface and hierarchy
- Interconnect
- Performance
- Development tools
- Data partitioning
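To make the data partitioning challenge concrete, the sketch below splits a batch of work items across dissimilar devices in proportion to their relative throughput. This is only one simple static strategy under stated assumptions: the device names and the integer throughput weights are hypothetical, and a real system would also have to account for data transfer costs and differences in the memory hierarchy.

```python
def partition_work(total_items, throughputs):
    """Split total_items across devices in proportion to relative throughput.

    `throughputs` maps a device name to a positive integer weight
    (hypothetical values; higher means faster).
    """
    total_weight = sum(throughputs.values())
    # Give every device the floor of its proportional share.
    shares = {
        name: (total_items * weight) // total_weight
        for name, weight in throughputs.items()
    }
    # Hand the leftover items, one each, to the fastest devices.
    remainder = total_items - sum(shares.values())
    fastest_first = sorted(throughputs, key=throughputs.get, reverse=True)
    for name in fastest_first[:remainder]:
        shares[name] += 1
    return shares


# Hypothetical device mix with made-up relative throughputs.
weights = {"gpu": 6, "big_cpu": 3, "small_cpu": 1}
plan = partition_work(1000, weights)
print(plan)
```

The leftover from the floor divisions is always smaller than the number of devices, so giving one extra item to each of the fastest devices is enough to assign every item.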
Example hardware
Heterogeneous computing hardware can be found in every domain of computing, from high-end servers and high-performance computing machines all the way down to low-power embedded devices, including mobile phones and tablets.
- High-performance computing
  - Cydra-5
  - Cray XD1
  - SRC Computers SRC-6 and SRC-7
- Embedded systems
  - Texas Instruments OMAP
  - Analog Devices Blackfin
  - Qualcomm Snapdragon
  - Nvidia Tegra
  - Samsung Exynos
  - Apple "A" series
  - Movidius Myriad vision processing units, which include several symmetric processors complemented by fixed-function units and a pair of SPARC-based controllers
  - HiSilicon Kirin SoCs
  - MediaTek SoCs
  - Cadence Design Systems Tensilica DSPs
- Reconfigurable computing
  - Xilinx field-programmable gate arrays and the Zynq and Versal platforms
  - Intel "Stellarton"
- Networking
  - Intel IXP network processors
- General-purpose computing, gaming, and entertainment devices
  - Intel Sandy Bridge, Ivy Bridge, and Haswell CPUs
  - AMD Excavator and Ryzen APUs
  - IBM Cell, found in the PlayStation 3
    - SpursEngine, a variant of the IBM Cell processor
  - Emotion Engine, found in the PlayStation 2
  - ARM big.LITTLE/DynamIQ CPU architecture
    - Nearly all ARM vendors offer heterogeneous solutions: ARM, Qualcomm, Nvidia, Apple, Samsung, HiSilicon, MediaTek, etc.