ARM Cortex-A76
The ARM Cortex-A76 is a central processing unit core implementing the 64-bit ARMv8.2-A architecture, designed by Arm Holdings' design center in Austin, Texas. Compared to its predecessor, the Cortex-A75, ARM claimed performance improvements of up to 25% in integer operations and 35% in floating-point operations.
Design
The Cortex-A76 is a successor to both the Cortex-A73 and Cortex-A75, though it is based on an entirely new microarchitecture. It features a 4-wide decode, out-of-order, superscalar pipeline. The frontend can fetch and decode four instructions per cycle and dispatch up to four macro-operations and eight micro-operations per cycle. The out-of-order execution window includes 128 entries. The backend includes eight execution ports, with a pipeline depth of 13 stages and execution latencies of 11 stages.
The Cortex-A76 supports unprivileged 32-bit applications, but privileged software, such as operating systems and kernels, must use the 64-bit ARMv8-A instruction set. Additional features include support for ARMv8.3-A's LDAPR instructions, ARMv8.4-A's dot product instructions, and ARMv8.5-A's speculative execution controls such as SSBS, CSDB, SSBB, and PSSBB.
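As an illustration of how software might check for some of these optional features, the following is a minimal sketch that parses the "Features" line that Linux exposes in /proc/cpuinfo on AArch64 systems. The flag names (lrcpc, asimddp, ssbs) are assumed to follow the Linux kernel's hwcap naming, the helper names are hypothetical, and the sample text is made up for the example rather than captured from real hardware.

```python
# Hypothetical mapping from Linux "Features" flag names (assumed hwcap naming)
# to the architectural features mentioned in the text above.
A76_OPTIONAL_FEATURES = {
    "lrcpc": "ARMv8.3-A LDAPR (RCpc load-acquire)",
    "asimddp": "ARMv8.4-A SIMD dot product",
    "ssbs": "ARMv8.5-A Speculative Store Bypass Safe",
}


def parse_features(cpuinfo_text):
    """Return the set of flags on the first 'Features' line, or an empty set."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Features":
            return set(value.split())
    return set()


def report_a76_features(cpuinfo_text):
    """Map each flag of interest to True/False depending on whether it is listed."""
    flags = parse_features(cpuinfo_text)
    return {flag: flag in flags for flag in A76_OPTIONAL_FEATURES}


# Illustrative input only; not taken from a real device.
SAMPLE = """processor : 0
Features : fp asimd aes crc32 atomics fphp asimdhp lrcpc dcpop asimddp
CPU part : 0xd0b
"""

if __name__ == "__main__":
    for flag, present in report_a76_features(SAMPLE).items():
        status = "present" if present else "absent"
        print(f"{flag}: {status} ({A76_OPTIONAL_FEATURES[flag]})")
```

On a real system the same functions could be applied to the contents of /proc/cpuinfo; whether a given flag is reported also depends on the kernel version and the specific SoC, so an absent flag does not by itself prove the hardware lacks the feature.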
Memory bandwidth is improved by up to 90% over the Cortex-A75. ARM targeted the Cortex-A76 for high-performance computing, including Windows 10 laptops, positioning it as a competitor to Intel’s Kaby Lake architecture.
The Cortex-A76 also supports ARM DynamIQ technology, and is often paired with energy-efficient Cortex-A55 cores in multi-core configurations.
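In such mixed configurations, software sometimes needs to tell the core types apart. A minimal sketch, assuming the standard MIDR_EL1 field layout (implementer in bits 31:24, variant in 23:20, architecture in 19:16, part number in 15:4, revision in 3:0) and the part numbers 0xD0B for the Cortex-A76 and 0xD05 for the Cortex-A55, could look like this. The function names and example register values are hypothetical.

```python
ARM_IMPLEMENTER = 0x41  # implementer code for Arm-designed cores

# Assumed part numbers for the cores discussed in this article.
PART_NAMES = {
    0xD05: "Cortex-A55",
    0xD0A: "Cortex-A75",
    0xD0B: "Cortex-A76",
}


def decode_midr(midr):
    """Split a 32-bit MIDR_EL1 value into its architectural fields."""
    return {
        "implementer": (midr >> 24) & 0xFF,
        "variant": (midr >> 20) & 0xF,
        "architecture": (midr >> 16) & 0xF,
        "part": (midr >> 4) & 0xFFF,
        "revision": midr & 0xF,
    }


def core_name(midr):
    """Return a core name for known Arm part numbers, else 'unknown'."""
    fields = decode_midr(midr)
    if fields["implementer"] != ARM_IMPLEMENTER:
        return "unknown"
    return PART_NAMES.get(fields["part"], "unknown")


if __name__ == "__main__":
    # Illustrative values for a hypothetical cluster of two core types.
    for midr in (0x410FD0B0, 0x410FD050):
        print(hex(midr), core_name(midr))
```

Semi-custom derivatives may report a different implementer code and part number, so a lookup like this would classify them as unknown unless the table is extended.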
Usage
The Cortex-A76 is available as a semiconductor intellectual property core and can be licensed by manufacturers for integration into custom system on a chip designs. It is commonly combined with other components such as graphics processing units, digital signal processors, and image signal processors on a single chip.
The Cortex-A76 first appeared in the HiSilicon Kirin 980 SoC. The company's later Kirin 985 and 990 series of SoCs would also use the A76.
ARM collaborated with Qualcomm on semi-custom versions of the Cortex-A76 used in several of its Kryo CPU designs, including the Kryo 495, Kryo 485, Kryo 470, and Kryo 460. Qualcomm made several architectural modifications, such as increasing the reorder buffer to expand the out-of-order execution window.
Other SoCs using the Cortex-A76 include:
- Allwinner: A733
- Broadcom: BCM2712 (four A76 cores; used in the Raspberry Pi 5)
- Google: Tensor
- HiSilicon: Kirin 810, 820, 980, 985, 990
- Intel: Agilex D-series SoC FPGAs
- MediaTek: MT6781, MT6785V, MT6789, MT6833V/P, MT6853V/T, MT6873, MT6875, MT8192; Dimensity 800, 820, 6020, 6080, 6100+, 6300, 6400; Helio G90, G90T, G95, G99
- Microsoft: SQ1, SQ2
- Qualcomm: Snapdragon 480, 675, 678, 720G, 730, 732G, 765, 768G, 855, 860, 7c, 8c, 8cx
- Rockchip: RK3588, RK3588S
- Samsung: Exynos 990, Exynos Auto V9
- UNISOC: S713, S752, S762, S913, T750, T760, T765, T770, T820, T8100, T8200, T9100