Ampere (microarchitecture)


Ampere is the codename for a graphics processing unit microarchitecture developed by Nvidia as the successor to both the Volta and Turing architectures. It was officially announced on May 14, 2020, and is named after French mathematician and physicist André-Marie Ampère.
Nvidia announced the Ampere architecture GeForce 30 series consumer GPUs at a GeForce Special Event on September 1, 2020. Nvidia announced the A100 80 GB GPU at SC20 on November 16, 2020. Mobile RTX graphics cards and the RTX 3060 based on the Ampere architecture were revealed on January 12, 2021.
At GPU Technology Conference 2021, Nvidia previewed an architecture under the placeholder name "Ampere Next Next" for a 2024 release. Ampere's direct successor, Hopper, was announced at GTC 2022.

Details

Architectural improvements of the Ampere architecture include the following:
  • CUDA Compute Capability 8.0 for A100 and 8.6 for the GeForce 30 series
  • TSMC's 7 nm FinFET process for A100
  • Custom version of Samsung's 8 nm process for the GeForce 30 series
  • Third-generation Tensor Cores with FP16, bfloat16, TensorFloat-32 and FP64 support and sparsity acceleration. Each Tensor Core performs 256 FP16 FMA operations per clock, four times the throughput of a Tensor Core from previous generations; the Tensor Core count is reduced to one per SM processing block (four per SM).
  • Second-generation ray tracing cores; concurrent ray tracing, shading, and compute for the GeForce 30 series
  • High Bandwidth Memory 2 on A100 40 GB & A100 80 GB
  • GDDR6X memory for GeForce RTX 3090, RTX 3080 Ti, RTX 3080, RTX 3070 Ti
  • Double FP32 cores per SM on GA10x GPUs
  • NVLink 3.0 with a throughput of 50 Gbit/s per signal pair
  • PCI Express 4.0 with SR-IOV support
  • Multi-Instance GPU (MIG) virtualization and spatial GPU partitioning on A100, supporting up to seven instances
  • PureVideo feature set K hardware video decoding with AV1 hardware decoding for the GeForce 30 series and feature set J for A100
  • Five NVDEC units on A100
  • New hardware-based five-core JPEG decoder on A100 supporting YUV420, YUV422, YUV444, YUV400 and RGBA; not to be confused with Nvidia NVJPEG
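
The NVLink figure in the list above is quoted per signal pair. As a rough sketch of how that scales to per-link and per-GPU bandwidth, the snippet below assumes four signal pairs per link in each direction and twelve links on one A100; both counts come from Nvidia's published A100 figures, not from this list, and the variable names are illustrative only.

```python
# Back-of-the-envelope NVLink 3.0 bandwidth, starting from the per-pair rate.
GBIT_PER_PAIR = 50        # Gbit/s per signal pair (from the list above)
PAIRS_PER_DIRECTION = 4   # assumption: signal pairs per link, per direction
LINKS_PER_GPU = 12        # assumption: NVLink links on one A100


def link_bandwidth_gbyte_s(gbit_per_pair: int, pairs: int) -> float:
    """One-direction bandwidth of a single link in GB/s (8 bits per byte)."""
    return gbit_per_pair * pairs / 8


per_direction = link_bandwidth_gbyte_s(GBIT_PER_PAIR, PAIRS_PER_DIRECTION)
per_link_bidirectional = 2 * per_direction
per_gpu_total = per_link_bidirectional * LINKS_PER_GPU

print(f"per link, one direction: {per_direction:.0f} GB/s")
print(f"per link, both directions: {per_link_bidirectional:.0f} GB/s")
print(f"per GPU, all links: {per_gpu_total:.0f} GB/s")
```

Under these assumptions a single link carries 25 GB/s in each direction, and twelve links give 600 GB/s of total bidirectional bandwidth per GPU.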

Chips

  • GA100
  • GA102
  • GA103
  • GA104
  • GA106
  • GA107
  • GA10B
Comparison of Compute Capability: GP100 vs GV100 vs GA100

GPU features | Nvidia Tesla P100 | Nvidia Tesla V100 | Nvidia A100
GPU codename | GP100 | GV100 | GA100
GPU architecture | Pascal | Volta | Ampere
Compute capability | 6.0 | 7.0 | 8.0
Threads / warp | 32 | 32 | 32
Max warps / SM | 64 | 64 | 64
Max threads / SM | 2048 | 2048 | 2048
Max thread blocks / SM | 32 | 32 | 32
Max 32-bit registers / SM | 65536 | 65536 | 65536
Max registers / block | 65536 | 65536 | 65536
Max registers / thread | 255 | 255 | 255
Max thread block size | 1024 | 1024 | 1024
FP32 cores / SM | 64 | 64 | 64
Ratio of SM registers to FP32 cores | 1024 | 1024 | 1024
Shared memory size / SM | 64 KB | Configurable up to 96 KB | Configurable up to 164 KB
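
The per-SM limits in the table above bound how many warps can be resident at once. The following is a simplified sketch of that calculation, not Nvidia's occupancy calculator: it ignores shared memory usage and register allocation granularity, and the names `SMLimits`, `resident_warps` and `occupancy` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SMLimits:
    """Per-SM limits copied from the table above (identical for all three GPUs)."""
    threads_per_warp: int = 32
    max_warps: int = 64
    max_blocks: int = 32
    max_registers: int = 65536
    max_registers_per_thread: int = 255
    max_block_size: int = 1024
    fp32_cores: int = 64


def resident_warps(block_size: int, regs_per_thread: int,
                   lim: SMLimits = SMLimits()) -> int:
    """Warps that fit on one SM, limited by block, warp and register counts."""
    if not 0 < block_size <= lim.max_block_size:
        raise ValueError("block size out of range")
    if not 0 < regs_per_thread <= lim.max_registers_per_thread:
        raise ValueError("registers per thread out of range")
    warps_per_block = -(-block_size // lim.threads_per_warp)  # ceiling division
    regs_per_block = warps_per_block * lim.threads_per_warp * regs_per_thread
    blocks = min(
        lim.max_blocks,                       # block-count limit
        lim.max_warps // warps_per_block,     # warp-count limit
        lim.max_registers // regs_per_block,  # register-file limit
    )
    return blocks * warps_per_block


def occupancy(block_size: int, regs_per_thread: int,
              lim: SMLimits = SMLimits()) -> float:
    """Resident warps as a fraction of the SM's warp limit."""
    return resident_warps(block_size, regs_per_thread, lim) / lim.max_warps


if __name__ == "__main__":
    limits = SMLimits()
    # The "ratio of SM registers to FP32 cores" row is this quotient.
    print(limits.max_registers // limits.fp32_cores)
    for regs in (32, 64, 128):
        print(regs, resident_warps(256, regs), occupancy(256, regs))
```

With 256-thread blocks, 32 registers per thread allows the full 64 resident warps, while 64 registers per thread halves that to 32 because the register file becomes the limiting resource.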

Comparison of Precision Support Matrix
Legend:
  • FPnn: floating point with nn bits
  • INTn: integer with n bits
  • INT1: binary
  • TF32: TensorFloat32
  • BF16: bfloat16
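
To make the legend concrete, the sketch below lists the commonly documented bit layouts of these floating-point formats and emulates the reduced mantissa of TF32 and BF16 in software. It is an illustration only: it truncates the FP32 mantissa, whereas real hardware conversion typically rounds, and the helper names are hypothetical.

```python
import struct

# (exponent bits, explicit mantissa bits); every format also has one sign bit.
FORMATS = {
    "FP64": (11, 52),
    "FP32": (8, 23),
    "TF32": (8, 10),   # FP32 range with FP16-like precision
    "BF16": (8, 7),
    "FP16": (5, 10),
}


def truncate_fp32_mantissa(x: float, keep_bits: int) -> float:
    """Convert x to FP32, then zero all but the top keep_bits mantissa bits."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    mask = 0xFFFFFFFF & ~((1 << (23 - keep_bits)) - 1)
    return struct.unpack("<f", struct.pack("<I", bits & mask))[0]


def to_tf32(x: float) -> float:
    """Emulate TF32 precision (10 mantissa bits) by truncation."""
    return truncate_fp32_mantissa(x, FORMATS["TF32"][1])


def to_bf16(x: float) -> float:
    """Emulate BF16 precision (7 mantissa bits) by truncation."""
    return truncate_fp32_mantissa(x, FORMATS["BF16"][1])


if __name__ == "__main__":
    for name, (exp_bits, man_bits) in FORMATS.items():
        print(f"{name}: 1 sign + {exp_bits} exponent + {man_bits} mantissa bits")
    print(to_tf32(3.14159265), to_bf16(3.14159265))
```

Because TF32 and BF16 keep the eight-bit exponent of FP32, they cover the same numeric range as FP32 and differ from it only in precision.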
Comparison of Decode Performance

Concurrent streams | H.264 decode | H.265 decode | VP9 decode
V100 | 16 | 22 | 22
A100 | 75 | 157 | 108
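
The per-codec stream counts above (16, 22 and 22 for V100; 75, 157 and 108 for A100) can be turned into relative gains with a few lines of arithmetic. The dictionary layout and the `speedup` helper below are illustrative only.

```python
# Concurrent decode streams per GPU, as listed in the table above.
STREAMS = {
    "V100": {"H.264": 16, "H.265": 22, "VP9": 22},
    "A100": {"H.264": 75, "H.265": 157, "VP9": 108},
}


def speedup(codec: str) -> float:
    """How many times more concurrent streams A100 decodes than V100."""
    return STREAMS["A100"][codec] / STREAMS["V100"][codec]


if __name__ == "__main__":
    for codec in STREAMS["V100"]:
        print(f"{codec}: {speedup(codec):.1f}x")
```

By this measure A100 handles roughly 4.7 times as many H.264 streams, 7.1 times as many H.265 streams and 4.9 times as many VP9 streams as V100.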

Ampere dies

A100 accelerator and DGX A100

The Ampere-based A100 accelerator was announced and released on May 14, 2020. The A100 features 19.5 teraflops of FP32 performance, 6912 FP32/INT32 CUDA cores, 3456 FP64 CUDA cores, 40 GB of graphics memory, and 1.6 TB/s of graphics memory bandwidth. The accelerator was initially available only in the third generation of the DGX server, the DGX A100, which contains eight A100s. The DGX A100 also includes 15 TB of PCIe Gen 4 NVMe storage, two 64-core AMD Rome 7742 CPUs, 1 TB of RAM, and a Mellanox-powered HDR InfiniBand interconnect. Its initial price was $199,000.
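
The quoted FP32 figure can be approximately reproduced from the core count. The sketch below assumes a boost clock of 1.41 GHz and one fused multiply-add (two floating-point operations) per core per clock; the clock is not stated in the text above and is taken from Nvidia's published A100 specifications.

```python
# Figures from the paragraph above.
FP32_CORES = 6912
FP64_CORES = 3456
GPUS_PER_DGX = 8
MEMORY_PER_GPU_GB = 40

# Assumptions, not stated in the paragraph above.
BOOST_CLOCK_GHZ = 1.41   # assumed A100 boost clock
FLOPS_PER_FMA = 2        # one multiply plus one add per core per clock


def peak_tflops(cores: int, clock_ghz: float) -> float:
    """Theoretical peak throughput in TFLOPS for FMA-capable cores."""
    return cores * FLOPS_PER_FMA * clock_ghz / 1000


fp32_peak = peak_tflops(FP32_CORES, BOOST_CLOCK_GHZ)
fp64_peak = peak_tflops(FP64_CORES, BOOST_CLOCK_GHZ)
dgx_memory_gb = GPUS_PER_DGX * MEMORY_PER_GPU_GB

print(f"FP32 peak: {fp32_peak:.1f} TFLOPS")
print(f"FP64 peak (CUDA cores only): {fp64_peak:.1f} TFLOPS")
print(f"GPU memory in one DGX A100: {dgx_memory_gb} GB")
```

Under these assumptions the FP32 result rounds to the 19.5 teraflops quoted above, and the eight 40 GB accelerators in a DGX A100 provide 320 GB of GPU memory in total.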

Products using Ampere

GeForce MX series
  • GA107: GeForce MX570

GeForce 20 series
  • GA107: GeForce RTX 2050

GeForce 30 series
  • GA107: GeForce RTX 3050 Laptop, GeForce RTX 3050, GeForce RTX 3050 Ti Laptop
  • GA106: GeForce RTX 3050, GeForce RTX 3060 Laptop, GeForce RTX 3060
  • GA104: GeForce RTX 3060, GeForce RTX 3060 Ti, GeForce RTX 3070 Laptop, GeForce RTX 3070, GeForce RTX 3070 Ti Laptop, GeForce RTX 3070 Ti, GeForce RTX 3080 Laptop
  • GA103: GeForce RTX 3060 Ti, GeForce RTX 3080 Ti Laptop
  • GA102: GeForce RTX 3070 Ti, GeForce RTX 3080, GeForce RTX 3080 Ti, GeForce RTX 3090, GeForce RTX 3090 Ti

Nvidia Workstation GPUs
  • GA107: RTX A1000, RTX A2000
  • GA106: RTX A2000
  • GA104: RTX A3000, RTX A4000, RTX A5000
  • GA103: RTX A5500
  • GA102: RTX A4500, RTX A5000, RTX A5500, RTX A6000

Nvidia Data Center GPUs
  • GA107: Nvidia A2, Nvidia A16
  • GA102: Nvidia A10, Nvidia A40
  • GA100: Nvidia A30, Nvidia A100

Tegra SoCs
  • GA10B: AGX Orin, Orin NX, Orin Nano