Volta (microarchitecture)


Volta is the codename, but not the trademark, for a GPU microarchitecture developed by Nvidia, succeeding Pascal. It was first announced on a roadmap in March 2013, although the first product was not announced until May 2017. The architecture is named after the 18th-19th century Italian chemist and physicist Alessandro Volta. It was Nvidia's first architecture to feature Tensor Cores, specially designed cores that deliver higher deep learning performance than regular CUDA cores. The architecture is produced on TSMC's 12 nm FinFET process. The Ampere microarchitecture is the successor to Volta.
The first graphics card to use it was the datacenter Tesla V100, offered for example as part of the Nvidia DGX-1 system. It has also been used in the Quadro GV100 and Titan V. There were no mainstream GeForce graphics cards based on Volta.
After two USPTO proceedings, on July 3, 2023, Nvidia lost its Volta trademark application in the field of artificial intelligence. The owner of the Volta trademark remains Volta Robots, a company specializing in AI and vision algorithms for robots and unmanned vehicles.

Details

Architectural improvements of the Volta architecture include the following:
  • CUDA Compute Capability 7.0
      • Concurrent execution of integer and floating point operations
  • TSMC's 12 nm FinFET process, allowing 21.1 billion transistors
  • High Bandwidth Memory 2 (HBM2)
  • NVLink 2.0: a high-bandwidth bus between the CPU and GPU, and between multiple GPUs. It allows much higher transfer speeds than those achievable with PCI Express and is estimated to provide 25 Gbit/s per lane.
  • Tensor cores: A tensor core is a unit that multiplies two 4×4 FP16 matrices and then adds a third FP16 or FP32 matrix to the result using fused multiply–add operations, producing an FP32 result that can optionally be demoted to FP16. Tensor cores are intended to speed up the training of neural networks. Volta's Tensor cores are first generation, while Ampere has third-generation Tensor cores.
  • PureVideo Feature Set I hardware video decoding
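The Tensor core operation described in the list above (D = A × B + C on 4×4 tiles) can be illustrated in software. The following is only a minimal sketch, not Nvidia's implementation: the helper names `to_fp16`, `to_fp32` and `tensor_core_mma` are hypothetical, the rounding of every accumulation step to FP32 is an assumption, and real hardware may accumulate in a different order.

```python
import struct

N = 4  # tile size of one Tensor core operation, per the description above


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_fp32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def tensor_core_mma(a, b, c, fp16_output=False):
    """Emulate D = A x B + C on 4x4 matrices (hypothetical helper, not an Nvidia API).

    A and B are rounded to FP16, C is held in FP32, and each accumulation
    step is rounded to FP32 (an assumption of this sketch). The result is
    optionally demoted to FP16.
    """
    d = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            acc = to_fp32(c[i][j])
            for k in range(N):
                # The product of two FP16 values is exactly representable in FP32.
                acc = to_fp32(acc + to_fp16(a[i][k]) * to_fp16(b[k][j]))
            d[i][j] = to_fp16(acc) if fp16_output else acc
    return d


# Usage: with A as the identity matrix, D is simply B + C.
identity = [[1.0 if i == j else 0.0 for j in range(N)] for i in range(N)]
b = [[float(i * N + j) for j in range(N)] for i in range(N)]
ones = [[1.0] * N for _ in range(N)]

d = tensor_core_mma(identity, b, ones)
print(d[0])  # [1.0, 2.0, 3.0, 4.0]
```

Keeping the accumulator in FP32 while the inputs are FP16 is what distinguishes this mixed-precision scheme from a pure FP16 multiply: the inputs lose precision when rounded (for example, 0.1 is not exactly representable in FP16), but the running sum does not lose further precision at FP16 granularity.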
Comparison of Compute Capability: GP100 vs GV100 vs GA100
GPU features                            | Nvidia Tesla P100 | Nvidia Tesla V100        | Nvidia A100
GPU codename                            | GP100             | GV100                    | GA100
GPU architecture                        | Nvidia Pascal     | Nvidia Volta             | Nvidia Ampere
Compute capability                      | 6.0               | 7.0                      | 8.0
Threads / warp                          | 32                | 32                       | 32
Max warps / SM                          | 64                | 64                       | 64
Max threads / SM                        | 2048              | 2048                     | 2048
Max thread blocks / SM                  | 32                | 32                       | 32
Max 32-bit registers / SM               | 65536             | 65536                    | 65536
Max registers / block                   | 65536             | 65536                    | 65536
Max registers / thread                  | 255               | 255                      | 255
Max thread block size                   | 1024              | 1024                     | 1024
FP32 cores / SM                         | 64                | 64                       | 64
Ratio of SM registers to FP32 cores     | 1024              | 1024                     | 1024
Shared Memory Size / SM                 | 64 KB             | Configurable up to 96 KB | Configurable up to 164 KB
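Several rows of the comparison above follow arithmetically from others. The short sketch below recomputes them from the per-SM limits listed for the V100; the dictionary keys and function names are illustrative and do not correspond to any CUDA API.

```python
# Per-SM limits copied from the comparison above (identical for all three GPUs).
SM_LIMITS = {
    "threads_per_warp": 32,
    "max_warps_per_sm": 64,
    "registers_per_sm": 65536,
    "fp32_cores_per_sm": 64,
}


def max_threads_per_sm(limits):
    """Resident threads per SM = resident warps x threads per warp."""
    return limits["max_warps_per_sm"] * limits["threads_per_warp"]


def registers_per_fp32_core(limits):
    """Ratio of SM registers to FP32 cores."""
    return limits["registers_per_sm"] // limits["fp32_cores_per_sm"]


def registers_per_thread_at_full_occupancy(limits):
    """Registers left per thread when every warp slot on the SM is occupied."""
    return limits["registers_per_sm"] // max_threads_per_sm(limits)


print(max_threads_per_sm(SM_LIMITS))                      # 2048
print(registers_per_fp32_core(SM_LIMITS))                 # 1024
print(registers_per_thread_at_full_occupancy(SM_LIMITS))  # 32
```

The last figure shows why the 255 registers-per-thread maximum cannot be combined with full occupancy: with 2048 resident threads sharing 65536 registers, each thread can use at most 32. The calculation ignores any allocation granularity the hardware may impose.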

Comparison of Precision Support Matrix
Legend:
  • FPnn: floating point with nn bits
  • INTn: integer with n bits
  • INT1: binary
  • TF32: TensorFloat32
  • BF16: bfloat16
Comparison of Decode Performance
Concurrent streams | H.264 decode | H.265 decode | VP9 decode
V100               | 16           | 22           | 22
A100               | 75           | 157          | 108

Products

Volta was announced as the GPU microarchitecture in the Xavier generation of Tegra SoCs, which focuses on self-driving cars.
During the keynote of its annual GPU Technology Conference on May 10, 2017, Nvidia officially announced the Volta microarchitecture along with the Tesla V100. The Volta GV100 GPU is built on a 12 nm process and uses HBM2 memory with 900 GB/s of bandwidth.
Nvidia officially announced the Nvidia TITAN V on December 7, 2017.
Nvidia officially announced the Quadro GV100 on March 27, 2018.

Application

Volta is also used in the Summit and Sierra supercomputers for GPGPU compute. The Volta GPUs connect to the POWER9 CPUs via NVLink 2.0, which supports cache coherency and is therefore expected to improve GPGPU performance.
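As a rough illustration of what the 25 Gbit/s per-lane NVLink 2.0 figure quoted under Details implies for such a system, the sketch below converts it into per-link and per-GPU bandwidth. Only the per-lane rate comes from this article; the lane count per direction and the number of links per GPU are assumptions made for the example, and encoding overhead is ignored.

```python
GBIT_PER_LANE = 25        # per-lane signalling rate quoted in this article
LANES_PER_DIRECTION = 8   # assumption: lanes per link in each direction
LINKS_PER_GPU = 6         # assumption: NVLink 2.0 links on one V100


def link_gbytes_per_direction(gbit_per_lane, lanes):
    """Raw bandwidth of one link in one direction, in GB/s (8 bits per byte)."""
    return gbit_per_lane * lanes / 8


per_direction = link_gbytes_per_direction(GBIT_PER_LANE, LANES_PER_DIRECTION)
bidirectional = 2 * per_direction
aggregate = LINKS_PER_GPU * bidirectional

print(per_direction)  # 25.0
print(bidirectional)  # 50.0
print(aggregate)      # 300.0
```

Under these assumptions a single link carries 25 GB/s in each direction, and the links of one GPU together carry 300 GB/s in both directions combined.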

V100 accelerator and DGX V100