P6 (microarchitecture)
The P6 microarchitecture is the sixth-generation Intel x86 microarchitecture, first implemented in the Pentium Pro microprocessor in 1995. It was partially succeeded by the NetBurst microarchitecture used by the Pentium 4 in 2000, though it continued to be used in new processors through the mid-2000s. The Core microarchitecture, a derivative of the Enhanced Pentium M variant of P6, would later succeed both P6 and NetBurst.
P6 was used within Intel's mainstream offerings from the Pentium Pro to Pentium III, and was widely known for low power consumption, excellent integer performance, and relatively high instructions per cycle.
Features
The P6 core was the sixth-generation Intel microprocessor in the x86 line. The first implementation of the P6 core was the Pentium Pro CPU in 1995, the immediate successor to the original Pentium design. P6 processors dynamically translate IA-32 instructions into sequences of buffered RISC-like micro-operations, then analyze and reorder the micro-operations to detect parallelizable operations that may be issued to more than one execution unit at once. The Pentium Pro was the first x86 microprocessor designed by Intel to use this technique, though the NexGen Nx586, introduced in 1994, did so earlier.
Other features first implemented in the x86 space in the P6 core include:
- Speculative execution and out-of-order execution, which required new retire units in the execution core. This lessened pipeline stalls, and in part enabled greater speed-scaling of the Pentium Pro and successive generations of CPUs.
- Superpipelining, which lengthened the pipeline from the Pentium's 5 stages to 14 stages in the Pentium Pro and early models of the Pentium III. The pipeline was later shortened to fewer than 10 stages in the Pentium M for the embedded and mobile markets, because of the energy inefficiency and higher-voltage issues encountered in its predecessor. It was then lengthened again, to 10 to 12 stages, in the designs leading to the Core 2, because raising clock speeds had become difficult and improved fabrication processes could offset some of the extra power consumption of a deeper pipeline.
- A front-side bus using a variant of Gunning transceiver logic to enable four discrete processors to share system resources.
- Physical Address Extension and a wider 36-bit address bus to support 64 GB of physical memory.
- Register renaming, which enabled more efficient execution of multiple instructions in the pipeline.
- CMOV instructions, which are heavily used in compiler optimization.
- Other new instructions: FCMOV, FCOMI/FCOMIP/FUCOMI/FUCOMIP, RDPMC, UD2.
- New instructions in Pentium II Deschutes core: MMX, FXSAVE, FXRSTOR.
- New instructions in Pentium III: Streaming SIMD Extensions.
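The CMOV item in the list above can be illustrated with a branch-free select. The following is a minimal sketch in C, not taken from any Intel documentation: the function names `select_min` and `clamp_to_range` are hypothetical, and whether a given compiler actually emits CMOV for these ternary expressions depends on the compiler, target, and optimization level.

```c
#include <assert.h>
#include <stdint.h>

/* Pick the smaller of two values with a data-dependent select.
 * A compiler targeting P6 or later MAY lower this ternary to a
 * compare followed by CMOV instead of a conditional jump; that
 * decision belongs to the compiler and is an assumption here. */
int32_t select_min(int32_t a, int32_t b)
{
    return (b < a) ? b : a;
}

/* Clamp x into the range [lo, hi] using two selects in sequence.
 * Neither select needs a branch in principle. */
int32_t clamp_to_range(int32_t x, int32_t lo, int32_t hi)
{
    assert(lo <= hi);                 /* precondition of this sketch */
    int32_t t = (x < lo) ? lo : x;    /* raise x to at least lo */
    return (t > hi) ? hi : t;         /* lower the result to at most hi */
}
```

For example, `select_min(3, 7)` evaluates to 3 and `clamp_to_range(15, 0, 10)` evaluates to 10. Inspecting the generated assembly is the only way to confirm which instructions a particular build uses.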
P6-based chips
- Celeron
- Pentium Pro
- Pentium II Overdrive
- Pentium II
- Pentium II Xeon
- Pentium III
- Pentium III Xeon
P6 Variant Pentium M
Design Overview
- Quad-pumped front-side bus. With the initial Banias core, Intel adopted the 400 MT/s FSB first used in Pentium 4. The Dothan core moved to the 533 MT/s FSB, following Pentium 4's evolution.
- Larger L1/L2 cache. The L1 cache doubled from the predecessor's 32 KB to 64 KB in all models. The L2 cache was initially 1 MB in the Banias core and grew to 2 MB in the Dothan core, with dynamic cache activation by a quadrant selector from sleep states.
- Streaming SIMD Extensions 2 (SSE2) support.
- A 10- or 12-stage enhanced instruction pipeline that allows for higher clock speeds while retaining the efficiency advantages associated with a shorter pipeline.
- Dedicated register stack management.
- An overhauled branch predictor, which added global history, indirect prediction, and loop prediction to the branch prediction table and removed local prediction.
- Micro-operation fusion of certain sub-instructions. This allows some x86 instructions to result in fewer micro-operations, which allows individual instructions to execute faster and can increase the number of instructions executed simultaneously.
The first Pentium M family processors support PAE internally but do not report the PAE support flag in their CPUID information. As a result, some operating systems whose kernels require PAE refuse to boot on these processors. Windows 8 and later refuse to boot on them for the same reason, since they specifically require PAE support to run properly.
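The CPUID check behind this boot failure can be sketched as a test of one bit. The following C fragment is an illustration only: it decodes a value assumed to have been read from the EDX register after executing CPUID with leaf 1, where PAE is reported in bit 6. The helper names `edx_has_feature` and `kernel_requiring_pae_would_boot` are hypothetical, and reading the register itself (for instance through a compiler intrinsic) is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions of selected feature flags in EDX for CPUID leaf 1. */
enum {
    CPUID_EDX_PAE  = 6,   /* Physical Address Extension */
    CPUID_EDX_CMOV = 15,  /* conditional move instructions */
    CPUID_EDX_MMX  = 23,
    CPUID_EDX_FXSR = 24,  /* FXSAVE / FXRSTOR */
    CPUID_EDX_SSE  = 25,
    CPUID_EDX_SSE2 = 26
};

/* Return true if the given feature bit is set in an EDX value. */
bool edx_has_feature(uint32_t edx, unsigned bit)
{
    assert(bit < 32);
    return ((edx >> bit) & 1u) != 0;
}

/* Hypothetical boot-time policy: a kernel that requires PAE looks
 * only at the reported flag, not at what the silicon can really do. */
bool kernel_requiring_pae_would_boot(uint32_t edx)
{
    return edx_has_feature(edx, CPUID_EDX_PAE);
}
```

With an illustrative EDX value that has the CMOV, MMX, and SSE2 bits set but bit 6 clear, `kernel_requiring_pae_would_boot` returns false even though the processor could execute PAE paging, which is the situation described above. The value is made up for the example and is not a register dump from a real Pentium M.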
Banias/Dothan variant
- Celeron M
- Pentium M
- A100/A110
- EP80579
- CE 3100
P6 Variant Enhanced Pentium M
- SSE3 support
- Single- and dual-core technology with 2 MB of shared L2 cache
- Increased FSB speed, with the FSB running at 533 MT/s or 667 MT/s.
- A 12-stage instruction pipeline.
Yonah variant
- Celeron M 400 series
- Core Solo/Duo
- Pentium Dual-Core T2060/T2080/T2130
- Xeon LV/ULV
Successor
- A 14-stage instruction pipeline that allows for higher clock speeds.
- SSE4.1 support for all Core 2 models manufactured at a 45 nm lithography.
- Support for the 64-bit x86-64 architecture, which was previously only offered by Prescott processors, the last architectural installment of the Pentium 4.
- Increased FSB speed, ranging from 533 MT/s to 1600 MT/s.
- Increased L2 cache size, with the L2 cache size ranging from 1 MB to 12 MB.
- Dynamic Front Side Bus Throttling, where the speed of the FSB is cut in half, which by extension cuts the processor's speed in half. This puts the processor in a low-power mode called Super Low Frequency Mode, which helps extend battery life.
- Dynamic Acceleration Technology for some mobile Core 2 Duo processors, and Dual Dynamic Acceleration Technology for mobile Core 2 Quad processors. Dynamic Acceleration Technology allows the CPU to overclock one processor core while turning off the other. In Dual Dynamic Acceleration Technology, two cores are deactivated and the other two are overclocked. The feature is triggered when an application uses only a single core on a Core 2 Duo, or up to two cores on a Core 2 Quad. The overclocking is performed by increasing the clock multiplier by 1.
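The clock arithmetic behind the last two items can be sketched as follows. This is a minimal illustration in C, assuming the usual relationship in which the core clock equals the bus clock times the multiplier, and a quad-pumped bus that performs four transfers per bus clock. The function names are hypothetical and the multiplier values used with them are illustrative, not taken from any specific processor model.

```c
#include <assert.h>

/* Quad-pumped bus: four transfers per bus clock cycle,
 * so the bus clock in MHz is the transfer rate in MT/s divided by 4. */
double fsb_base_clock_mhz(double transfer_rate_mts)
{
    assert(transfer_rate_mts > 0.0);
    return transfer_rate_mts / 4.0;
}

/* Core clock = bus clock x multiplier. */
double core_clock_mhz(double base_clock_mhz, double multiplier)
{
    return base_clock_mhz * multiplier;
}

/* Dynamic Acceleration as described above: multiplier raised by 1. */
double accelerated_core_clock_mhz(double base_clock_mhz, double multiplier)
{
    return core_clock_mhz(base_clock_mhz, multiplier + 1.0);
}

/* Dynamic FSB throttling as described above: the bus clock is halved
 * and the multiplier is unchanged, so the core clock halves too. */
double throttled_core_clock_mhz(double base_clock_mhz, double multiplier)
{
    return core_clock_mhz(base_clock_mhz / 2.0, multiplier);
}
```

With an 800 MT/s bus (a 200 MHz bus clock) and a hypothetical multiplier of 10, the sketch gives 2000 MHz nominally, 2200 MHz with the multiplier raised by one, and 1000 MHz with the bus clock halved.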