3DNow!
3DNow! is a deprecated extension to the x86 instruction set developed by Advanced Micro Devices (AMD). It adds single instruction, multiple data (SIMD) instructions to the base x86 instruction set, enabling vector processing of floating-point operations using vector registers. This improves the performance of many graphics-intensive applications. The first microprocessor to implement 3DNow! was the AMD K6-2, introduced in 1998. In appropriate applications, this enhancement raised the speed by about 2–4 times.
However, the instruction set never gained much popularity, and AMD announced in August 2010 that support for 3DNow! would be dropped in future AMD processors, except for two instructions, PREFETCH and PREFETCHW. These two instructions are also available in Bay Trail Intel processors.
History
3DNow! was developed at a time when 3D graphics were becoming mainstream in PC multimedia and games. Realtime display of 3D graphics depended heavily on the host CPU's floating-point unit to perform floating-point calculations, a task in which AMD's K6 processor was easily outperformed by its competitor, the Intel Pentium II.
As an enhancement to the MMX instruction set, the 3DNow! instruction set augmented the MMX SIMD registers to support common arithmetic operations on single-precision floating-point data. Software written to use AMD's 3DNow! instead of the slower x87 FPU could execute up to four times faster, depending on the instruction mix.
Versions
3DNow!
The first implementation of 3DNow! technology contains 21 new instructions that support SIMD floating-point operations. The 3DNow! data format is packed, single-precision floating-point. The 3DNow! instruction set also includes SIMD integer operations, data prefetch, and faster MMX-to-floating-point switching. Later, Intel would add similar instructions to the Pentium III, known as SSE.
3DNow! floating-point instructions are the following:
- PI2FD: Packed 32-bit integer to floating-point conversion
- PF2ID: Packed floating-point to 32-bit integer conversion
- PFCMPGE: Packed floating-point comparison, greater or equal
- PFCMPGT: Packed floating-point comparison, greater
- PFCMPEQ: Packed floating-point comparison, equal
- PFACC: Packed floating-point accumulate
- PFADD: Packed floating-point addition
- PFSUB: Packed floating-point subtraction
- PFSUBR: Packed floating-point reverse subtraction
- PFMIN: Packed floating-point minimum
- PFMAX: Packed floating-point maximum
- PFMUL: Packed floating-point multiplication
- PFRCP: Packed floating-point reciprocal approximation
- PFRSQRT: Packed floating-point reciprocal square root approximation
- PFRCPIT1: Packed floating-point reciprocal, first iteration step
- PFRSQIT1: Packed floating-point reciprocal square root, first iteration step
- PFRCPIT2: Packed floating-point reciprocal/reciprocal square root, second iteration step
3DNow! integer instructions are the following:
- PAVGUSB: Packed 8-bit unsigned integer averaging
- PMULHRW: Packed 16-bit integer multiply with rounding
3DNow! performance-enhancement instructions are the following:
- FEMMS: Faster entry/exit of the MMX or floating-point state
- PREFETCH/PREFETCHW: Prefetch at least a 32-byte line into L1 data cache
3DNow! extensions
There is little or no evidence that the second version of 3DNow! was ever officially given its own trade name. This has led to some confusion in documentation that refers to this new instruction set. The most common terms are Extended 3DNow!, Enhanced 3DNow! and 3DNow!+. The phrase "Enhanced 3DNow!" can be found in a few locations on the AMD website, but the capitalization of "Enhanced" appears to be either purely grammatical or used for emphasis on processors that may or may not have these extensions.
This extension to the 3DNow! instruction set was introduced with the first-generation Athlon processors. The Athlon added five new 3DNow! instructions and 19 new MMX instructions. Later, the K6-2+ and K6-III+ included the five new 3DNow! instructions, leaving out the 19 new MMX instructions. The new 3DNow! instructions were added to boost digital signal processing (DSP). The new MMX instructions were added to boost streaming media.
The 19 new MMX instructions are a subset of Intel's SSE instruction set. In AMD technical manuals, AMD separates these instructions from the 3DNow! extensions. In AMD customer product literature, however, this separation is less clear, and the benefits of all 24 new instructions are credited to enhanced 3DNow! technology. This has led programmers to come up with their own names for the 19 new MMX instructions. The most common appears to be Integer SSE (ISSE). SSEMMX and MMX2 are also found in video filter documentation from the public domain sector. ISSE could also refer to Internet SSE, an early name for SSE.
3DNow! extension DSP instructions are the following:
- PF2IW: Packed floating-point to integer word conversion with sign extend
- PI2FW: Packed integer word to floating-point conversion
- PFNACC: Packed floating-point negative accumulate
- PFPNACC: Packed floating-point mixed positive-negative accumulate
- PSWAPD: Packed swap doubleword
MMX extension instructions (Integer SSE) are the following:
- MASKMOVQ: Streaming store using byte mask
- MOVNTQ: Streaming store
- PAVGB: Packed average of unsigned byte
- PAVGW: Packed average of unsigned word
- PMAXSW: Packed maximum signed word
- PMAXUB: Packed maximum unsigned byte
- PMINSW: Packed minimum signed word
- PMINUB: Packed minimum unsigned byte
- PMULHUW: Packed multiply high unsigned word
- PSADBW: Packed sum of absolute byte differences
- PSHUFW: Packed shuffle word
- PEXTRW: Extract word into integer register
- PINSRW: Insert word from integer register
- PMOVMSKB: Move byte mask to integer register
- PREFETCHNTA: Prefetch using the NTA reference
- PREFETCHT0: Prefetch using the T0 reference
- PREFETCHT1: Prefetch using the T1 reference
- PREFETCHT2: Prefetch using the T2 reference
- SFENCE: Store fence
3DNow! Professional
3DNow! Professional is a trade name used to indicate processors that combine 3DNow! technology with a complete SSE instruction set. The Athlon XP was the first processor to carry the 3DNow! Professional trade name, and was the first product in the Athlon family to support the complete SSE instruction set.
3DNow! and the Geode GX/LX
The Geode GX and Geode LX added two new 3DNow! instructions which are absent in all other processors.
3DNow! "professional" instructions unique to the Geode GX/LX are the following:
- PFRSQRTV: Reciprocal square root approximation for a pair of 32-bit floats
- PFRCPV: Reciprocal approximation for a pair of 32-bit floats
Advantages and disadvantages
One advantage of 3DNow! is that it is possible to add or multiply the two numbers that are stored in the same register. With SSE, each number can only be combined with a number in the same position in another register. This capability, known as horizontal in Intel terminology, was the major addition to the SSE3 instruction set.
A disadvantage with 3DNow! is that 3DNow! instructions and MMX instructions share the same register file, whereas SSE adds 8 new independent registers.
Because MMX/3DNow! registers are shared by the standard x87 FPU, 3DNow! instructions and x87 instructions cannot be executed simultaneously. However, because they are aliased to the x87 FPU, the 3DNow! and MMX register states can be saved and restored by the traditional x87 FSAVE and FRSTOR instructions. This arrangement allowed operating systems to support 3DNow! with no explicit modifications, whereas SSE required explicit operating system support to properly save and restore the new XMM registers. The FX* instructions from SSE provide a functional superset of the older x87 save and restore instructions. They can save not only SSE register states but also the x87 register states.
On AMD Athlon XP and K8-based cores, assembly programmers have noted that it is possible to combine 3DNow! and SSE instructions to reduce register pressure, but in practice it is difficult to improve performance due to the instructions executing on shared functional units.
Processors supporting 3DNow!
- AMD processors from the K6-2 onward, including the Athlon, Athlon 64 and Phenom architecture families.
  - Not supported in Bulldozer, Bobcat and Zen architecture processors and their derivatives.
  - The last AMD APU processor supporting 3DNow! is the A8-3870K, which is based on the Llano architecture. Llano is also the only APU architecture with 3DNow! instructions, as Bobcat and later architectures exclude support for it.
- National Semiconductor Geode GX2, later AMD Geode.
- VIA C3 "Samuel", "Samuel 2", "Ezra", and "Eden ESP" cores.
- IDT WinChip 2, 3