X86-64


x86-64 is a 64-bit extension of the x86 instruction set. It was announced in 1999 and first available in the AMD Opteron family in 2003. It introduces two new operating modes: 64-bit mode and compatibility mode, along with a new four-level paging mechanism.
In 64-bit mode, x86-64 supports significantly larger amounts of virtual memory and physical memory compared to its 32-bit predecessors, allowing programs to utilize more memory for data storage. The architecture expands the number of general-purpose registers from 8 to 16, all fully general-purpose, and extends their width to 64 bits.
Floating-point arithmetic is supported through mandatory SSE2 instructions in 64-bit mode. While the older x87 FPU and MMX registers are still available, they are generally superseded by a set of sixteen 128-bit vector registers. Each of these vector registers can store one or two double-precision floating-point numbers, up to four single-precision floating-point numbers, or various integer formats.
In 64-bit mode, instructions are modified to support 64-bit operands and 64-bit addressing mode.
The x86-64 architecture defines a compatibility mode that allows 16-bit and 32-bit user applications to run unmodified alongside 64-bit applications, provided the 64-bit operating system supports them. Since the full x86-32 instruction sets remain implemented in hardware without the need for emulation, these older executables can run with little or no performance penalty, while newer or modified applications can take advantage of new features of the processor design to achieve performance improvements. Also, processors supporting x86-64 still power on in real mode to maintain backward compatibility with the original 8086 processor, as has been the case with x86 processors since the introduction of protected mode with the 80286.
The original specification, created by AMD and released in 2000, has been implemented by AMD, Intel, and VIA. The AMD K8 microarchitecture, in the Opteron and Athlon 64 processors, was the first to implement it. This was the first significant addition to the x86 architecture designed by a company other than Intel. Intel was forced to follow suit and introduced a modified NetBurst family which was software-compatible with AMD's specification. VIA Technologies introduced x86-64 in their VIA Isaiah architecture, with the VIA Nano.
The x86-64 architecture was quickly adopted for desktop and laptop personal computers and servers which were commonly configured for 16 GiB of memory or more. It has effectively replaced the discontinued Intel Itanium architecture, which was originally intended to replace the x86 architecture. x86-64 and Itanium are not compatible on the native instruction set level, and operating systems and applications compiled for one architecture cannot be run on the other natively.

AMD64

History

AMD64 was created as an alternative to the radically different IA-64 architecture designed by Intel and Hewlett-Packard, which was backward-incompatible with IA-32, the 32-bit version of the x86 architecture. AMD originally announced AMD64 in 1999 with a full specification available in August 2000. As AMD was never invited to be a contributing party for the IA-64 architecture and any kind of licensing seemed unlikely, the AMD64 architecture was positioned by AMD from the beginning as an evolutionary way to add 64-bit computing capabilities to the existing x86 architecture while supporting legacy 32-bit x86 code, as opposed to Intel's approach of creating an entirely new, completely x86-incompatible 64-bit architecture with IA-64.
The first AMD64-based processor, the Opteron, was released in April 2003.

Implementations

AMD's processors implementing the AMD64 architecture include Opteron, Athlon 64, Athlon 64 X2, Athlon 64 FX, Athlon II, Turion 64, Turion 64 X2, Sempron, Phenom, Phenom II, FX, Fusion/APU and Ryzen/Epyc.

Architectural features

The primary defining characteristic of AMD64 is the availability of 64-bit general-purpose processor registers, 64-bit integer arithmetic and logical operations, and 64-bit virtual addresses. The designers took the opportunity to make other improvements as well.
Notable changes in the 64-bit extensions include:
; 64-bit integer capability
; Additional registers
; Additional XMM registers
; Larger virtual address space
; Larger physical address space
; Larger physical address space in legacy mode
; Instruction pointer relative data access
; SSE instructions
; No-Execute bit
; Removal of older features

Virtual address space details

Canonical form addresses

Although virtual addresses are 64 bits wide in 64-bit mode, current implementations do not allow the entire virtual address space of 264 bytes to be used. This would be approximately four billion times the size of the virtual address space on 32-bit machines. Most operating systems and applications will not need such a large address space for the foreseeable future, so implementing such wide virtual addresses would simply increase the complexity and cost of address translation with no real benefit. AMD, therefore, decided that, in the first implementations of the architecture, only the least significant 48 bits of a virtual address would actually be used in address translation.
In addition, the AMD specification requires that the most significant 16 bits of any virtual address, bits 48 through 63, must be copies of bit 47. If this requirement is not met, the processor will raise an exception. Addresses complying with this rule are referred to as "canonical form." Canonical form addresses run from 0 through 00007FFF'FFFFFFFF, and from FFFF8000'00000000 through FFFFFFFF'FFFFFFFF, for a total of 256 TiB of usable virtual address space. This is still 65,536 times larger than the virtual 4 GiB address space of 32-bit machines.
This feature eases later scalability to true 64-bit addressing. Many operating systems take the higher-addressed half of the address space for themselves and leave the lower-addressed half for application code, user mode stacks, heaps, and other data regions. The "canonical address" design ensures that every AMD64 compliant implementation has, in effect, two memory halves: the lower half starts at 00000000'00000000 and "grows upwards" as more virtual address bits become available, while the higher half is "docked" to the top of the address space and grows downwards. Also, enforcing the "canonical form" of addresses by checking the unused address bits prevents their use by the operating system in tagged pointers as flags, privilege markers, etc., as such use could become problematic when the architecture is extended to implement more virtual address bits.
The first versions of Windows for x64 did not even use the full 256 TiB; they were restricted to just 8 TiB of user space and 8 TiB of kernel space. Windows did not support the entire 48-bit address space until Windows 8.1, which was released in October 2013.

Page table structure

The 64-bit addressing mode is a superset of Physical Address Extensions ; because of this, page sizes may be 4 KiB or 2 MiB. Long mode also supports page sizes of 1 GiB. Rather than the three-level page table system used by systems in PAE mode, systems running in long mode use four levels of page table: PAE's Page-Directory Pointer Table is extended from four entries to 512, and an additional Page-Map Level 4 Table is added, containing 512 entries in 48-bit implementations. A full mapping hierarchy of 4 KiB pages for the whole 48-bit space would take a bit more than 512 GiB of memory.
Intel has implemented a scheme with a 5-level page table, which allows Intel 64 processors to support 57-bit addresses, and in turn, a 128 PiB virtual address space. Further extensions may allow full 64-bit virtual address space and physical memory with 12-bit page table descriptors and 16- or 21-bit memory offsets for 64 KiB and 2 MiB page allocation sizes; the page table entry would be expanded to 128 bits to support additional hardware flags for page size and virtual address space size.

Operating system limits

The operating system can also limit the virtual address space. Details, where applicable, are given in the "Operating system compatibility and characteristics" section.

Physical address space details

Current AMD64 processors support a physical address space of up to 248 bytes of RAM, or 256 TiB. However,, there were no known x86-64 motherboards that support 256 TiB of RAM. The operating system may place additional limits on the amount of RAM that is usable or supported. Details on this point are given in the "Operating system compatibility and characteristics" section of this article.

Operating modes

The architecture has two primary modes of operation: long mode and legacy mode.

Long mode

Long mode is the architecture's intended primary mode of operation; it is a combination of the processor's native 64-bit mode and a combined 32-bit and 16-bit compatibility mode. It is used by 64-bit operating systems. Under a 64-bit operating system, 64-bit programs run under 64-bit mode, and 32-bit and 16-bit protected mode applications run under compatibility mode. Real-mode programs and programs that use virtual 8086 mode at any time cannot be run in long mode unless those modes are emulated in software. However, such programs may be started from an operating system running in long mode on processors supporting VT-x or AMD-V by creating a virtual processor running in the desired mode.
Since the basic instruction set is the same, there is almost no performance penalty for executing protected mode x86 code. This is unlike Intel's IA-64, where differences in the underlying instruction set mean that running 32-bit code must be done either in emulation of x86 or with a dedicated x86 coprocessor. However, on the x86-64 platform, many x86 applications could benefit from a 64-bit recompile, due to the additional registers in 64-bit code and guaranteed SSE2-based FPU support, which a compiler can use for optimization. However, applications that regularly handle integers wider than 32 bits, such as cryptographic algorithms, will need a rewrite of the code handling the huge integers in order to take advantage of the 64-bit registers.