Decimal64 floating-point format
In computing, decimal64 is a decimal floating-point computer number format that occupies 8 bytes in computer memory. The format was formally introduced in the 2008 revision of the IEEE 754 standard, also known as ISO/IEC/IEEE 60559:2011.
Format
Decimal64 values are categorized as normal or subnormal (denormal) numbers and can be encoded in either binary integer decimal or densely packed decimal formats. Normal values have 16-digit precision and range from ±1.000000000000000×10^−383 to ±9.999999999999999×10^384. In addition to normal and subnormal numbers, the format also includes signed zeros, infinities, and NaNs. The binary format of identical size spans a range from denormal-min ±5×10^−324, through normal-min with complete 53-bit precision ±2.2250738585072014×10^−308, to maximum ±1.7976931348623157×10^308.
Because the significand in the IEEE 754 decimal formats is not normalized, most values with fewer than 16 significant digits have multiple possible representations; 1000000 × 10^−2 = 100000 × 10^−1 = 10000 × 10^0 = 1000 × 10^1 all have the value 10000. These sets of representations for the same value are called cohorts. The different members can be used to denote how many digits of the value are known precisely. Each signed zero has 768 possible representations.
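The cohort idea can be illustrated with Python's standard decimal module. This is only a sketch: that module is an arbitrary-precision implementation rather than a 64-bit decimal64 type, but it likewise keeps the coefficient and exponent unnormalized, so the four representations from the paragraph above stay distinguishable while comparing equal.

```python
from decimal import Decimal

# Four members of one cohort, written as coefficient x 10^exponent.
cohort = [Decimal("1000000E-2"), Decimal("100000E-1"),
          Decimal("10000E0"), Decimal("1000E1")]

# as_tuple() exposes the stored (sign, digits, exponent) triple.
exponents = [d.as_tuple().exponent for d in cohort]
coefficients = [int("".join(map(str, d.as_tuple().digits))) for d in cohort]

# Numerically equal, yet each member keeps its own coefficient and exponent.
all_equal = all(d == Decimal(10000) for d in cohort)

for c, e in zip(coefficients, exponents):
    print(f"{c} x 10^{e}")
```

The printed pairs differ even though `all_equal` is true, which is the property a cohort describes.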
Encoding of decimal64 values
| Sign | Combination | Significand continuation |
| 1 bit | 13 bits | 50 bits |
IEEE 754 allows two alternative encodings for decimal64 values. The standard does not specify how to signify which representation is used, for instance in a situation where decimal64 values are communicated between systems:
- In the binary encoding (see "Binary integer significand field"), the 16-digit significand is represented as a binary coded positive integer, based on binary integer decimal (BID).
- In the decimal encoding (see "Densely packed decimal significand field"), the 16-digit significand is represented as a decimal coded positive integer, based on densely packed decimal (DPD), with 5 groups of 3 digits each represented in declets, which are 10-bit sequences. The most significant digit is encoded separately. This is efficient because 2^10 = 1024 is only slightly more than needed to contain all integers from 0 to 999.
In both cases, the most significant 4 bits of the significand, which only have 10 possible values, are combined with two bits of the exponent to use 30 of the 32 possible values of a 5-bit field. The remaining combinations encode infinities and NaNs. BID and DPD use different bits of the combination field.
For Infinity and NaN, all other bits of the encoding are not used. Thus, an array can be filled with a single byte value to set it to Infinities or NaNs.
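As a minimal sketch of the byte-fill property, the following classifies a 64-bit pattern using only the sign bit and the bits immediately after it, following the bit patterns for infinities and NaNs listed in this article. The function names `classify_special` and `fill` are hypothetical helpers chosen for illustration.

```python
def classify_special(bits: int) -> str:
    """Classify a 64-bit decimal64 pattern by its leading bits only."""
    sign = bits >> 63
    top5 = (bits >> 58) & 0b11111      # the five bits after the sign bit
    if top5 == 0b11110:
        return "-infinity" if sign else "+infinity"
    if top5 == 0b11111:
        signalling = (bits >> 57) & 1  # next bit: 0 = quiet, 1 = signalling
        return "signalling NaN" if signalling else "quiet NaN"
    return "finite"


def fill(byte: int) -> int:
    """Build the 64-bit pattern obtained by repeating one byte 8 times."""
    return int.from_bytes(bytes([byte]) * 8, "big")


for byte in (0x78, 0xF8, 0x7C, 0x7E, 0x00):
    print(hex(byte), classify_special(fill(byte)))
```

Because the remaining bits are ignored for these values, the same classification applies to both the BID and DPD encodings.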
Binary integer significand field
This format uses a binary significand from 0 to 10^16 − 1 = 9999999999999999. The encoding, completely stored on 64 bits, can represent binary significands up to 10 × 2^50 − 1 = 11258999068426239, but values larger than 10^16 − 1 are illegal and the standard requires implementations to treat them as 0 if encountered on input. As described above, the encoding varies depending on whether the most significant 4 bits of the significand are in the range 0 to 7, or higher (8 or 9).
If the two bits after the sign bit are "00", "01", or "10", then the exponent field consists of the 10 bits following the sign bit and the significand is the remaining 53 bits, with an implicit leading 0 bit. This includes subnormal numbers where the leading significand digit is 0.
If the two bits after the sign bit are "11", then the 10-bit exponent field is shifted 2 bits to the right (after both the sign bit and the "11" bits), and the represented significand is in the remaining 51 bits. In this case there is an implicit (not stored) leading 3-bit sequence "100" for the most significant bits of the true significand.
The leading bits of the significand field do not encode the most significant decimal digit; they are simply part of a larger pure-binary number. For example, a significand of 8000000000000000 is encoded as binary 011100011010111111010100100110001101000000000000000000, with the leading 4 bits encoding 7; the first significand which requires a 54th bit is 2^53 = 9007199254740992. The highest valid significand is 9999999999999999, whose binary encoding is (100)011100001101111001001101111110000001111111111111111, where the parenthesized bits are implicit.
In the above cases, the value represented is
(−1)^sign × 10^(exponent − 398) × significand
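The two BID layouts described above can be sketched as a small decoder. This is an illustrative sketch, not a library implementation: the name `decode_bid64` is hypothetical, the exponent bias of 398 is the decimal64 bias defined by IEEE 754-2008, and infinities and NaNs are simply reported as `None`.

```python
from decimal import Decimal

BIAS = 398                  # decimal64 exponent bias (IEEE 754-2008)
MAX_SIGNIFICAND = 10**16 - 1


def decode_bid64(bits: int):
    """Decode a 64-bit BID pattern to a Decimal; None for infinity or NaN."""
    sign = bits >> 63
    if (bits >> 59) & 0b1111 == 0b1111:
        return None                      # "1111" after the sign: infinity or NaN
    if (bits >> 61) & 0b11 != 0b11:
        # Exponent directly after the sign bit, 53-bit significand.
        exponent = (bits >> 53) & 0x3FF
        significand = bits & ((1 << 53) - 1)
    else:
        # Exponent shifted 2 bits right, implicit "100" before 51 stored bits.
        exponent = (bits >> 51) & 0x3FF
        significand = (0b100 << 51) | (bits & ((1 << 51) - 1))
    if significand > MAX_SIGNIFICAND:
        significand = 0                  # illegal values are treated as 0
    digits = tuple(int(ch) for ch in str(significand))
    return Decimal((sign, digits, exponent - BIAS))


print(decode_bid64(0x31C0000000000001))  # significand 1, biased exponent 398
```

Building the result from a (sign, digits, exponent) tuple keeps the decoded cohort member exact instead of normalizing it.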
If the four bits after the sign bit are "1111" then the value is an infinity or a NaN, as described above:
0 11110 xx...x +infinity
1 11110 xx...x -infinity
x 11111 0x...x a quiet NaN
x 11111 1x...x a signalling NaN
Densely packed decimal significand field
In this version, the significand is stored as a series of decimal digits. The leading digit is between 0 and 9 and the rest of the significand uses the densely packed decimal encoding. The leading 2 bits of the exponent and the leading digit (3 or 4 bits) of the significand are combined into the five bits that follow the sign bit. The eight bits after that are the exponent continuation field, providing the less-significant bits of the exponent. The last 50 bits are the significand continuation field, consisting of five 10-bit declets. Each declet encodes three decimal digits using the DPD encoding.
If the first two bits following the sign bit are "00", "01", or "10", they are the leading bits of the exponent, and the subsequent three bits "cde" are interpreted as the leading decimal digit (0 to 7), that is, binary 0cde.
If the first two bits after the sign bit are "11", then the next two bits are the leading bits of the exponent, and the following bit "e" is prefixed with implicit bits "100" to form the leading decimal digit (8 or 9), that is, binary 100e.
The remaining two combinations of the 5-bit field after the sign bit are used to represent ±infinity and NaNs, respectively.
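The handling of the five-bit field can be sketched as follows. The function name `decode_dpd_combination` is a hypothetical helper; it returns the leading decimal digit together with the 10-bit biased exponent assembled from the two leading exponent bits and the eight-bit exponent continuation field, or `None` for the two combinations reserved for infinities and NaNs.

```python
def decode_dpd_combination(bits: int):
    """Return (leading_digit, biased_exponent) of a DPD decimal64 pattern."""
    comb = (bits >> 58) & 0b11111          # five bits after the sign bit
    if comb >> 3 != 0b11:
        exp_high = comb >> 3               # "00", "01" or "10"
        digit = comb & 0b111               # 0cde: digits 0 to 7
    elif comb >> 1 != 0b1111:
        exp_high = (comb >> 1) & 0b11      # the two bits after "11"
        digit = 0b1000 | (comb & 1)        # 100e: digits 8 or 9
    else:
        return None                        # 11110 or 11111: infinity or NaN
    exp_continuation = (bits >> 50) & 0xFF  # next eight bits
    return digit, (exp_high << 8) | exp_continuation


# 0x2238000000000001 is used here as a sample DPD pattern for the value 1.
print(decode_dpd_combination(0x2238000000000001))  # (0, 398)
```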
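The DPD/3BCD transcoding for the declets is given by the following table. b9...b0 are the bits of the DPD, and d2...d0 are the three BCD digits.
| b9 b8 b7 b6 b5 b4 b3 b2 b1 b0 | d2 | d1 | d0 | Digit ranges (d2, d1, d0) |
| a b c d e f 0 g h i | 0abc | 0def | 0ghi | (0–7) (0–7) (0–7) |
| a b c d e f 1 0 0 i | 0abc | 0def | 100i | (0–7) (0–7) (8–9) |
| a b c g h f 1 0 1 i | 0abc | 100f | 0ghi | (0–7) (8–9) (0–7) |
| g h c d e f 1 1 0 i | 100c | 0def | 0ghi | (8–9) (0–7) (0–7) |
| g h c 0 0 f 1 1 1 i | 100c | 100f | 0ghi | (8–9) (8–9) (0–7) |
| d e c 0 1 f 1 1 1 i | 100c | 0def | 100i | (8–9) (0–7) (8–9) |
| a b c 1 0 f 1 1 1 i | 0abc | 100f | 100i | (0–7) (8–9) (8–9) |
| x x c 1 1 f 1 1 1 i | 100c | 100f | 100i | (8–9) (8–9) (8–9) |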
The 8 decimal values whose digits are all 8s or 9s have four codings each. The bits marked x in the table above are ignored on input, but will always be 0 in computed results. The 8 × 3 = 24 non-standard encodings fill in the gap between 10^3 = 1000 and 2^10 = 1024.
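A declet decoder following the standard DPD transcoding rules can be sketched as below. The function name `decode_declet` and the single-letter bit names are choices made for this illustration only; the redundant leading bits of the all-8s-or-9s patterns are ignored on input, as described above.

```python
def decode_declet(declet: int) -> int:
    """Decode one 10-bit DPD declet into an integer from 0 to 999."""
    # Name the bits b9..b0 as p q r s t u v w x y.
    p, q, r, s, t, u, v, w, x, y = ((declet >> i) & 1 for i in range(9, -1, -1))
    small = lambda hi, mid, lo: 4 * hi + 2 * mid + lo   # a digit 0 to 7
    large = lambda lo: 8 + lo                           # a digit 8 or 9

    if v == 0:                       # all three digits are small
        d2, d1, d0 = small(p, q, r), small(s, t, u), small(w, x, y)
    elif (w, x) == (0, 0):           # only d0 is large
        d2, d1, d0 = small(p, q, r), small(s, t, u), large(y)
    elif (w, x) == (0, 1):           # only d1 is large
        d2, d1, d0 = small(p, q, r), large(u), small(s, t, y)
    elif (w, x) == (1, 0):           # only d2 is large
        d2, d1, d0 = large(r), small(s, t, u), small(p, q, y)
    elif (s, t) == (0, 0):           # d2 and d1 are large
        d2, d1, d0 = large(r), large(u), small(p, q, y)
    elif (s, t) == (0, 1):           # d2 and d0 are large
        d2, d1, d0 = large(r), small(p, q, u), large(y)
    elif (s, t) == (1, 0):           # d1 and d0 are large
        d2, d1, d0 = small(p, q, r), large(u), large(y)
    else:                            # all three are large; p and q are ignored
        d2, d1, d0 = large(r), large(u), large(y)
    return 100 * d2 + 10 * d1 + d0


print(decode_declet(0b0011111111))   # canonical coding of 999
print(decode_declet(0b1111111111))   # a non-standard coding of 999
```

Decoding all 1024 bit patterns yields every integer from 0 to 999, with the 24 surplus patterns mapping onto the eight all-8s-or-9s values.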
In the above cases, with the true significand as the sequence of decimal digits decoded, the value represented is
(−1)^sign × 10^(exponent − 398) × significand
Implementations
There are multiple libraries available for calculations with decimal data types. Some examples are listed below:
- Libdfp implements the ISO/IEC Technical Report "ISO/IEC TR 24732". The library implements math functions for environments built on gcc and glibc. It uses BID-encoded IEEE 754 decimal types, with the possibility to switch to DPD encoding via recompilation of gcc. The library can be slow, and sometimes inaccurate, for conversions and complicated operations.
- Intel Decimal Floating-Point Math Library is a C library that implements the IEEE 754-2008 Decimal Floating-Point Arithmetic specification by addressing decimal types as UINTxx items named decimalxx. The library improves performance by calculating more intricate functions in binary where possible, but this introduces some binary rounding errors into decimal calculations.
- decNumber provides data types and functions to calculate values using Decimal Floating-Point with arbitrary precision. It features the option to set the IEEE 754-compatible parameters decSingle, decDouble, and decQuad. It efficiently performs conversions between decimal numbers and strings.
- Mpdecimal is an implementation of the General Decimal Arithmetic Specification proposed by Mike Cowlishaw, with the option to set IEEE type-compatible parameters. It provides arbitrary precision, using an 8-bit integer to encode runtime flags and the number's sign. The library represents numbers using seven or more double- or quad-words, storing the signed exponent, the digits, the length in words, the allocated words, a pointer to data, and data itself. Data is stored as an array of two or more words, which holds the coefficient in 32-bit or 64-bit items. The library can handle a wide range of inputs and produces accurate results. It is used by Python as the basis of its decimal module.
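As a hedged illustration of setting IEEE-compatible parameters, Python's decimal module can be given a context with the decimal64 precision and exponent limits. The context name `decimal64` below is a local variable, not something the module provides, and the values are still held in Python's own arbitrary-precision representation rather than in a 64-bit BID or DPD encoding.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN

# 16 digits of precision with the decimal64 exponent limits; clamp=1 keeps
# exponents within the range a fixed-width interchange format can hold.
decimal64 = Context(prec=16, Emax=384, Emin=-383,
                    rounding=ROUND_HALF_EVEN, clamp=1)

third = decimal64.divide(Decimal(1), Decimal(3))
rounded = decimal64.create_decimal("1.23456789012345678")
largest = decimal64.create_decimal("9.999999999999999E+384")
smallest = decimal64.create_decimal("1E-398")   # smallest subnormal

print(third)      # 0.3333333333333333
print(rounded)    # 1.234567890123457
print(smallest.is_subnormal(decimal64))
```

Operations are called on the context object so that the 16-digit rounding applies regardless of the thread's default context.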