6b/8b encoding


In telecommunications, 6b/8b is a line code that expands 6-bit codes to 8-bit symbols in order to maintain DC balance in a communications system.
The 6b/8b encoding is a balanced code: each 8-bit output symbol contains 4 zero bits and 4 one bits. As a result the code can, like a parity bit, detect all single-bit errors, because flipping any one bit leaves a symbol with 3 or 5 one bits.
The number of 8-bit patterns with 4 bits set is the binomial coefficient $\binom{8}{4} = 70$. Excluding the patterns 11110000 and 00001111 leaves 68 coded patterns: 64 data codes plus 4 additional control codes.
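The counts above, and the single-bit error claim, can be checked by brute force. The following is a minimal sketch; the names `balanced`, `usable`, and `flip` are illustrative and not part of any 6b/8b specification.

```python
from itertools import product
from math import comb

EXCLUDED = {"11110000", "00001111"}

# Every 8-bit pattern with exactly four 1 bits (the balanced patterns).
balanced = ["".join(bits) for bits in product("01", repeat=8)
            if bits.count("1") == 4]
# Patterns left for coding after dropping the two excluded ones.
usable = [p for p in balanced if p not in EXCLUDED]

def flip(pattern: str, i: int) -> str:
    """Return pattern with bit i inverted, modelling a single-bit error."""
    flipped = "1" if pattern[i] == "0" else "0"
    return pattern[:i] + flipped + pattern[i + 1:]

# A single-bit error always leaves 3 or 5 one bits, so the result is
# never a valid (balanced) symbol and the error is detectable.
all_detected = all(flip(p, i).count("1") != 4
                   for p in usable for i in range(8))

print(len(balanced), comb(8, 4), len(usable), all_detected)
# prints: 70 70 68 True
```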

Coding rules

The 64 possible 6-bit input codes can be classified according to their disparity, the number of 1 bits minus the number of 0 bits:
Ones  Zeros  Disparity  Number
0     6      −6         1
1     5      −4         6
2     4      −2         15
3     3      0          20
4     2      +2         15
5     1      +4         6
6     0      +6         1
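The classification can be reproduced by tallying the disparity of all 64 input codes. This is a small sketch; the helper name `disparity` is chosen here for illustration.

```python
from collections import Counter

def disparity(code: str) -> int:
    """Number of 1 bits minus number of 0 bits."""
    return code.count("1") - code.count("0")

# Tally the disparity of every 6-bit input code, 000000 through 111111.
counts = Counter(disparity(format(n, "06b")) for n in range(64))

for d in sorted(counts):
    print(f"{d:+d}: {counts[d]}")
```

The seven counts are the binomial coefficients for choosing 0 to 6 one bits out of 6, and they sum to 64.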

The 6-bit input codes are mapped to 8-bit output symbols as follows:
  • The 20 6-bit codes with disparity 0 are prefixed with 10
    Example: 000111 → 10000111
    Example: 101010 → 10101010
  • Of the 15 6-bit codes with disparity +2, the 14 other than 001111 are prefixed with 00
    Example: 010111 → 00010111
  • Of the 15 6-bit codes with disparity −2, the 14 other than 110000 are prefixed with 11
    Example: 101000 → 11101000
  • The remaining 20 codes (12 with disparity ±4, 2 with disparity ±6, 001111, 110000, and the 4 control codes) are assigned to symbols beginning with 01 as follows:
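Type     Input     Output    Type     Input     Output    Complement
−6       000000    01011001  +6       111111    01100110  01_xx__x
−4       000001    01110001  +4       111110    01001110  01xx____
−4       000010    01110010  +4       111101    01001101  01xx____
−4       000100    01100101  +4       111011    01011010  01x____x
−4       001000    01101001  +4       110111    01010110  01x____x
−4       010000    01010011  +4       101111    01101100  01____xx
−4       100000    01100011  +4       011111    01011100  01____xx
−2       110000    01110100  +2       001111    01001011  01___x__
Control  K 000111  01000111  Control  K 111000  01111000
Control  K 010101  01010101  Control  K 101010  01101010

In the Complement column, the output is the prefix 01 followed by the six input bits, where x marks an input bit that is complemented and _ marks a bit that is copied unchanged. The control codes copy all six bits unchanged.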
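The mapping rules and the table can be combined into a small encoder. This is a sketch only: the function name `encode`, the `control` flag, and the dictionary names are assumptions made for illustration, and codes are handled as bit strings rather than integers for readability.

```python
# Data codes from the table above that do not follow the prefix rules.
SPECIAL = {
    "000000": "01011001", "111111": "01100110",
    "000001": "01110001", "111110": "01001110",
    "000010": "01110010", "111101": "01001101",
    "000100": "01100101", "111011": "01011010",
    "001000": "01101001", "110111": "01010110",
    "010000": "01010011", "101111": "01101100",
    "100000": "01100011", "011111": "01011100",
    "110000": "01110100", "001111": "01001011",
}
# The four control codes (K codes) from the table.
CONTROL = {
    "000111": "01000111", "111000": "01111000",
    "010101": "01010101", "101010": "01101010",
}
# Prefix chosen by the disparity of the 6-bit input.
PREFIX = {0: "10", +2: "00", -2: "11"}

def encode(code: str, control: bool = False) -> str:
    """Map a 6-bit code (as a bit string) to its 8-bit symbol."""
    if control:
        return CONTROL[code]
    if code in SPECIAL:
        return SPECIAL[code]
    disparity = code.count("1") - code.count("0")
    return PREFIX[disparity] + code

print(encode("000111"))                # 10000111
print(encode("010111"))                # 00010111
print(encode("101000"))                # 11101000
print(encode("000111", control=True))  # 01000111
```

Because every output symbol is distinct, decoding can be done with the inverse lookup, and any received 8-bit value that is not one of the 68 symbols indicates an error.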

No data symbol contains more than four consecutive matching bits, and because the patterns 11110000 and 00001111 are excluded, no data symbol begins or ends with more than three identical bits.
Thus, the longest run of identical bits that will be produced is 6.
Any occurrence of 6 consecutive identical bits constitutes a comma sequence (also called a sync mark or syncword) that identifies the symbol boundaries precisely.
Those 6 bits straddle the inter-symbol boundary, with exactly 3 of the identical bits at the end of one symbol and 3 at the start of the next.
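The run-length properties can be confirmed by examining every ordered pair of symbols. The sketch below assumes the symbol set is all 68 balanced patterns other than 11110000 and 00001111 (data and control symbols together); the helper name `runs` is illustrative.

```python
from itertools import groupby, product

EXCLUDED = {"11110000", "00001111"}
symbols = [p for p in ("".join(b) for b in product("01", repeat=8))
           if p.count("1") == 4 and p not in EXCLUDED]

def runs(bits: str):
    """Yield (start_index, length) for each run of identical bits."""
    pos = 0
    for _, group in groupby(bits):
        n = len(list(group))
        yield pos, n
        pos += n

# Longest run inside any single symbol.
longest_in_symbol = max(n for s in symbols for _, n in runs(s))

# Longest run over any two adjacent symbols, and where 6-bit runs start.
# A run cannot span a whole symbol, so checking pairs is sufficient.
longest_in_stream = 0
comma_starts = set()
for a, b in product(symbols, repeat=2):
    for start, n in runs(a + b):
        longest_in_stream = max(longest_in_stream, n)
        if n == 6:
            comma_starts.add(start)

print(longest_in_symbol, longest_in_stream, comma_starts)
# prints: 4 6 {5}
```

A start index of 5 in the 16-bit pair means the run covers bits 5 to 7 of the first symbol and bits 0 to 2 of the second, which is the 3 + 3 split described above.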