Western Latin character sets



Several 8-bit character sets were designed to encode the common Western European languages, which use the Latin alphabet together with a few additional letters, letters with diacritics (encoded as precomposed characters), some punctuation, and various symbols. These character sets also happen to support many other languages such as Malay, Swahili, and Classical Latin.
This material is technically obsolete, having been functionally replaced by Unicode. However, it continues to have historical interest.

Summary

The ISO/IEC 8859 series of 8-bit character sets covers the Latin alphabets used in Europe, but the same code points have different meanings in different members of the series, which caused some difficulty in interchange. The arrival of Unicode, with a unique code point for every character, resolved these issues.
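
The ambiguity of a single code point can be illustrated by decoding one byte under several character sets. The following is a minimal sketch in Python, assuming that the standard-library codecs `latin_1`, `iso8859_15`, `cp1252`, `cp437`, `cp850` and `mac_roman` correspond to the character sets discussed in this article; the helper name `decode_everywhere` is hypothetical.

```python
# Python codec names assumed to match the character sets in this article.
CODECS = ["latin_1", "iso8859_15", "cp1252", "cp437", "cp850", "mac_roman"]

def decode_everywhere(byte_value):
    """Return {codec: character} for a single byte value (0-255)."""
    raw = bytes([byte_value])
    result = {}
    for codec in CODECS:
        try:
            result[codec] = raw.decode(codec)
        except UnicodeDecodeError:
            result[codec] = None  # byte is unassigned in this codec
    return result

# The same byte 0xA4 stands for different characters depending on the set.
for codec, char in decode_everywhere(0xA4).items():
    print(f"{codec:12} 0xA4 -> {char!r}")
```

Under these assumptions the byte 0xA4 is a currency sign in one set, a euro sign in another and a lowercase letter in a third, whereas a byte below 0x80 such as 0x41 decodes to the same character everywhere.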

History

The earlier seven-bit American Standard Code for Information Interchange (ASCII) has characters sufficient to properly represent only a few languages such as English, Latin, Malay and Swahili. It is missing some letters and letter-diacritic combinations used in other Latin-alphabet languages. However, since there was no other choice on most US-supplied computer platforms, use of ASCII was unavoidable except where there was a strong national computing industry. The ISO 646 group of encodings replaced some of the symbols in ASCII with local characters, but space was very limited, and some of the symbols replaced were quite common in things like programming languages.
Most computers internally used eight-bit bytes, but communication used seven data bits plus one parity bit. In time, it became common to use all eight bits for data, creating space for another 128 characters. In the early days most of these were system-specific, but gradually the ISO/IEC 8859 standards emerged to provide some cross-platform similarity and so enable information interchange.
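
The seven-plus-parity arrangement can be sketched as follows. This is an illustrative example only: it assumes even parity carried in the top bit, which the text above does not specify, and the function names are hypothetical.

```python
def add_even_parity(code):
    """Pack a 7-bit code plus an even-parity bit (in bit 7) into one byte."""
    if not 0 <= code <= 0x7F:
        raise ValueError("not a 7-bit code")
    parity = bin(code).count("1") % 2  # 1 when the number of set bits is odd
    return code | (parity << 7)

def strip_parity(byte):
    """Recover the 7-bit code by discarding bit 7."""
    return byte & 0x7F

# 'A' (0x41) has two set bits, so its parity bit stays 0;
# 'C' (0x43) has three, so bit 7 is set and the byte becomes 0xC3.
print(hex(add_even_parity(ord("A"))), hex(add_even_parity(ord("C"))))
```

While bit 7 is reserved in this way, a byte such as 0xC3 cannot also stand for an additional character, which is why the extra 128 positions only became available once all eight bits carried data.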
Towards the end of the 20th century, as storage and memory costs fell, the problems caused by multiple meanings of a given eight-bit code ceased to be justified. All major operating systems have since moved to Unicode as their main internal representation. However, because Windows for many years did not support the UTF-8 method of encoding Unicode, many applications continued to be restricted to these legacy character sets.

The euro sign

The introduction of the euro and its associated euro sign put significant pressure on computer systems developers to support the new symbol, and most 8-bit character sets had to be adapted in some way.
Whilst these adaptations had limited effect on documents that were only used within a single computer, they meant that documents containing a euro sign would fail to render as expected when interchanged between ecosystems. In the comparison table, for example, the euro sign is at A4 in ISO-8859-15, at 80 in WINDOWS-1252 and at DB in MACINTOSH.
All of these issues have been resolved as operating systems have been upgraded to support Unicode, which encodes the euro sign at U+20AC, as standard.
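
The interchange problem can be reproduced with a short sketch. It assumes that Python's standard codecs stand in for the character sets in this article; `euro_bytes` and `misread` are hypothetical helper names.

```python
CODECS = ["latin_1", "iso8859_15", "cp1252", "cp437", "cp850",
          "mac_roman", "utf_8"]

def euro_bytes():
    """Map each codec to its encoding of U+20AC, or None if it has no euro sign."""
    out = {}
    for codec in CODECS:
        try:
            out[codec] = "\u20ac".encode(codec)
        except UnicodeEncodeError:
            out[codec] = None
    return out

def misread(text, written_as, read_as):
    """Encode text in one character set and decode the bytes in another."""
    return text.encode(written_as).decode(read_as, errors="replace")

for codec, raw in euro_bytes().items():
    print(f"{codec:12} {raw.hex().upper() if raw else 'no euro sign'}")

# A price written on one system and read on another.
print(misread("\u20ac5", "iso8859_15", "latin_1"))
print(misread("\u20ac5", "mac_roman", "cp1252"))
```

In this sketch a euro sign written as ISO-8859-15 and read as ISO-8859-1 comes out as the generic currency sign, and one written in the Macintosh set and read as Windows-1252 comes out as an accented capital letter.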

Comparison table

Code points up to U+007F are not shown in this table, as they map directly to the same values in all character sets listed here. The ASCII standard defines the mapping of these first 128 characters (0-127).
The table is arranged by Unicode code point. Character sets are referred to here by their IANA names in upper case. Values are in hexadecimal; a character set that lacks a character has no entry in its column.
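Character | Code point | ISO-8859-1 | ISO-8859-15 | WINDOWS-1252 | IBM437 | IBM850 | MACINTOSH
NBSP | U+00A0 | A0 | A0 | A0 | FF | FF | CA
¡ | U+00A1 | A1 | A1 | A1 | AD | AD | C1
¢ | U+00A2 | A2 | A2 | A2 | 9B | BD | A2
£ | U+00A3 | A3 | A3 | A3 | 9C | 9C | A3
¤ | U+00A4 | A4 |  | A4 |  | CF |
¥ | U+00A5 | A5 | A5 | A5 | 9D | BE | B4
¦ | U+00A6 | A6 |  | A6 |  | DD |
§ | U+00A7 | A7 | A7 | A7 |  | F5 | A4
¨ | U+00A8 | A8 |  | A8 |  | F9 | AC
© | U+00A9 | A9 | A9 | A9 |  | B8 | A9
ª | U+00AA | AA | AA | AA | A6 | A6 | BB
« | U+00AB | AB | AB | AB | AE | AE | C7
¬ | U+00AC | AC | AC | AC | AA | AA | C2
SHY | U+00AD | AD | AD | AD |  | F0 |
® | U+00AE | AE | AE | AE |  | A9 | A8
¯ | U+00AF | AF | AF | AF |  | EE | F8
° | U+00B0 | B0 | B0 | B0 | F8 | F8 | A1
± | U+00B1 | B1 | B1 | B1 | F1 | F1 | B1
² | U+00B2 | B2 | B2 | B2 | FD | FD |
³ | U+00B3 | B3 | B3 | B3 |  | FC |
´ | U+00B4 | B4 |  | B4 |  | EF | AB
µ | U+00B5 | B5 | B5 | B5 | E6 | E6 | B5
¶ | U+00B6 | B6 | B6 | B6 |  | F4 | A6
· | U+00B7 | B7 | B7 | B7 | FA | FA | E1
¸ | U+00B8 | B8 |  | B8 |  | F7 | FC
¹ | U+00B9 | B9 | B9 | B9 |  | FB |
º | U+00BA | BA | BA | BA | A7 | A7 | BC
» | U+00BB | BB | BB | BB | AF | AF | C8
¼ | U+00BC | BC |  | BC | AC | AC |
½ | U+00BD | BD |  | BD | AB | AB |
¾ | U+00BE | BE |  | BE |  | F3 |
¿ | U+00BF | BF | BF | BF | A8 | A8 | C0
À | U+00C0 | C0 | C0 | C0 |  | B7 | CB
Á | U+00C1 | C1 | C1 | C1 |  | B5 | E7
Â | U+00C2 | C2 | C2 | C2 |  | B6 | E5
Ã | U+00C3 | C3 | C3 | C3 |  | C7 | CC
Ä | U+00C4 | C4 | C4 | C4 | 8E | 8E | 80
Å | U+00C5 | C5 | C5 | C5 | 8F | 8F | 81
Æ | U+00C6 | C6 | C6 | C6 | 92 | 92 | AE
Ç | U+00C7 | C7 | C7 | C7 | 80 | 80 | 82
È | U+00C8 | C8 | C8 | C8 |  | D4 | E9
É | U+00C9 | C9 | C9 | C9 | 90 | 90 | 83
Ê | U+00CA | CA | CA | CA |  | D2 | E6
Ë | U+00CB | CB | CB | CB |  | D3 | E8
Ì | U+00CC | CC | CC | CC |  | DE | ED
Í | U+00CD | CD | CD | CD |  | D6 | EA
Î | U+00CE | CE | CE | CE |  | D7 | EB
Ï | U+00CF | CF | CF | CF |  | D8 | EC
Ð | U+00D0 | D0 | D0 | D0 |  | D1 |
Ñ | U+00D1 | D1 | D1 | D1 | A5 | A5 | 84
Ò | U+00D2 | D2 | D2 | D2 |  | E3 | F1
Ó | U+00D3 | D3 | D3 | D3 |  | E0 | EE
Ô | U+00D4 | D4 | D4 | D4 |  | E2 | EF
Õ | U+00D5 | D5 | D5 | D5 |  | E5 | CD
Ö | U+00D6 | D6 | D6 | D6 | 99 | 99 | 85
× | U+00D7 | D7 | D7 | D7 |  | 9E |
Ø | U+00D8 | D8 | D8 | D8 |  | 9D | AF
Ù | U+00D9 | D9 | D9 | D9 |  | EB | F4
Ú | U+00DA | DA | DA | DA |  | E9 | F2
Û | U+00DB | DB | DB | DB |  | EA | F3
Ü | U+00DC | DC | DC | DC | 9A | 9A | 86
Ý | U+00DD | DD | DD | DD |  | ED |
Þ | U+00DE | DE | DE | DE |  | E8 |
ß | U+00DF | DF | DF | DF | E1 | E1 | A7
à | U+00E0 | E0 | E0 | E0 | 85 | 85 | 88
á | U+00E1 | E1 | E1 | E1 | A0 | A0 | 87
â | U+00E2 | E2 | E2 | E2 | 83 | 83 | 89
ã | U+00E3 | E3 | E3 | E3 |  | C6 | 8B
ä | U+00E4 | E4 | E4 | E4 | 84 | 84 | 8A
å | U+00E5 | E5 | E5 | E5 | 86 | 86 | 8C
æ | U+00E6 | E6 | E6 | E6 | 91 | 91 | BE
ç | U+00E7 | E7 | E7 | E7 | 87 | 87 | 8D
è | U+00E8 | E8 | E8 | E8 | 8A | 8A | 8F
é | U+00E9 | E9 | E9 | E9 | 82 | 82 | 8E
ê | U+00EA | EA | EA | EA | 88 | 88 | 90
ë | U+00EB | EB | EB | EB | 89 | 89 | 91
ì | U+00EC | EC | EC | EC | 8D | 8D | 93
í | U+00ED | ED | ED | ED | A1 | A1 | 92
î | U+00EE | EE | EE | EE | 8C | 8C | 94
ï | U+00EF | EF | EF | EF | 8B | 8B | 95
ð | U+00F0 | F0 | F0 | F0 |  | D0 |
ñ | U+00F1 | F1 | F1 | F1 | A4 | A4 | 96
ò | U+00F2 | F2 | F2 | F2 | 95 | 95 | 98
ó | U+00F3 | F3 | F3 | F3 | A2 | A2 | 97
ô | U+00F4 | F4 | F4 | F4 | 93 | 93 | 99
õ | U+00F5 | F5 | F5 | F5 |  | E4 | 9B
ö | U+00F6 | F6 | F6 | F6 | 94 | 94 | 9A
÷ | U+00F7 | F7 | F7 | F7 | F6 | F6 | D6
ø | U+00F8 | F8 | F8 | F8 |  | 9B | BF
ù | U+00F9 | F9 | F9 | F9 | 97 | 97 | 9D
ú | U+00FA | FA | FA | FA | A3 | A3 | 9C
û | U+00FB | FB | FB | FB | 96 | 96 | 9E
ü | U+00FC | FC | FC | FC | 81 | 81 | 9F
ý | U+00FD | FD | FD | FD |  | EC |
þ | U+00FE | FE | FE | FE |  | E7 |
ÿ | U+00FF | FF | FF | FF | 98 | 98 | D8
ı | U+0131 |  |  |  |  | D5 | F5
Œ | U+0152 |  | BC | 8C |  |  | CE
œ | U+0153 |  | BD | 9C |  |  | CF
Š | U+0160 |  | A6 | 8A |  |  |
š | U+0161 |  | A8 | 9A |  |  |
Ÿ | U+0178 |  | BE | 9F |  |  | D9
Ž | U+017D |  | B4 | 8E |  |  |
ž | U+017E |  | B8 | 9E |  |  |
ƒ | U+0192 |  |  | 83 | 9F | 9F | C4
ˆ | U+02C6 |  |  | 88 |  |  | F6
ˇ | U+02C7 |  |  |  |  |  | FF
˘ | U+02D8 |  |  |  |  |  | F9
˙ | U+02D9 |  |  |  |  |  | FA
˚ | U+02DA |  |  |  |  |  | FB
˛ | U+02DB |  |  |  |  |  | FE
˜ | U+02DC |  |  | 98 |  |  | F7
˝ | U+02DD |  |  |  |  |  | FD
Γ | U+0393 |  |  |  | E2 |  |
Θ | U+0398 |  |  |  | E9 |  |
Σ | U+03A3 |  |  |  | E4 |  |
Φ | U+03A6 |  |  |  | E8 |  |
Ω | U+03A9 |  |  |  | EA |  | BD
α | U+03B1 |  |  |  | E0 |  |
δ | U+03B4 |  |  |  | EB |  |
ε | U+03B5 |  |  |  | EE |  |
π | U+03C0 |  |  |  | E3 |  | B9
σ | U+03C3 |  |  |  | E5 |  |
τ | U+03C4 |  |  |  | E7 |  |
φ | U+03C6 |  |  |  | ED |  |
– | U+2013 |  |  | 96 |  |  | D0
— | U+2014 |  |  | 97 |  |  | D1
‗ | U+2017 |  |  |  |  | F2 |
‘ | U+2018 |  |  | 91 |  |  | D4
’ | U+2019 |  |  | 92 |  |  | D5
‚ | U+201A |  |  | 82 |  |  | E2
“ | U+201C |  |  | 93 |  |  | D2
” | U+201D |  |  | 94 |  |  | D3
„ | U+201E |  |  | 84 |  |  | E3
† | U+2020 |  |  | 86 |  |  | A0
‡ | U+2021 |  |  | 87 |  |  | E0
• | U+2022 |  |  | 95 |  |  | A5
… | U+2026 |  |  | 85 |  |  | C9
‰ | U+2030 |  |  | 89 |  |  | E4
‹ | U+2039 |  |  | 8B |  |  | DC
› | U+203A |  |  | 9B |  |  | DD
⁄ | U+2044 |  |  |  |  |  | DA
ⁿ | U+207F |  |  |  | FC |  |
₧ | U+20A7 |  |  |  | 9E |  |
€ | U+20AC |  | A4 | 80 |  |  | DB
™ | U+2122 |  |  | 99 |  |  | AA
∂ | U+2202 |  |  |  |  |  | B6
∆ | U+2206 |  |  |  |  |  | C6
∏ | U+220F |  |  |  |  |  | B8
∑ | U+2211 |  |  |  |  |  | B7
∙ | U+2219 |  |  |  | F9 |  |
√ | U+221A |  |  |  | FB |  | C3
∞ | U+221E |  |  |  | EC |  | B0
∩ | U+2229 |  |  |  | EF |  |
∫ | U+222B |  |  |  |  |  | BA
≈ | U+2248 |  |  |  | F7 |  | C5
≠ | U+2260 |  |  |  |  |  | AD
≡ | U+2261 |  |  |  | F0 |  |
≤ | U+2264 |  |  |  | F3 |  | B2
≥ | U+2265 |  |  |  | F2 |  | B3
⌐ | U+2310 |  |  |  | A9 |  |
⌠ | U+2320 |  |  |  | F4 |  |
⌡ | U+2321 |  |  |  | F5 |  |
─ | U+2500 |  |  |  | C4 | C4 |
│ | U+2502 |  |  |  | B3 | B3 |
┌ | U+250C |  |  |  | DA | DA |
┐ | U+2510 |  |  |  | BF | BF |
└ | U+2514 |  |  |  | C0 | C0 |
┘ | U+2518 |  |  |  | D9 | D9 |
├ | U+251C |  |  |  | C3 | C3 |
┤ | U+2524 |  |  |  | B4 | B4 |
┬ | U+252C |  |  |  | C2 | C2 |
┴ | U+2534 |  |  |  | C1 | C1 |
┼ | U+253C |  |  |  | C5 | C5 |
═ | U+2550 |  |  |  | CD | CD |
║ | U+2551 |  |  |  | BA | BA |
╒ | U+2552 |  |  |  | D5 |  |
╓ | U+2553 |  |  |  | D6 |  |
╔ | U+2554 |  |  |  | C9 | C9 |
╕ | U+2555 |  |  |  | B8 |  |
╖ | U+2556 |  |  |  | B7 |  |
╗ | U+2557 |  |  |  | BB | BB |
╘ | U+2558 |  |  |  | D4 |  |
╙ | U+2559 |  |  |  | D3 |  |
╚ | U+255A |  |  |  | C8 | C8 |
╛ | U+255B |  |  |  | BE |  |
╜ | U+255C |  |  |  | BD |  |
╝ | U+255D |  |  |  | BC | BC |
╞ | U+255E |  |  |  | C6 |  |
╟ | U+255F |  |  |  | C7 |  |
╠ | U+2560 |  |  |  | CC | CC |
╡ | U+2561 |  |  |  | B5 |  |
╢ | U+2562 |  |  |  | B6 |  |
╣ | U+2563 |  |  |  | B9 | B9 |
╤ | U+2564 |  |  |  | D1 |  |
╥ | U+2565 |  |  |  | D2 |  |
╦ | U+2566 |  |  |  | CB | CB |
╧ | U+2567 |  |  |  | CF |  |
╨ | U+2568 |  |  |  | D0 |  |
╩ | U+2569 |  |  |  | CA | CA |
╪ | U+256A |  |  |  | D8 |  |
╫ | U+256B |  |  |  | D7 |  |
╬ | U+256C |  |  |  | CE | CE |
▀ | U+2580 |  |  |  | DF | DF |
▄ | U+2584 |  |  |  | DC | DC |
█ | U+2588 |  |  |  | DB | DB |
▌ | U+258C |  |  |  | DD |  |
▐ | U+2590 |  |  |  | DE |  |
░ | U+2591 |  |  |  | B0 | B0 |
▒ | U+2592 |  |  |  | B1 | B1 |
▓ | U+2593 |  |  |  | B2 | B2 |
■ | U+25A0 |  |  |  | FE | FE |
◊ | U+25CA |  |  |  |  |  | D7
ﬁ | U+FB01 |  |  |  |  |  | DE
ﬂ | U+FB02 |  |  |  |  |  | DF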

  • The mappings for the IBM code pages are those supplied by Microsoft and published on the Unicode site. The Unicode Consortium's document has links to sources giving the differences between IBM's and Microsoft's mappings for these code pages.
  • IBM437 and IBM850 defined printable characters for the control code ranges. While these could not be used when printing text through DOS, as they would be trapped before reaching the screen, they could be used by applications that wrote to screen memory directly.
  • MACINTOSH has an Apple logo at 0xF0, which is translated to U+F8FF in the Unicode Private Use Area.
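
Individual rows of the comparison table, and the notes above, can be checked against a programming language's codec tables. The following is a minimal sketch in Python, assuming that its standard codecs correspond to the IANA names used as column headings; `COLUMNS`, `table_row` and `ascii_range_is_shared` are illustrative names, and Python's tables may differ in detail from the sources cited in the notes.

```python
# Column heading -> Python codec name (an assumed correspondence).
COLUMNS = {
    "ISO-8859-1": "latin_1",
    "ISO-8859-15": "iso8859_15",
    "WINDOWS-1252": "cp1252",
    "IBM437": "cp437",
    "IBM850": "cp850",
    "MACINTOSH": "mac_roman",
}

def table_row(char):
    """Return the code point, then one hex byte (or '') per column."""
    cells = [f"U+{ord(char):04X}"]
    for codec in COLUMNS.values():
        try:
            cells.append(char.encode(codec).hex().upper())
        except UnicodeEncodeError:
            cells.append("")  # character absent from this set
    return cells

def ascii_range_is_shared():
    """Check that bytes 0x00-0x7F decode identically in every column."""
    low = bytes(range(128))
    expected = low.decode("ascii")
    return all(low.decode(codec) == expected for codec in COLUMNS.values())

print(table_row("\u00a4"))           # currency sign
print(table_row("\u20ac"))           # euro sign
print(ascii_range_is_shared())
print(hex(ord(b"\xf0".decode("mac_roman"))))  # Apple logo position
```

In this sketch the codecs map the control code range of the IBM code pages to control characters rather than to the printable glyphs mentioned in the notes, which is consistent with those positions being omitted from the table.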