ISO basic Latin alphabet
The ISO basic Latin alphabet is an international standard for a Latin-script alphabet consisting of two sets (uppercase and lowercase) of 26 letters each, codified in various national and international standards and used widely in international communication. They are the same letters that make up the current English alphabet and, since medieval times, the modern Latin alphabet. The order of the letters also matters, because it is used to sort words into alphabetical order.
The two sets contain the following 26 letters each:
| Uppercase letter set | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z |
| Lowercase letter set | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z |
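As a minimal illustration, the two sets above correspond to the constants `string.ascii_uppercase` and `string.ascii_lowercase` in Python's standard library. The sketch below pairs the letters by position and sorts a few words alphabetically; the names `UPPERCASE`, `LOWERCASE`, `PAIRS`, and `alphabetical` are illustrative only and not part of any standard.

```python
import string

# The two 26-letter sets, as exposed by Python's standard library.
UPPERCASE = string.ascii_uppercase  # 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
LOWERCASE = string.ascii_lowercase  # 'abcdefghijklmnopqrstuvwxyz'

# Each uppercase letter pairs with the lowercase letter at the same position.
PAIRS = list(zip(UPPERCASE, LOWERCASE))


def alphabetical(words):
    """Sort words by letter order, ignoring case (illustrative helper)."""
    return sorted(words, key=str.lower)


if __name__ == "__main__":
    print(len(UPPERCASE), len(LOWERCASE))
    print(PAIRS[:3])
    print(alphabetical(["zebra", "Apple", "mango"]))
```

Sorting with `key=str.lower` is one simple way to ignore case; real-world collation rules for a given language can be more involved.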
History
By the 1960s it had become apparent to the computer and telecommunications industries in the First World that a non-proprietary method of encoding characters was needed. The International Organization for Standardization (ISO) encapsulated the Latin script in its 7-bit character-encoding standard. To achieve widespread acceptance, this encapsulation was based on popular usage. The standard was based on the already published American Standard Code for Information Interchange, better known as ASCII, whose character set included the 26 × 2 letters of the English alphabet. Later standards issued by the ISO, for example ISO/IEC 8859 and ISO/IEC 10646, have continued to define the 26 × 2 letters of the English alphabet as the basic Latin script, with extensions to handle other letters in other languages.
Terminology
The Unicode block that contains the alphabet is called "C0 Controls and Basic Latin". Two subheadings exist:
- "Uppercase Latin alphabet": the letters start at U+0041 and contain the string LATIN CAPITAL LETTER in their descriptions
- "Lowercase Latin alphabet": the letters start at U+0061 and contain the string LATIN SMALL LETTER in their descriptions
Two further sets of the same letters are encoded in the "Halfwidth and Fullwidth Forms" block:
- Uppercase: the letters start at U+FF21 and contain the string FULLWIDTH LATIN CAPITAL LETTER in their descriptions
- Lowercase: the letters start at U+FF41 and contain the string FULLWIDTH LATIN SMALL LETTER in their descriptions
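The code points and descriptions listed above can be checked with Python's standard `unicodedata` module. The following is a minimal sketch; the helper name `describe` is hypothetical, and the exact names returned depend on the Unicode data shipped with the Python version in use (the names of these letters are stable across versions).

```python
import unicodedata


def describe(start, count=26):
    """Return (code point, character, Unicode name) for `count` consecutive code points."""
    return [
        (f"U+{cp:04X}", chr(cp), unicodedata.name(chr(cp)))
        for cp in range(start, start + count)
    ]


if __name__ == "__main__":
    # Starting code points of the four letter ranges mentioned in the text.
    for start in (0x0041, 0x0061, 0xFF21, 0xFF41):
        rows = describe(start)
        print(rows[0], "...", rows[-1])
```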
Timeline for encoding standards
- 1865: International Morse Code was standardized at the International Telegraphy Congress in Paris, and was later made the standard by the International Telecommunication Union
- 1950s: Radiotelephony Spelling Alphabet by ICAO
Timeline for widely used computer codes supporting the alphabet
- 1963: ASCII
- 1963/1964: EBCDIC
- 1965-04-30: Ratified by ECMA as ECMA-6, based on work that ECMA's Technical Committee TC1 had carried out since December 1960.
- 1972: ISO 646
- 1983: ITU-T Rec. T.51 | ISO/IEC 6937
- 1987: ISO/IEC 8859-1:1987
  - Subsequently, other versions and parts of ISO/IEC 8859 have been published.
- Mid-to-late 1980s: Windows-1250, Windows-1252, and other encodings used in Microsoft Windows
- 1991: Unicode 1.0, contained in the block "C0 Controls and Basic Latin" using the same alphabetic code values as ASCII and ISO/IEC 646
  - Subsequently, other versions of Unicode have been published and it later became a joint ISO/IEC standard as well, as identified below.
- 1993: ISO/IEC 10646-1:1993, ISO/IEC standard for characters in Unicode 1.1
  - Subsequently, other versions of ISO/IEC 10646-1 and one of ISO/IEC 10646-2 have been published. Since 2003, the standards have been published under the name "ISO/IEC 10646" without the separation into two parts.
- 1997: Windows Glyph List 4
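Several of the codes in this timeline assign the same values to the 26 × 2 letters as ASCII does, while EBCDIC does not. The sketch below compares the encoded bytes using codec names registered in Python's standard library; `cp037` is used here as one representative EBCDIC code page (an assumption for illustration, since EBCDIC exists in many variants), and `encoded_bytes` is a hypothetical helper.

```python
import string

# Codecs whose letter values are expected to match ASCII.
# "latin-1" is Python's name for ISO/IEC 8859-1.
ASCII_COMPATIBLE = ["ascii", "latin-1", "cp1250", "cp1252", "utf-8"]

# One EBCDIC code page available in Python; chosen only as an example.
EBCDIC_EXAMPLE = "cp037"


def encoded_bytes(letter, codec):
    """Encode a single letter with the given codec and return the raw bytes."""
    return letter.encode(codec)


if __name__ == "__main__":
    for codec in ASCII_COMPATIBLE + [EBCDIC_EXAMPLE]:
        print(codec, encoded_bytes("A", codec).hex(), encoded_bytes("a", codec).hex())

    # Every basic Latin letter encodes to the same single byte in the
    # ASCII-compatible codecs.
    for letter in string.ascii_letters:
        values = {encoded_bytes(letter, c) for c in ASCII_COMPATIBLE}
        assert len(values) == 1
```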
Representation
Every letter has a code word in the ICAO spelling alphabet and can be represented in Morse code. Neither representation is case sensitive.
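A minimal sketch of both representations follows. The tables hold the 26 ICAO code words and the International Morse Code patterns for the letters only; the function names `spell` and `to_morse` are hypothetical, and input is folded to uppercase because neither system distinguishes case.

```python
import string

# ICAO spelling-alphabet code words, in alphabetical order A to Z.
ICAO = (
    "Alfa Bravo Charlie Delta Echo Foxtrot Golf Hotel India Juliett Kilo Lima Mike "
    "November Oscar Papa Quebec Romeo Sierra Tango Uniform Victor Whiskey X-ray "
    "Yankee Zulu"
).split()

# International Morse Code patterns for A to Z (dot = ".", dash = "-").
MORSE = (
    ".- -... -.-. -.. . ..-. --. .... .. .--- -.- .-.. -- -. --- .--. --.- .-. "
    "... - ..- ...- .-- -..- -.-- --.."
).split()

ICAO_BY_LETTER = dict(zip(string.ascii_uppercase, ICAO))
MORSE_BY_LETTER = dict(zip(string.ascii_uppercase, MORSE))


def spell(word):
    """Spell a word with ICAO code words; raises KeyError for non-letters."""
    return [ICAO_BY_LETTER[ch] for ch in word.upper()]


def to_morse(word):
    """Render a word in Morse code, one space between letters."""
    return " ".join(MORSE_BY_LETTER[ch] for ch in word.upper())


if __name__ == "__main__":
    print(spell("Iso"))
    print(to_morse("sos"))
```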
Usage
All of the lowercase letters are used in the International Phonetic Alphabet. In X-SAMPA and SAMPA these letters have the same sound value as in IPA.
Alphabets containing the same set of letters
The list below includes only alphabets that contain all 26 letters but exclude:
- letters whose diacritical marks make them distinct letters.
- multigraphs that constitute distinct letters.
- ligatures that are distinct letters.
| Alphabet | Diacritics | Multigraphs | Ligatures |
| Afrikaans alphabet | á, ä, é, è, ê, ë, í, î, ï, ó, ô, ö, ú, û, ü, ý | Digraphs: ⟨aa⟩, ⟨ai⟩, ⟨ch⟩, ⟨ee⟩, ⟨ei⟩, ⟨eu⟩, ⟨gh⟩, ⟨ie⟩, ⟨nj⟩, ⟨ng⟩, ⟨oe⟩, ⟨oi⟩, ⟨oo⟩, ⟨ou⟩, ⟨sj⟩, ⟨tj⟩, ⟨ts⟩, ⟨ui⟩, ⟨uu⟩ Trigraphs: ⟨aai⟩, ⟨eeu⟩, ⟨oei⟩, ⟨ooi⟩ | ŉ |
| Aragonese alphabet | á, é, í, ó, ú, ü, lꞏl | ⟨ch⟩, ⟨gu⟩, ⟨ll⟩, ⟨ny⟩, ⟨qu⟩, ⟨rr⟩, ⟨tz⟩ | |
| Catalan alphabet | à, é, è, í, ï, ó, ò, ú, ü, ç, lꞏl | ⟨gu⟩, ⟨ig⟩, ⟨ix⟩, ⟨ll⟩, ⟨ny⟩, ⟨qu⟩, ⟨rr⟩, ⟨ss⟩ | |
| Dutch alphabet | ä, é, è, ë, ï, ö, ü | The digraph ⟨ij⟩ is sometimes considered to be a separate letter. When that is the case, it usually replaces or is intermixed with ⟨y⟩. Other digraphs: ⟨aa⟩, ⟨ae⟩, ⟨ai⟩, ⟨au⟩, ⟨ch⟩, ⟨ee⟩, ⟨ei⟩, ⟨eu⟩, ⟨ie⟩, ⟨oe⟩, ⟨oi⟩, ⟨oo⟩, ⟨ou⟩, ⟨ui⟩, ⟨uu⟩ | |
| English alphabet | | ⟨sh⟩, ⟨ch⟩, ⟨ea⟩, ⟨ou⟩, ⟨th⟩, ⟨ph⟩, ⟨ng⟩ | æ, œ |
| French alphabet | à, â, ç, é, è, ê, ë, î, ï, ô, ù, û, ü, ÿ | ⟨ai⟩, ⟨au⟩, ⟨ei⟩, ⟨eu⟩, ⟨oi⟩, ⟨ou⟩, ⟨eau⟩, ⟨ch⟩, ⟨ph⟩, ⟨gn⟩, ⟨an⟩, ⟨am⟩, ⟨en⟩, ⟨em⟩, ⟨in⟩, ⟨im⟩, ⟨on⟩, ⟨om⟩, ⟨un⟩, ⟨um⟩, ⟨yn⟩, ⟨ym⟩, ⟨ain⟩, ⟨aim⟩, ⟨ein⟩, ⟨oin⟩, ⟨aî⟩, ⟨eî⟩ | æ, œ |
| Hmong Latin alphabet | | ⟨bh⟩, ⟨bl⟩, ⟨ch⟩, ⟨dh⟩, ⟨dl⟩, ⟨gh⟩, ⟨hl⟩, ⟨hm⟩, ⟨hn⟩, ⟨jh⟩, ⟨kh⟩, ⟨ml⟩, ⟨nc⟩, ⟨nq⟩, ⟨nr⟩, ⟨nt⟩, ⟨nx⟩, ⟨ny⟩, ⟨ph⟩, ⟨pl⟩, ⟨qh⟩, ⟨rh⟩, ⟨th⟩, ⟨ts⟩, ⟨tx⟩, ⟨xy⟩, ⟨bhl⟩, ⟨dhl⟩, ⟨hml⟩, ⟨hny⟩, ⟨nch⟩, ⟨ndl⟩, ⟨ngh⟩, ⟨nrh⟩, ⟨nth⟩, ⟨nxh⟩, ⟨phl⟩, ⟨tsh⟩, ⟨txh⟩, ⟨ndhl⟩ | |
| Italian alphabet | à, è, é, ì, î, ò, ó, ù | ⟨ch⟩, ⟨ci⟩, ⟨gh⟩, ⟨gi⟩, ⟨gl⟩, ⟨gli⟩, ⟨gn⟩, ⟨sc⟩, ⟨sci⟩ | |
| Ido alphabet* | | ⟨qu⟩, ⟨ch⟩, ⟨sh⟩ | - |
| Indonesian alphabet | | ⟨kh⟩, ⟨ng⟩, ⟨ny⟩, ⟨sy⟩, diphthongs: ⟨ai⟩, ⟨au⟩, ⟨ei⟩, ⟨oi⟩ | |
| Interlingua alphabet* | | ⟨ch⟩, ⟨ph⟩, ⟨qu⟩, ⟨rh⟩, ⟨sh⟩ | - |
| Javanese Latin alphabet | é, è | ⟨dh⟩, ⟨kh⟩, ⟨ng⟩, ⟨ny⟩, ⟨sy⟩, ⟨th⟩ | |
| Latino sine flexione alphabet* | | ⟨ae⟩, ⟨ch⟩, ⟨oe⟩, ⟨ph⟩, ⟨qu⟩, ⟨rh⟩, ⟨th⟩ | |
| Luxembourgish alphabet | ä, é, ë | ⟨aa⟩, ⟨ch⟩, ⟨ck⟩, ⟨ee⟩, ⟨ei⟩, ⟨ie⟩, ⟨ii⟩, ⟨ng⟩, ⟨oo⟩, ⟨ou⟩, ⟨qu⟩, ⟨ue⟩, ⟨uu⟩, ⟨sch⟩ | |
| Malay alphabet | | ⟨gh⟩, ⟨kh⟩, ⟨ng⟩, ⟨ny⟩, ⟨sy⟩ | |
| Portuguese alphabet | ã, õ, á, é, í, ó, ú, â, ê, ô, à, ç | ⟨ch⟩, ⟨lh⟩, ⟨nh⟩, ⟨rr⟩, ⟨ss⟩, ⟨am⟩, ⟨em⟩, ⟨im⟩, ⟨om⟩, ⟨um⟩, ⟨ãe⟩, ⟨ão⟩, ⟨õe⟩ | |
| Sundanese Latin alphabet | é | ⟨eu⟩, ⟨ng⟩, ⟨ny⟩ | |
| Xhosa alphabet | | ⟨bh⟩, ⟨ch⟩, ⟨dl⟩, ⟨dy⟩, ⟨dz⟩, ⟨gc⟩, ⟨gq⟩, ⟨gr⟩, ⟨gx⟩, ⟨hh⟩, ⟨hl⟩, ⟨kh⟩, ⟨kr⟩, ⟨krh⟩, ⟨lh⟩, ⟨mh⟩, ⟨nc⟩, ⟨ng⟩, ⟨ngʼ⟩, ⟨ngc⟩, ⟨ngh⟩, ⟨ngq⟩, ⟨ngx⟩, ⟨nh⟩, ⟨nkc⟩, ⟨nkq⟩, ⟨nkx⟩, ⟨nq⟩, ⟨nx⟩, ⟨ny⟩, ⟨nyh⟩, ⟨ph⟩, ⟨qh⟩, ⟨rh⟩, ⟨sh⟩, ⟨th⟩, ⟨ths⟩, ⟨thsh⟩, ⟨ts⟩, ⟨tsh⟩, ⟨ty⟩, ⟨tyh⟩, ⟨wh⟩, ⟨xh⟩, ⟨yh⟩, ⟨zh⟩ | |
| Zulu alphabet | | ⟨bh⟩, ⟨ch⟩, ⟨dl⟩, ⟨dy⟩, ⟨gc⟩, ⟨gq⟩, ⟨gx⟩, ⟨hh⟩, ⟨hl⟩, ⟨kh⟩, ⟨kl⟩, ⟨mb⟩, ⟨nc⟩, ⟨ng⟩, ⟨ngc⟩, ⟨ngq⟩, ⟨ngx⟩, ⟨nj⟩, ⟨nk⟩, ⟨nq⟩, ⟨ntsh⟩, ⟨nx⟩, ⟨ny⟩, ⟨ph⟩, ⟨qh⟩, ⟨sh⟩, ⟨th⟩, ⟨ts⟩, ⟨tsh⟩, ⟨xh⟩ | - |
- English is one of the few modern European languages requiring no diacritics for native words.
- Interlingua, a constructed language, never uses diacritics except in unassimilated loanwords. However, they can be removed if they are not used to modify the vowel.
- Latino sine flexione, a.k.a. "Peano's Interlingua", allows but does not require the placement of an accent for unusual stress.
- Malay and Indonesian use all the letters of the basic Latin alphabet and require no diacritics or ligatures. However, Malay and Indonesian learning materials may use ⟨é⟩ and ⟨è⟩ to clarify the pronunciation of the letter E; in that case, ⟨e⟩ is pronounced /ə/, ⟨é⟩ is pronounced /e/, and ⟨è⟩ is pronounced /ɛ/. Many of the 700+ languages of Indonesia are also written with the Indonesian alphabet, some (such as Javanese) adding the diacritics é and è, and some omitting q, x, and z.
- Xhosa is usually written without diacritics, but may optionally use diacritics to mark tones.
Column numbering
The letters are often used to index nested bullet points. In that case, after the 26th item it is more common to continue with AA, BB, CC, ... than with base-26 numbers.
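The two continuation schemes can be contrasted in a short sketch. The function names are hypothetical, and "base-26 numbers" is interpreted here as the spreadsheet-style sequence A, ..., Z, AA, AB, ... (often called bijective base-26); that interpretation is an assumption rather than something the text specifies.

```python
import string

LETTERS = string.ascii_uppercase


def repeated_label(n):
    """Map a 1-based index to A..Z, AA, BB, ..., ZZ, AAA, ... (repeated letters)."""
    if n < 1:
        raise ValueError("index must be at least 1")
    repeats, position = divmod(n - 1, 26)
    return LETTERS[position] * (repeats + 1)


def bijective_base26_label(n):
    """Map a 1-based index to A..Z, AA, AB, ..., AZ, BA, ... (spreadsheet style)."""
    if n < 1:
        raise ValueError("index must be at least 1")
    label = ""
    while n > 0:
        # Subtracting 1 first makes the digits run A..Z with no zero digit.
        n, position = divmod(n - 1, 26)
        label = LETTERS[position] + label
    return label


if __name__ == "__main__":
    for index in (1, 26, 27, 28, 53):
        print(index, repeated_label(index), bijective_base26_label(index))
```

Both schemes agree for the first 26 labels and diverge from the 28th onward, where the repeated-letter scheme yields BB and the positional scheme yields AB.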