Unicode subscripts and superscripts
Unicode has subscripted and superscripted versions of a number of characters, including a full set of Arabic numerals. These characters allow any polynomial, chemical and certain other equations to be represented in plain text without using any form of markup like HTML or TeX.
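As a rough illustration of writing formulas in plain text, the following Python sketch maps ASCII digits and signs to their superscript and subscript counterparts. The names `to_superscript` and `to_subscript` are hypothetical helpers chosen for this example, not part of any standard library.

```python
# Translation tables from ASCII digits and signs to Unicode script characters.
# Superscript 1, 2 and 3 live in the Latin-1 range; the rest are in U+2070..U+209F.
SUPERSCRIPTS = str.maketrans("0123456789+-", "⁰¹²³⁴⁵⁶⁷⁸⁹⁺⁻")
SUBSCRIPTS = str.maketrans("0123456789+-", "₀₁₂₃₄₅₆₇₈₉₊₋")


def to_superscript(text: str) -> str:
    """Replace digits and signs with superscript forms; other characters pass through."""
    return text.translate(SUPERSCRIPTS)


def to_subscript(text: str) -> str:
    """Replace digits and signs with subscript forms; other characters pass through."""
    return text.translate(SUBSCRIPTS)


# A chemical formula and a polynomial written without any markup.
water = "H" + to_subscript("2") + "O"
polynomial = "x" + to_superscript("2") + " + 3x + 2"

print(water)
print(polynomial)
```

How the result looks depends on the font, since the glyphs for these characters are not always designed to match one another.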
The World Wide Web Consortium and the Unicode Consortium have made recommendations on the choice between using markup and using superscript and subscript characters:
When used in mathematical context it is recommended to consistently use style markup for superscripts and subscripts. However, when super and sub-scripts are to reflect semantic distinctions, it is easier to work with these meanings encoded in text rather than markup, for example, in phonetic or phonemic transcription.
Uses
The intended use when these characters were added to Unicode was to produce true superscripts and subscripts so that chemical and algebraic formulas could be written without markup. Thus a formula written with these characters is supposed to look identical to the same formula written with superscript or subscript markup. In reality, many fonts that include these characters ignore the Unicode definition and instead design the digits as mathematical numerator and denominator glyphs, which are aligned with the cap line and the baseline, respectively. When used with the solidus or the fraction slash, they produce an almost typographically correct diagonal fraction. Super and subscript markup does not produce a correct fraction. The change also makes the superscript letters useful for ordinal indicators, more closely matching the ª and º characters.
Unicode intended that diagonal fractions be rendered by a different mechanism: the fraction slash U+2044 is visually similar to the solidus, but when used with the ordinary digits, it instructs the layout system that the sequence is to be rendered as a fraction using automatic glyph substitution. User-end support was quite poor for a number of years, but fonts, browsers, word processors, desktop publishing software and others increasingly support the intended Unicode behavior.
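The two ways of writing a diagonal fraction described above can be sketched in Python as follows. The function names are hypothetical and chosen only for this illustration. Whether either string actually displays as a well-formed fraction depends on the font and layout engine, which this code cannot check.

```python
import unicodedata

SOLIDUS = "/"              # U+002F, the ordinary slash
FRACTION_SLASH = "\u2044"  # U+2044, visually similar but semantically distinct

SUPERSCRIPT_DIGITS = str.maketrans("0123456789", "⁰¹²³⁴⁵⁶⁷⁸⁹")
SUBSCRIPT_DIGITS = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")


def fraction_slash_form(numerator: int, denominator: int) -> str:
    """Ordinary digits around U+2044: the mechanism Unicode intended for fractions."""
    return f"{numerator}{FRACTION_SLASH}{denominator}"


def script_digit_form(numerator: int, denominator: int) -> str:
    """Superscript numerator and subscript denominator around the fraction slash."""
    top = str(numerator).translate(SUPERSCRIPT_DIGITS)
    bottom = str(denominator).translate(SUBSCRIPT_DIGITS)
    return top + FRACTION_SLASH + bottom


print(unicodedata.name(FRACTION_SLASH))
print(fraction_slash_form(3, 4))
print(script_digit_form(3, 4))
```

The first form leaves the glyph substitution to the renderer, while the second relies on fonts that draw the script digits as numerator and denominator glyphs.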
Superscripts and subscripts block
The most common superscript digits (1, 2, and 3) were included in ISO-8859-1 and were therefore carried over into those code points in the Latin-1 range of Unicode. The remainder were placed along with basic arithmetical symbols, and later some Latin subscripts, in a dedicated block at U+2070 to U+209F. The table below shows these characters together. Each superscript or subscript character is preceded by a baseline character to show the height of subscripting/superscripting. Six code points in the "Superscripts and Subscripts" block are unassigned and remain available for future characters. Three of these were provisionally assigned to new Latin lowercase subscript characters.
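One way to inspect the block is to walk the range U+2070 to U+209F with Python's standard `unicodedata` module, as in the sketch below. The helper name `block_names` is an assumption of this example. Which code points are reported as unassigned depends on the Unicode version bundled with the Python interpreter, so the list may differ from the count given above.

```python
import unicodedata

BLOCK_START, BLOCK_END = 0x2070, 0x209F  # Superscripts and Subscripts block


def block_names() -> dict:
    """Map every code point in the block to its character name, or None if unassigned."""
    return {
        cp: unicodedata.name(chr(cp), None)
        for cp in range(BLOCK_START, BLOCK_END + 1)
    }


names = block_names()
unassigned = [f"U+{cp:04X}" for cp, name in names.items() if name is None]

print("Unassigned in this Unicode version:", unassigned)

# Superscript 1, 2 and 3 sit in the Latin-1 range, not in the dedicated block.
for char in "\u00b9\u00b2\u00b3":
    print(f"U+{ord(char):04X}", unicodedata.name(char))
```

The positions where superscript 2 and 3 would fall in the dedicated block (U+2072 and U+2073) show up as unassigned because those characters were already encoded in the Latin-1 range.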
Other superscript and subscript characters
Unicode also includes code points for subscript and superscript characters that are intended for semantic usage, in the following blocks:
Superscript
- The Latin-1 Supplement block contains the feminine and masculine ordinal indicators, ª and º.
- The Latin Extended-C block contains one superscript, ⱽ.
- The Latin Extended-D block contains seven superscripts.
- The Latin Extended-E block contains five superscripts.
- The Latin Extended-F block is entirely superscript IPA letters.
- The Spacing Modifier Letters block has superscripted letters and symbols used for phonetic transcription.
- The Phonetic Extensions block has several superscripted letters and symbols (Latin/IPA, Greek, Cyrillic, and other). These are intended to indicate secondary articulation.
- The Phonetic Extensions Supplement block has several more (Latin/IPA and Greek).
- The Cyrillic Extended-B block contains two Cyrillic superscripts, ꚜ and ꚝ.
- The Cyrillic Extended-D block contains many Cyrillic superscripts.
- The Georgian block contains one superscripted Mkhedruli letter.
- The Kanbun block has superscripted annotation characters used in Japanese copies of Classical Chinese texts.
- The Tifinagh block has one superscript letter.
- The Unified Canadian Aboriginal Syllabics block and its Extended blocks contain several mostly consonant-only letters that indicate a syllable coda, called Finals, along with some characters that indicate a syllable medial, known as Medials.
- The Combining Diacritical Marks block contains medieval superscript letter diacritics. These letters are written directly above other letters appearing in medieval Germanic manuscripts, and so these glyphs do not include spacing, for example uͤ. They are shown here over the dotted circle placeholder ◌.
- The Combining Diacritical Marks Extended block contains three combining insular letters for the Middle English Ormulum.
- The Combining Diacritical Marks Supplement block contains additional medieval superscript letter diacritics, enough to complete most of the basic lowercase Latin alphabet, as well as a few small capitals and ligatures and some additional Latin and Greek letters.
- The Cyrillic Extended-A and -B blocks contain multiple medieval superscript letter diacritics, enough to complete the basic lowercase Cyrillic alphabet used in Church Slavonic texts, as well as an additional ligature.
- The Cyrillic Extended-D block has one additional combining character, for the letter і.
Subscript
- The Latin Extended-C block contains one subscript, ⱼ.
- The Phonetic Extensions block has several subscripted letters and symbols (Latin/IPA and Greek).
- The Cyrillic Extended-D block also contains many Cyrillic subscripts.
- The Combining Diacritical Marks Supplement block contains a combining subscript.
- The Combining Diacritical Marks Extended block contains two combining letters for linguistic transcriptions of Scots.
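Many of the spacing characters listed above carry a compatibility decomposition tagged `<super>` or `<sub>` in the Unicode Character Database. The sketch below reads that tag with Python's standard `unicodedata` module. The function name `script_kind` is hypothetical, and the approach is only a heuristic: it assumes the character has such a tagged decomposition, which is not the case for the combining letter diacritics and may not hold for every character in these blocks.

```python
import unicodedata


def script_kind(char: str):
    """Return 'superscript' or 'subscript' based on the compatibility
    decomposition tag, or None when the character has no such tag."""
    decomposition = unicodedata.decomposition(char)
    if decomposition.startswith("<super>"):
        return "superscript"
    if decomposition.startswith("<sub>"):
        return "subscript"
    return None


# An ordinal indicator, a subscript digit, a modifier letter,
# a plain letter, and a combining superscript letter (U+0364).
for char in ["\u00aa", "\u2082", "\u02b0", "a", "\u0364"]:
    print(f"U+{ord(char):04X}", script_kind(char))

# Compatibility normalization (NFKC) folds tagged characters to their base forms.
print(unicodedata.normalize("NFKC", "H\u2082O"))
```

Because NFKC normalization discards the distinction, text that relies on these characters for meaning should not be passed through it.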
Latin, Greek, Cyrillic, and IPA tables
Little punctuation is encoded. Parentheses are shown in the basic superscript block above, and the exclamation mark is shown in the IPA table below. In a supporting font, a question mark may be created with a superscript gelded question mark and a combining dot below.
| | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z |
| Superscript capital | ᴬ | ᴮ | ꟲ | ᴰ | ᴱ | ꟳ | ᴳ | ᴴ | ᴵ | ᴶ | ᴷ | ᴸ | ᴹ | ᴺ | ᴼ | ᴾ | ꟴ | ᴿ | | ᵀ | ᵁ | ⱽ | ᵂ | | | |
| Superscript small capital | | ? | | | | | ? | ? | ᶦ | | | ᶫ | | ᶰ | | | | ? | | | ᶸ | | | | ? | |
| Superscript minuscule | ᵃ | ᵇ | ᶜ | ᵈ | ᵉ | ᶠ | ᵍ | ʰ | ⁱ | ʲ | ᵏ | ˡ | ᵐ | ⁿ | ᵒ | ᵖ | ? | ʳ | ˢ | ᵗ | ᵘ | ᵛ | ʷ | ˣ | ʸ | ᶻ |
| Overscript small capital | | | | | | | ◌ᷛ | | | | | ◌ᷞ | ◌ᷟ | ◌ᷡ | | | | ◌ᷢ | | | | | | | | |
| Overscript minuscule | ◌ͣ | ◌ᷨ | ◌ͨ | ◌ͩ | ◌ͤ | ◌ᷫ | ◌ᷚ | ◌ͪ | ◌ͥ | | ◌ᷜ | ◌ᷝ | ◌ͫ | ◌ᷠ | ◌ͦ | ◌ᷮ | | ◌ͬ | ◌ᷤ | ◌ͭ | ◌ͧ | ◌ͮ | ◌ᷱ | ◌ͯ | | ◌ᷦ |
| Subscript minuscule | ₐ | | | | ₑ | | | ₕ | ᵢ | ⱼ | ₖ | ₗ | ₘ | ₙ | ₒ | ₚ | | ᵣ | ₛ | ₜ | ᵤ | ᵥ | | ₓ | | |
| Underscript minuscule | | | | | | | | | | | | | | | | | | ◌᷊ | | | | | ◌ᪿ | | | |
§ Cyrillic ? ? ?, ◌ⷡ ◌ⷩ ◌ⷦ ◌ⷮ ◌ꙷ and ? might be substituted for these letters.
Some of these superscript capitals are small caps in the source documents in the Unicode proposals. Superscript Ä, Ö, Ü are composed of the base letter and a combining tréma.
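Because only part of the Latin alphabet has subscript forms, code that converts text needs to cope with gaps. The following Python sketch uses the letters from the "Subscript minuscule" row of the table above. The helper `subscript_letters` and its return shape are assumptions made for this example.

```python
# The lowercase Latin letters that have a subscript form in the table above.
SUBSCRIPT_LETTERS = dict(zip("aehijklmnoprstuvx", "ₐₑₕᵢⱼₖₗₘₙₒₚᵣₛₜᵤᵥₓ"))


def subscript_letters(text: str):
    """Convert letters to subscripts where a form exists.

    Returns the converted string and a list of the characters that were
    left unchanged because no subscript form is available for them.
    """
    converted = []
    missing = []
    for char in text:
        if char in SUBSCRIPT_LETTERS:
            converted.append(SUBSCRIPT_LETTERS[char])
        else:
            converted.append(char)
            missing.append(char)
    return "".join(converted), missing


print(subscript_letters("max"))  # every letter has a subscript form
print(subscript_letters("big"))  # "b" and "g" have none and are reported
```

Returning the unconverted characters lets the caller decide whether to fall back to markup for the whole word rather than mix subscript and baseline letters.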
Except for the iota subscript, which has use in Greek text, the modifier Greek letters are intended as phonetic characters in Latin-script text. Shaded cells are indistinguishable from Latin letters, and so would not be expected to have distinctive use in Latin text or to be supported by Unicode.
| | Α | Β | Γ | Δ | Ε | Ζ | Η | Θ | Ι | Κ | Λ | Μ | Ν | Ξ | Ο | Π | Ρ | Σ | Τ | Υ | Φ | Χ | Ψ | Ω |
| Superscript minuscule | ᵅ | ᵝ | ᵞ | ᵟ | ᵋ | | | ᶿ | ᶥ | | | | | | | | | | | ᶹ | ᵠ | ᵡ | | |
| Overscript minuscule | ◌ᷧ | ◌ᷩ | | | | | | | | | | | | | | | | | | | | | | ◌᫇ |
| Subscript minuscule | | ᵦ | ᵧ | | | | | | ͺ | | | | | | | | ᵨ | | | | ᵩ | ᵪ | | |
| Underscript minuscule | | | | | | | | | ◌ͅ | | | | | | | | | | | | | | | ◌̫ |
Cyrillic modifier characters are intended for use in Cyrillic text.
| А | Б | В | Г | Д | Е | Ж | З | И | К | Л | М | Н | О | П | Р | С | Т | У | Ф | Х | Ц | Ч | Ш | Щ | Ъ | Ы | Ь | Э | Ю | Я | |
| Superscript | ? | ? | ? | ? | ? | ? | ? | ? | ? | ? | ? | ? | ᵸ | ? | ? | ? | ? | ? | ? | ? | ? | ? | ? | ? | ꚜ | ? | ꚝ | ? | ? | ||
| Overscript | ◌ⷶ | ◌ⷠ | ◌ⷡ | ◌ⷢ | ◌ⷣ | ◌ⷷ | ◌ⷤ | ◌ⷥ | ◌ꙵ | ◌ⷦ | ◌ⷧ | ◌ⷨ | ◌ⷩ | ◌ⷪ | ◌ⷫ | ◌ⷬ | ◌ⷭ | ◌ⷮ | ◌ꙷ | ◌ꚞ | ◌ⷯ | ◌ⷰ | ◌ⷱ | ◌ⷲ | ◌ⷳ | ◌ꙸ | ◌ꙹ | ◌ꙺ | ◌ⷻ | ||
| Subscript | ? | ? | ? | ? | ? | ? | ? | ? | ? | ? | ? | ? | ? | ? | ? | ? | ? | ? | ? | ? | ? | ? |
| Ә | Ґ | Є | Ѕ | Ꚉ | І | Ї | Ј | Ө | Ҫ | Ү | Ұ | Џ | Ӏ | |
| Superscript | ? | ? | ? | ?̈ | ? | ? | ? | ? | ? | ? | ||||
| Overscript | ◌ꙴ | ◌𞂏 | ◌ꙶ | |||||||||||
| Subscript | ? | ? | ? | ?̈ | ? |
| Ꙋ | Ѡ | Ѣ | Ꙗ | Ѥ | Ѧ | Ѫ | Ѭ | Ѳ | Ꙑ | |
| Superscript | ? | |||||||||
| Overscript | ◌ⷹ | ◌ꙻ | ◌ⷺ | ◌ⷼ | ◌ꚟ | ◌ⷽ | ◌ⷾ | ◌ⷿ | ◌ⷴ |
Superscript and subscript ё, ї, й, ў and similar letters are handled with diacritics. Many of the Cyrillic characters were added in the Cyrillic Extended-D block, support for which was added to the free Gentium and Andika fonts with version 6.2 in February 2023.
See also Unicode Small caps, Fullwidths, and Mathematical alphanumerics.