Variant form (Unicode)
A variant form is an alternate glyph for a character, encoded in Unicode through the mechanism of variation sequences: sequences in Unicode that consist of a base character followed by a variation selector character.
A variant form usually has an appearance and meaning very similar to those of its base form. The mechanism is intended for variant forms where, if the variant form is unavailable, displaying the base character generally does not change the meaning of the text and may not even be noticeable to many readers.
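As a minimal sketch of the structure described above, the following Python snippet builds one variation sequence and lists its code points. The choice of U+2764 as the base character and the helper name `describe` are illustrative assumptions, not part of any standard API.

```python
import unicodedata

# U+2764 HEAVY BLACK HEART is used here purely as an illustrative base character.
BASE = "\u2764"
# U+FE0F VARIATION SELECTOR-16 is one of the variation selector characters.
VS16 = "\uFE0F"


def describe(text: str) -> list[str]:
    """Return one 'U+XXXX NAME' string per code point in text (hypothetical helper)."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]


# A variation sequence is simply the base character followed by the selector.
sequence = BASE + VS16

for line in describe(sequence):
    print(line)
```

The sequence is two code points long; whether it renders differently from the base character alone depends on the font and the rendering system.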
Unicode defines two types of variation sequences:
- Standardized variation sequences defined in StandardizedVariants.txt
- Ideographic variation sequences defined in the Ideographic Variation Database
Variation selector characters reside in several Unicode blocks:
- Variation Selectors
- Variation Selectors Supplement
- Mongolian
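The sketch below shows one way to recognize a variation selector by its code point and to fall back to the base characters by dropping the selectors. The function names are hypothetical, and the ranges are hard-coded here as an assumption about the current code charts: U+FE00 to U+FE0F, U+E0100 to U+E01EF, and the Mongolian free variation selectors U+180B to U+180D and U+180F.

```python
from typing import Optional

# (first code point, last code point, block name) for variation selector characters.
SELECTOR_RANGES = [
    (0xFE00, 0xFE0F, "Variation Selectors"),               # VS1 to VS16
    (0xE0100, 0xE01EF, "Variation Selectors Supplement"),  # VS17 to VS256
    (0x180B, 0x180D, "Mongolian"),                         # FVS1 to FVS3
    (0x180F, 0x180F, "Mongolian"),                         # FVS4
]


def selector_block(ch: str) -> Optional[str]:
    """Return the block name if ch is a variation selector, else None."""
    cp = ord(ch)
    for first, last, block in SELECTOR_RANGES:
        if first <= cp <= last:
            return block
    return None


def strip_variation_selectors(text: str) -> str:
    """Drop every variation selector, leaving only the base characters."""
    return "".join(ch for ch in text if selector_block(ch) is None)
```

Stripping selectors in this way mirrors the fallback in which only the base character is displayed. It is a lossy operation, so it is probably better suited to comparison or search than to storage.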
For other kinds of glyph substitution, the author's intent may need to be encoded with the text, because it cannot be determined contextually. This is the case with characters or glyphs referred to as gaiji, where different glyphs are used for the same character, either historically or in ideographs for family names. This is one of the gray areas in distinguishing between a glyph and a character: if a family name differs slightly from the ideograph character it derives from, is that a simple glyph variant or a character variant?
Character substitutions may also occur outside of Unicode, for example with OpenType Layout tags.
Blocks with standardized variation sequences
Standardized variation sequences specifically for emoji/text presentation are defined for base characters in 20 blocks:
- Arrows
- Basic Latin
- CJK Symbols and Punctuation
- Dingbats
- Emoticons
- Enclosed Alphanumeric Supplement
- Enclosed Alphanumerics
- Enclosed CJK Letters and Months
- Enclosed Ideographic Supplement
- General Punctuation
- Geometric Shapes
- Latin-1 Supplement
- Letterlike Symbols
- Mahjong Tiles
- Miscellaneous Symbols
- Miscellaneous Symbols and Arrows
- Miscellaneous Symbols and Pictographs
- Miscellaneous Technical
- Supplemental Arrows-B
- Transport and Map Symbols
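Emoji/text presentation sequences use two selectors: U+FE0E (VS15) requests text presentation and U+FE0F (VS16) requests emoji presentation. The following sketch appends the appropriate selector to a base character. The function name `with_presentation` is hypothetical, and the code does not check whether a standardized sequence is actually defined for the given base character.

```python
VS15 = "\uFE0E"  # VARIATION SELECTOR-15: text presentation
VS16 = "\uFE0F"  # VARIATION SELECTOR-16: emoji presentation


def with_presentation(base: str, style: str) -> str:
    """Return base followed by the selector for 'text' or 'emoji' presentation.

    Any presentation selector already trailing the base is replaced.
    No check is made that the base has a defined presentation sequence.
    """
    if style not in ("text", "emoji"):
        raise ValueError("style must be 'text' or 'emoji'")
    bare = base.rstrip(VS15 + VS16)
    return bare + (VS15 if style == "text" else VS16)
```

For example, `with_presentation("\u2764", "emoji")` yields U+2764 followed by U+FE0F. A renderer without an emoji glyph for that character may still show the plain text form.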
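Standardized variation sequences for purposes other than emoji/text presentation are defined for base characters in the following blocks:
- CJK Unified Ideographs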
- CJK Unified Ideographs Extension A
- CJK Unified Ideographs Extension B
- Egyptian Hieroglyph Format Controls
- Egyptian Hieroglyphs
- Egyptian Hieroglyphs Extended-A
- Halfwidth and Fullwidth Forms
- Manichaean
- Mathematical Alphanumeric Symbols
- Mathematical Operators
- Miscellaneous Mathematical Symbols-B
- Mongolian
- Myanmar
- Myanmar Extended-A
- Phags-pa
- Supplemental Mathematical Operators
Blocks with ideographic variation sequences
- CJK Compatibility Ideographs
- CJK Unified Ideographs
- CJK Unified Ideographs Extension A
- CJK Unified Ideographs Extension B
- CJK Unified Ideographs Extension C
- CJK Unified Ideographs Extension D
- CJK Unified Ideographs Extension E
- CJK Unified Ideographs Extension F
- CJK Unified Ideographs Extension G
- CJK Unified Ideographs Extension H
- CJK Unified Ideographs Extension I
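Ideographic variation sequences pair a base ideograph with a selector from the Variation Selectors Supplement block (VS17 to VS256, U+E0100 to U+E01EF). The sketch below scans a string for such pairs. The function name `find_ivs_candidates` is hypothetical, and the results are only candidates: the code does not verify that the base is an ideograph from one of the blocks above or that the sequence is registered in the Ideographic Variation Database.

```python
VS17 = 0xE0100   # first selector in Variation Selectors Supplement
VS256 = 0xE01EF  # last selector in Variation Selectors Supplement


def find_ivs_candidates(text: str) -> list[tuple[str, int]]:
    """Return (base character, selector number) for each base + VS17..VS256 pair."""
    found = []
    # Python strings are sequences of code points, so supplementary-plane
    # characters count as single elements here.
    for base, selector in zip(text, text[1:]):
        cp = ord(selector)
        if VS17 <= cp <= VS256:
            found.append((base, cp - VS17 + 17))
    return found
```

For instance, the string U+845B U+E0100 would be reported as the base character U+845B with selector number 17. Confirming that a candidate is a registered sequence would require a lookup against the database's published sequence list.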