Tangut script


The Tangut script is a logographic writing system, formerly used to write the extinct Tangut language. It was widely used during the Tangut-founded Western Xia dynasty, and fell into obscurity after the dynasty's extinction. According to a 2004 count, 5,863 Tangut characters are known, excluding variants. Tangut characters are similar in appearance to Chinese characters, with the same types of strokes, but the methods of forming characters in the Tangut writing system differ significantly from those used to form Chinese characters. As with Chinese, regular, running, cursive and seal scripts were used in Tangut writing.

History

According to the History of Song, the script was designed by the high-ranking official Yeli Renrong in 1036. It was invented in a short period of time and put into use quickly. Government schools were founded to teach it, and official documents were written in it. A great number of Buddhist scriptures were translated from Tibetan and Chinese and block printed in the script. Although the dynasty collapsed in 1227, the script continued to be used for another few centuries. The last known example of the script occurs on a pair of Tangut dharani pillars found at Baoding in present-day Hebei province, which were erected in 1502.

Structure

Tangut characters can be divided into two classes: simple and composite. The latter are much more numerous. The simple characters can be either semantic or phonetic. None of the Tangut characters are pictographic, whereas Chinese characters were at the time of their creation; this is one of the major differences between Tangut and Chinese characters.
Most composite characters comprise two components. A few comprise three or four. A component can be a simple character, or part of a composite character. The composite characters include semantic-semantic ones and semantic-phonetic ones. A few special composite characters were made for transliterating Chinese and Sanskrit.
A number of pairs of special composite characters are worth noting. The members of such a pair have the same components, differing only in where those components are placed, and they have very similar meanings.
The Sea of Characters, a 12th-century monolingual Tangut rhyming dictionary, analyzes which other characters each character is derived from. Its analyses illustrate another difference between Tangut and Chinese characters. In Chinese, typically, each semantic component has its own meaning, and each phonetic component its own sound; they contribute this meaning or sound to any complex character they appear in. By contrast, in the Sea of Characters analysis of Tangut, a component contributes the meaning or sound of some other character that contains it, potentially a different one in every appearance. For example, a single component can carry the meaning of "bird", as in *dze "wild goose" = *dźjwow "bird" + *dze "longevity". But the same component is also used to convey meanings of bone, smoke, food, and time, among others.
Some components take different shapes depending on what part of the character they appear in.

Unicode

Unicode version 9.0, released in June 2016, included 6,125 characters of the Tangut script in the Tangut block. 755 radicals and components used in the modern study of Tangut were added to the Tangut Components block. An iteration mark, U+16FE0 TANGUT ITERATION MARK, was included in the Ideographic Symbols and Punctuation block. Five additional characters were added in June 2018 with the release of Unicode version 11.0. Six additional characters were added in March 2019 with the release of Unicode version 12.0. A further nine Tangut ideographs were added to the Tangut Supplement block and 13 Tangut components were added to the Tangut Components block in March 2020 with the release of Unicode version 13.0. The Tangut Supplement block size was changed in Unicode version 14.0 to correct the erroneous block end point. Additional components were added with the Tangut Components Supplement block in September 2025 with the release of Unicode version 17.0.
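
As an illustration of how the blocks named above might be used in practice, the following Python sketch classifies a character by block. The code point ranges are assumptions taken from the Unicode block list rather than from this article, and the function names are hypothetical; the ranges should be checked against the Blocks.txt file of the Unicode version in use.

```python
# Block ranges are assumptions based on the Unicode block list
# (post-14.0 end point for Tangut Supplement); verify against Blocks.txt.
TANGUT_BLOCKS = [
    (0x17000, 0x187FF, "Tangut"),
    (0x18800, 0x18AFF, "Tangut Components"),
    (0x18D00, 0x18D7F, "Tangut Supplement"),
    (0x18D80, 0x18DFF, "Tangut Components Supplement"),
]

# The iteration mark lives outside the Tangut blocks proper.
TANGUT_ITERATION_MARK = 0x16FE0


def tangut_block(ch):
    """Return the name of the Tangut-related block containing ch, or None.

    A range check only says the code point falls inside a block; it does
    not say that a character is actually assigned at that code point.
    """
    cp = ord(ch)
    if cp == TANGUT_ITERATION_MARK:
        return "Ideographic Symbols and Punctuation"
    for start, end, name in TANGUT_BLOCKS:
        if start <= cp <= end:
            return name
    return None


def is_tangut(ch):
    """True if ch falls in any of the ranges listed above."""
    return tangut_block(ch) is not None
```

A range table is used here instead of a lookup by character name because the names known to a given runtime depend on the Unicode version it was built with, whereas block boundaries can be listed explicitly.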