Written Chinese
Written Chinese is a writing system that transcribes the varieties of the Chinese language using logograms, known as characters, and other symbols such as punctuation marks. Chinese characters do not directly represent pronunciation, unlike letters in an alphabet or syllabograms in a syllabary. Rather, the writing system is morphosyllabic: characters are one spoken syllable in length, but generally correspond to morphemes in the language, which may either be independent words or parts of a polysyllabic word. Most characters are constructed from smaller components known as radicals or pianpang that may reflect the character's meaning or pronunciation. Literacy requires the memorization of thousands of characters; college-educated Chinese speakers know approximately 4,000 characters. This has led in part to the modern adoption of complementary phonetic transcription systems such as Pinyin and Bopomofo to transliterate the pronunciation of each character.
Chinese writing is first attested during the late Shang dynasty in the form of oracle bone script, but the process of creating character-like symbols is thought to have begun centuries earlier during the Late Neolithic and early Bronze Age. During the Zhou dynasty, Chinese characters evolved into the more mature bronze script and seal script, which were standardized under the short-lived Qin dynasty and further consolidated into the more convenient clerical script during the subsequent Han dynasty. Over the following millennia, these characters have evolved into well-developed styles of Chinese calligraphy, from the formal regular script to the more casual running script and cursive script. As the different Sinitic varieties diverged, a situation of diglossia developed, with speakers of otherwise mutually unintelligible varieties able to communicate through writing using Literary Chinese. In the early 20th century, Literary Chinese was replaced in large part with written vernacular Chinese, largely corresponding to the grammar of modern Standard Chinese, a standard form based on the Beijing dialect of Mandarin. Although most other Chinese varieties are not written in their own vernacular styles, there are local traditions of written Cantonese, written Shanghainese and written Hokkien, among others.
Structure
Written Chinese is not based on an alphabet or syllabary. Most characters can be analyzed as compounds of smaller components, which may be assembled according to several different principles. Characters and components may reflect aspects of meaning or pronunciation. The best known exposition of Chinese character composition is the Shuowen Jiezi, compiled by Xu Shen. Xu did not have access to the earliest forms of Chinese characters, and his analysis is not considered to fully capture the nature of the writing system. Nevertheless, no later work has supplanted the Shuowen Jiezi in terms of breadth, and it is still relevant to etymological research today.
Derivation of characters
According to the Shuowen Jiezi, Chinese characters are developed on six basic principles, of which the first two produce simple characters:
- Pictographs: in which the character is a graphical depiction of the object it denotes.
- Indicatives: in which the character represents an abstract notion.
- Ideographic compounds: in which two or more parts are used for their meaning. This yields a composite meaning, which is then applied to the new character.
- Phono-semantic compounds: in which one part (often called the radical) indicates the general semantic category of the character, such as being related to water or eyes, with the other part being another character used for its phonetic value.
- Transference: in which a character, often with a simple, concrete meaning, takes on an extended, more abstract meaning.
- Loangraphs: in which a character is used, either intentionally or accidentally, for some entirely different purpose.
Chinese characters are written to fit into a square, even when composed of two simpler forms written side-by-side or top-to-bottom. In such cases, each form is compressed to fit the entire character into a square.
Strokes
Character components can be further subdivided into individual written strokes. The strokes of Chinese characters fall into eight main categories: "horizontal", "vertical", "left-falling", "right-falling", "rising", "dot", "hook", and "turning".
There are eight basic rules of stroke order in writing a Chinese character, which apply only generally and are sometimes violated:
- Horizontal strokes are written before vertical ones.
- Left-falling strokes are written before right-falling ones.
- Characters are written from top to bottom.
- Characters are written from left to right.
- If a character is framed from above, the frame is written first.
- If a character is framed from below, the frame is written last.
- Frames are closed last.
- In a symmetrical character, the middle is drawn first, then the sides.
Layout
As characters are essentially rectilinear and are not joined with one another, written Chinese does not require a set orientation. Chinese texts were traditionally written in columns from top to bottom, which were laid out from right to left. Prior to the 20th century, Literary Chinese used little to no punctuation, with the breaks between sentences and phrases determined largely by context and the rhythms implied by patterns of syllables.
In the 20th century, the layout used in Western scripts, where text is written in rows from left to right, which are laid out from top to bottom, became predominant in mainland China, where it was mandated by the Chinese government in 1955. Vertical layouts are still used for aesthetic effect, or when space limitations require it, such as on signage or book spines. The government of Taiwan followed suit in 2004 for official documents, but vertical layouts have persisted in some books and newspapers.
Less frequently, Chinese is written in rows from right to left, usually on signage or banners, though a left to right orientation remains more common.
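The traditional layout described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name `to_vertical`, the fixed column height, and the use of the ideographic space (U+3000) as padding are choices made here, not part of any typesetting standard.

```python
def to_vertical(text: str, height: int) -> list[str]:
    """Arrange text in columns read top to bottom, with columns running right to left.

    Returns the rows to print, top row first. Short final columns are padded
    with the ideographic space U+3000 so that the grid stays aligned.
    """
    if height <= 0:
        raise ValueError("height must be positive")
    # Cut the text into columns of `height` characters, in reading order.
    columns = [text[i:i + height] for i in range(0, len(text), height)]
    columns = [col.ljust(height, "\u3000") for col in columns]
    # The first column in reading order sits at the far right of the page,
    # so each printed row takes its characters from the columns in reverse.
    return ["".join(col[row] for col in reversed(columns)) for row in range(height)]


if __name__ == "__main__":
    # Sample text chosen for illustration only.
    for line in to_vertical("學而時習之不亦說乎", 3):
        print(line)
```

Reading the printed grid down the rightmost column first, then moving left, recovers the original character sequence.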
The use of punctuation has also become more common. In general, punctuation occupies the width of a full character, such that text remains visually well-aligned in a grid. Punctuation used in simplified Chinese shows clear influence from that used in Western scripts, though some marks are particular to Asian languages. For example, there are double and single quotation marks, and a hollow full stop, which is used to separate sentences in an identical manner to a Western full stop. A special mark called an enumeration comma is used to separate items in a list, as opposed to the clauses in a sentence.
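As a rough illustration of the full-width convention, Python's standard `unicodedata` module can report the East Asian Width property that Unicode assigns to each mark. The helper name `is_full_width` is an assumption made for this sketch, and the width property is only a proxy for how a given font actually draws the mark.

```python
import unicodedata


def is_full_width(mark: str) -> bool:
    """Return True if Unicode classifies the character as wide (W) or fullwidth (F)."""
    return unicodedata.east_asian_width(mark) in ("W", "F")


if __name__ == "__main__":
    # The hollow full stop, the enumeration comma, and a Western full stop for contrast.
    for mark in ("\u3002", "\u3001", "."):
        print(unicodedata.name(mark), unicodedata.east_asian_width(mark), is_full_width(mark))
```

In this sketch the hollow full stop (U+3002) and the enumeration comma (U+3001) are reported as wide, while the Western full stop is reported as narrow.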
History
Written Chinese is one of the oldest continuously used writing systems. The earliest examples universally accepted as Chinese writing are the oracle bone inscriptions made during the reign of the Shang king Wu Ding. These inscriptions were made primarily on ox scapulae and turtle shells in order to record the results of divinations conducted by the Shang royal family. Characters posing a question were first carved into the bones. The question's answer was then divined by heating the bones over a fire and interpreting the resulting cracks that formed. The interpretation was then carved into the same oracle bone.
In 2003, 11 isolated symbols carved on tortoise shells were found at the Jiahu archaeological site in Henan, with some bearing a striking resemblance to certain modern characters. The Jiahu site predates the earliest attested Chinese writing by more than 5,000 years. Garman Harbottle, who had headed a team of archaeologists at the University of Science and Technology of China in Anhui, has suggested that these symbols were precursors to Chinese writing. However, the palaeographer David Keightley argues instead that the time gap is too great to establish any connection.
From the Late Shang period, Chinese writing evolved into the form found in cast inscriptions on ritual bronzes made during the Western Zhou dynasty and the Spring and Autumn period, a form of writing called bronze script. Bronze script characters are less angular than their oracle bone script counterparts. The script became increasingly regularized during the Warring States period, settling into the form that Xu Shen used as source material in the Shuowen Jiezi. These characters were later embellished and stylized to yield the seal script, which represents the oldest form of Chinese characters still in modern use. They are used principally for signature seals, or chops, which are often used in place of a signature for Chinese documents and artwork. Li Si promulgated the seal script as the standard throughout China, which had been recently united under the imperial Qin dynasty.
The initial adaptation of seal script into clerical script can be attributed to scribes in the state of Qin working prior to the wars of unification. Clerical script forms generally have a "flat" appearance, being wider than their seal script equivalents. In the semi-cursive script that evolved from clerical script, character elements begin to run into each other, though the characters themselves generally remain discrete. This is contrasted with fully cursive script, where characters are often so abbreviated that they cannot be recognized from their canonical forms. Regular script is the most widely recognized script, and was considerably influenced by semi-cursive. In regular script, each stroke of each character is clearly drawn out from the others.
Regular script is considered the archetypal Chinese writing and forms the basis for most printed forms. In addition, regular script imposes a stroke order, which must be followed in order for the characters to be written correctly. Strictly speaking, this stroke order applies to the clerical, running, and grass scripts as well, but especially in the running and grass scripts, this order is occasionally deviated from. Thus, for instance, the character 木 ("tree") must be written starting with the horizontal stroke, drawn from left to right; next, the vertical stroke, from top to bottom; next, the left diagonal stroke, from top to bottom; and lastly the right diagonal stroke, from top to bottom.
Simplification and standardization
Beginning in the mid-20th century, Chinese has primarily been written using either simplified or traditional character forms. Simplified characters, which merge some character forms and reduce the average stroke count per character, were developed by the Chinese government with the stated goal of increasing literacy among the population. During this time, literacy rates did increase rapidly, but some observers instead attribute this to other education reforms and a general increase in the standard of living. Little systematic research has been conducted to support the conclusion that the use of simplified characters has affected literacy rates; studies conducted in China have instead focused on arbitrary statistics, such as quantifying the number of strokes saved on average in a given text sample. Simplified characters are standard in mainland China, Singapore and Malaysia, while traditional characters are standard in Hong Kong, Macau, Taiwan and some overseas Chinese communities.
Simplified forms have also been characterized as being inconsistent. For instance, the traditional 讓 is simplified to 让, in which the phonetic on the right side is reduced from 17 strokes to 3, with the radical on the left also being simplified. However, the same phonetic component is not reduced in some other simplified characters; these characters are relatively uncommon, and simplifying them would therefore represent a negligible stroke reduction. Other simplified forms derive from long-standing calligraphic abbreviations.
Function
Chinese characters have always been used to represent individual spoken syllables. While writing was being invented in the Yellow River valley, words in spoken Chinese were largely monosyllabic, and each written character corresponded to a monosyllabic word. Spoken Chinese varieties have since acquired much more polysyllabic vocabulary, usually compound words composed of morphemes corresponding to older monosyllabic words.
For over two thousand years, the predominant form of written Chinese was Literary Chinese, which had vocabulary and syntax rooted in the language of the Chinese classics, as spoken around the time of Confucius. Over time, Literary Chinese acquired some elements of grammar and vocabulary from various varieties of vernacular Chinese that had since diverged. By the 20th century, Literary Chinese was distinctly different from any spoken vernacular, and had to be learned separately. Once learned, it was a common medium for communication between people speaking different dialects, many of which were mutually unintelligible by the end of the first millennium CE.
Figure: A 12th-century Song dynasty redaction of the Shuowen Jiezi
Varieties of Chinese vary in pronunciation, and to a lesser extent in vocabulary and grammar. Modern written Chinese, which replaced Classical Chinese as the written standard as an indirect result of the 1919 May Fourth Movement, is not technically bound to any single variety; however, it most nearly represents the vocabulary and syntax of Mandarin, by far the most widespread Chinese dialectal family in terms of both geographical area and number of speakers. This form is known as written vernacular Chinese. While some written vernacular Chinese expressions are often ungrammatical or unidiomatic outside of Mandarin, its use permits some communication between speakers of different dialects. This function may be considered analogous to that of linguae francae, such as Latin.
For literate speakers, it serves as a common medium; however, the forms of individual characters generally provide little insight to their meaning if not already known. Ghil'ad Zuckermann's exploration of phono-semantic matching in Standard Chinese concludes that the Chinese writing system is multifunctional, conveying both semantic and phonetic content.
The variation in vocabulary among varieties has also led to informal use of "dialectal characters", which may include characters previously used in Literary Chinese that are considered archaic in written Standard Chinese. Cantonese is unique among non-Mandarin regional languages in having a written colloquial standard, used in Hong Kong and overseas, with a large number of unofficial characters for words particular to this language. Written Cantonese has become quite popular on the Internet, while Standard Chinese is still normally used in formal written communications. To a lesser degree, Hokkien is used similarly in Taiwan and elsewhere, though it lacks the level of standardization seen in Cantonese. However, Taiwan's Ministry of Education has promulgated a standard character set for Hokkien, which is taught in schools and encouraged for use by the general population.
Media
Over the history of written Chinese, a variety of media have been used for writing. They include:
- Bamboo and wooden slips, from at least the 13th century BCE
- Paper, invented no later than the 2nd century BCE
- Silk, since at least the Han dynasty
- Stone, metal, wood, bamboo, plastic and ivory on seals.
Literacy
Because the majority of modern Chinese words contain more than one character, there are at least two measuring sticks for Chinese literacy: the number of characters known, and the number of words known. John DeFrancis, in the introduction to his Advanced Chinese Reader, estimates that a typical Chinese college graduate recognizes 4,000 to 5,000 characters, and 40,000 to 60,000 words. Jerry Norman, in Chinese, places the number of characters somewhat lower, at 3,000 to 4,000. These counts are complicated by the tangled development of Chinese characters. In many cases, a single character came to have multiple variants. This development was restrained to an extent by the standardization of the seal script during the Qin dynasty, but soon started again. Although the Shuowen Jiezi lists 10,516 characters (9,353 of them unique plus 1,163 graphic variants), the Jiyun of the Northern Song dynasty, compiled less than a thousand years later in 1039, contains 53,525 characters, most of them graphic variants.
Dictionaries
Written Chinese is not based on an alphabet or syllabary, so Chinese dictionaries, as well as dictionaries that define Chinese characters in other languages, cannot easily be alphabetized or otherwise lexically ordered, as English dictionaries are. The need to arrange Chinese characters in order to permit efficient lookup has given rise to a considerable variety of ways to organize and index the characters.
A traditional mechanism is the method of radicals, which uses a set of character roots. These roots, or radicals, generally but imperfectly align with the parts used to compose characters by means of logical aggregation and phonetic complex. A canonical set of 214 radicals was developed during the rule of the Kangxi Emperor; these are sometimes called the Kangxi radicals. The radicals are ordered first by stroke count; within a given stroke count, the radicals also have a prescribed order.
Every Chinese character falls under the heading of exactly one of these 214 radicals. In many cases, the radicals are themselves characters, which naturally come first under their own heading. All other characters under a given radical are ordered by the stroke count of the character. Usually, however, there are still many characters with a given stroke count under a given radical. At this point, characters are not given in any recognizable order; the user must locate the character by going through all the characters with that stroke count, typically listed for convenience at the top of the page on which they occur.
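The ordering described above amounts to a two-part sort key, sketched below in Python. The `Entry` type, the function name `radical_order`, and the six sample entries are assumptions made for this illustration; the radical numbers and stroke counts were entered by hand and are not drawn from any particular dictionary.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    char: str
    radical_number: int  # position of the radical in the 214-radical list
    total_strokes: int   # stroke count of the whole character


# Hand-entered sample data: radical 75 (木) and radical 85 (水).
ENTRIES = [
    Entry("海", 85, 10),
    Entry("林", 75, 8),
    Entry("水", 85, 4),
    Entry("森", 75, 12),
    Entry("木", 75, 4),
    Entry("河", 85, 8),
]


def radical_order(entries: list[Entry]) -> list[Entry]:
    """Sort by radical first, then by stroke count within each radical.

    Python's sort is stable, so entries that tie on both keys keep their
    input order, much as a printed dictionary lists them in no fixed order.
    """
    return sorted(entries, key=lambda e: (e.radical_number, e.total_strokes))


if __name__ == "__main__":
    print("".join(e.char for e in radical_order(ENTRIES)))
```

Because a radical that is itself a character has the fewest strokes under its own heading, it sorts to the front of that heading without any special handling.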
Because the method of radicals is applied only to the written character, one need not know how to pronounce a character before looking it up; the entry, once located, usually gives the pronunciation. However, it is not always easy to identify which of the various roots of a character is the proper radical. Accordingly, dictionaries often include a list of hard to locate characters, indexed by total stroke count, near the beginning of the dictionary. Some dictionaries include almost one-seventh of all characters in this list. Alternatively, some dictionaries list "difficult" characters under more than one radical, with all but one of those entries redirecting the reader to the "canonical" location of the character according to Kangxi.
Other methods of organization exist, often in an attempt to address the shortcomings of the radical method, but are less common. For instance, it is common for a dictionary ordered principally by the Kangxi radicals to have an auxiliary index by pronunciation, expressed typically in either pinyin or bopomofo. This index points to the page in the main dictionary where the desired character can be found. Other methods use only the structure of the characters, such as the four-corner method, in which characters are indexed according to the kinds of strokes located nearest the four corners, or the Cangjie method, in which characters are broken down into a set of 24 basic components. Neither the four-corner method nor the Cangjie method requires the user to identify the proper radical, although many strokes or components have alternate forms, which must be memorized in order to use these methods effectively.
The availability of computerized Chinese dictionaries now makes it possible to look characters up by any of the indexing schemes described, thereby shortening the search process.
Transliteration
Chinese characters do not reliably indicate their pronunciation. Therefore, many transliteration systems have been developed to write the sounds of different varieties of Chinese. While many use the Latin alphabet, systems using the Cyrillic and Perso-Arabic alphabets have also been designed. Among other purposes, these systems are used by students learning the corresponding varieties. The replacement of Chinese characters with a phonetic writing system was first prominently proposed during the May Fourth Movement, partly motivated by a desire to increase the country's literacy rate. The idea gained further support following the victory of the Communists in 1949, who immediately began two parallel programs regarding written Chinese. The first was the development of an alphabet to write the sounds of Mandarin, the variety spoken by around two-thirds of the Chinese population. The other program investigated the simplification of the standard character forms. Initially, character simplification was not competing with the idea of a phonetic script; rather, simplification was intended to make the transition to a fully phonetic writing system easier.
By 1958, official priorities had shifted towards character simplification. The Hanyu Pinyin alphabet had been developed, but plans to replace Chinese characters with it were deferred, and the idea is no longer actively pursued. This change in priorities may have been due in part to pinyin's design being specific to Mandarin, to the exclusion of other dialects.
Pinyin uses the Latin alphabet with diacritics to represent the phonology of Standard Chinese. For the most part, pinyin uses phonetic values for letters that reflect their existing pronunciations in Romance languages and the International Phonetic Alphabet. However, pairs of letters such as b and p that correspond to a voicing distinction in languages such as French instead represent the aspiration distinction that is more abundant in Mandarin. Pinyin also uses several consonantal letters to represent markedly different sounds from their assignments in other languages. For example, pinyin q and x correspond to sounds similar to English ch and sh, respectively. While pinyin has become the predominant transliteration system for Mandarin, others include bopomofo, Wade–Giles, Yale, EFEO and Gwoyeu Romatzyh.