Chinese language

Chinese is an umbrella term for all Sinitic languages, widely recognized as a collection of language varieties, spoken natively by the ethnic Han Chinese majority and many minority ethnic groups in Greater China, as well as by various communities of the Chinese diaspora. Approximately 1.39 billion people, or 17% of the global population, speak one of the varieties of Chinese as their first language.
The different Chinese language varieties together form the largest branch of the Sino-Tibetan languages. While the Chinese government defines all spoken Chinese varieties as merely diverse dialects of a single language, the frequent lack of mutual intelligibility, especially among those outside of the dominant northern varieties, has led linguists to consider them separate languages within a language family. Investigation of the historical relationships among the varieties of Chinese is ongoing. Currently, most classifications posit 7 to 13 main regional groups based on phonetic developments from Middle Chinese, of which the most spoken by far is Mandarin with 66%, or around 800 million speakers, followed by Min, Wu and Yue. These groups are unintelligible to each other, and many of their subgroups are unintelligible to other subgroups within the same branch. There are, however, transitional areas where varieties from different branches share enough features for some limited intelligibility, including New Xiang with Southwestern Mandarin, Xuanzhou Wu Chinese with Lower Yangtze Mandarin, Jin with Central Plains Mandarin and certain divergent dialects of Hakka with Gan. All varieties of Chinese are tonal at least to some degree, and are largely analytic.
The Chinese language is transcribed via a writing system consisting of logographic characters, historically in the grammatical form of Literary Chinese. The earliest attested written Chinese consists of the oracle bone inscriptions created during the Shang dynasty. The phonetic categories of Old Chinese can be reconstructed from the rhymes of ancient Chinese poetry. During the Northern and Southern dynasties, Middle Chinese went through several sound changes and split into several varieties following prolonged geographic and political separation. The Qieyun, a rhyme dictionary, recorded a compromise between the pronunciations of different regions. The royal courts of the Ming and early Qing dynasties operated using a koiné language known as Guanhua, based on the Nanjing dialect of Mandarin.
Standard Chinese, a standard language based on the Beijing dialect and first officially adopted in the 1930s, is the current official language of both the People's Republic of China and the Republic of China (Taiwan), one of the four official languages of Singapore, and one of the six official languages of the United Nations. It is written primarily using modern written vernacular Chinese, the literacy of which is shared by educated readers who may otherwise speak mutually unintelligible varieties. Since the 1950s, the use of simplified Chinese characters has been promoted by the government of the People's Republic of China, with the government of Singapore officially adopting them in 1976. Traditional Chinese characters are still used in Taiwan, Hong Kong, Macau, areas of Malaysia with significant ethnic Chinese populations and among other overseas Chinese communities. Some ethnic minorities in Central Asia and the Russian Far East also speak varieties of Chinese but write in Cyrillic-based scripts.

Classification

Linguists classify all varieties of Chinese as part of the Sino-Tibetan language family, together with Burmese, Tibetan and many other languages spoken in the Himalayas and the Southeast Asian Massif. Although the relationship was first proposed in the early 19th century and is now broadly accepted, reconstruction of Sino-Tibetan is much less developed than that of families such as Indo-European or Austroasiatic. Difficulties have included the great diversity of the languages, the lack of inflection in many of them, and the effects of language contact. In addition, many of the smaller languages are spoken in mountainous areas that are difficult to reach and are often also sensitive border zones. Without a secure reconstruction of Proto-Sino-Tibetan, the higher-level structure of the family remains unclear. A top-level branching into Chinese and Tibeto-Burman languages is often assumed, but has not been convincingly demonstrated.

History

The first written records appeared over 3,000 years ago during the Shang dynasty. As the language evolved over this period, the various local varieties became mutually unintelligible. In reaction, central governments have repeatedly sought to promulgate a unified standard.

Old and Middle Chinese

The earliest examples of Old Chinese are divinatory inscriptions on oracle bones dated to the Late Shang. The next attested stage came from inscriptions on bronze artifacts dating to the Western Zhou period, the Classic of Poetry and portions of the Book of Documents and I Ching. Scholars have attempted to reconstruct the phonology of Old Chinese by comparing later varieties of Chinese with the rhyming practice of the Classic of Poetry and the phonetic elements found in the majority of Chinese characters. Although many of the finer details remain unclear, most scholars agree that Old Chinese differed from Middle Chinese in lacking retroflex and palatal obstruents but having initial consonant clusters of some sort, and in having voiceless nasals and liquids. Most recent reconstructions also describe an atonal language with consonant clusters at the end of the syllable, which developed into tone distinctions in Middle Chinese. Several derivational affixes have also been identified, but the language lacked inflection and indicated grammatical relationships using word order and grammatical particles.
Middle Chinese was the language used during the Northern and Southern dynasties and the Sui, Tang, and Song dynasties. It can be divided into an early period, reflected by the Qieyun rhyme dictionary, and a late period in the 10th century, reflected by rhyme tables constructed by ancient Chinese philologists as a guide to the Qieyun system. These works define phonological categories but give little hint of what sounds they represent. Linguists have identified these sounds by comparing the categories with pronunciations in modern varieties of Chinese, borrowed Chinese words in Vietnamese, Korean, and Japanese, and transcription evidence. The resulting system is very complex, with a large number of consonants and vowels, but they were probably not all distinguished in any single dialect. Most linguists now believe it represents a diasystem encompassing 6th-century northern and southern standards for reading the classics.

Classical and vernacular written forms

The complex relationship between spoken and written Chinese is an example of diglossia: as spoken, Chinese varieties have evolved at different rates, while the written language used throughout China changed comparatively little, crystallizing into a prestige form known as Classical or Literary Chinese. Literature written distinctly in the Classical form began to emerge during the Spring and Autumn period. Its use in writing remained nearly universal until the late 19th century; its decline culminated in the widespread adoption of written vernacular Chinese with the May Fourth Movement beginning in 1919.

Rise of northern dialects

After the fall of the Northern Song dynasty and subsequent reign of the Jurchen Jin and Mongol Yuan dynasties in northern China, a common speech developed based on the dialects of the North China Plain around the capital.
The 1324 Zhongyuan Yinyun was a dictionary that codified the rhyming conventions of the new sanqu verse form in this language.
Together with the slightly later Menggu Ziyun, this dictionary describes a language with many of the features characteristic of modern Mandarin dialects.
Until the early 20th century, most Chinese people spoke only their local language. Thus, as a practical measure, officials of the Ming and Qing dynasties carried out the administration of the empire using a common language based on Mandarin varieties, known as Guanhua. For most of this period, this language was a koiné based on dialects spoken in the Nanjing area, though not identical to any single dialect. By the middle of the 19th century, the Beijing dialect had become dominant and was essential for any business with the imperial court.
In the 1930s, a standard national language was adopted. After much dispute between proponents of northern and southern languages and an abortive attempt at an artificial pronunciation, the National Language Unification Commission finally settled on the Beijing dialect in 1932. The People's Republic founded in 1949 retained this standard under a new name. The national language is now used in education, the media, and formal situations in both mainland China and Taiwan.
In Hong Kong and Macau, Cantonese is the dominant spoken language owing to cultural influence from Guangdong immigrants and colonial-era policies, and is used in education, media, formal speech, and everyday life, though Mandarin is increasingly taught in schools as the mainland's influence grows.

Influence

Historically, the Chinese languages spread to neighbors through a variety of means. Northern Vietnam was incorporated into the Han dynasty in 111 BCE, marking the beginning of a period of Chinese control that ran almost continuously for a millennium. The Four Commanderies of Han were established in northern Korea in the 1st century BCE but disintegrated in the following centuries. Chinese Buddhism spread over East Asia between the 2nd and 5th centuries CE, and with it the study of scriptures and literature in Literary Chinese. Later, strong central governments modeled on Chinese institutions were established in Korea, Japan, and Vietnam, with Literary Chinese serving as the language of administration and scholarship, a position it would retain until the late 19th century in Korea and Japan, and the early 20th century in Vietnam. Scholars from different lands could communicate, albeit only in writing, using Literary Chinese.
Although they used Chinese solely for written communication, each country had its own tradition of reading texts aloud using what are known as Sino-Xenic pronunciations. Chinese words with these pronunciations were also extensively imported into the Korean, Japanese, and Vietnamese languages. Today, Sino-Korean, Sino-Japanese, and Sino-Vietnamese vocabularies comprise over half of their respective lexicons. This massive influx led to changes in the phonological structure of the languages, contributing to the development of moraic structure in Japanese and the disruption of vowel harmony in Korean.
Borrowed Chinese morphemes have been used extensively in all these languages to coin compound words for new concepts, in a similar way to the use of Latin and Ancient Greek roots in European languages. Many new compounds, or new meanings for old phrases, were created in the late 19th and early 20th centuries to name Western concepts and artifacts. These coinages, written in shared Chinese characters, have then been borrowed freely between languages. They have even been accepted into Chinese, a language usually resistant to loanwords, because their foreign origin was hidden by their written form. Often different compounds for the same concept were in circulation for some time before a winner emerged, and sometimes the final choice differed between countries. The proportion of vocabulary of Chinese origin thus tends to be greater in technical, abstract, or formal language. For example, in Japan, Sino-Japanese words account for about 35% of the words in entertainment magazines, over half the words in newspapers, and 60% of the words in science magazines.
Vietnam, Korea, and Japan each developed writing systems for their own languages, initially based on Chinese characters, but later replaced with the hangul alphabet for Korean and supplemented with kana syllabaries for Japanese, while Vietnamese continued to be written with the complex chữ Nôm script. However, these were limited to popular literature until the late 19th century. Today Japanese is written with a composite script using both Chinese characters, called kanji, and kana. Korean is written exclusively with hangul in North Korea, although knowledge of the supplementary Chinese characters called hanja is still required, and hanja are used less and less in South Korea. As a result of its historical colonization by France, Vietnamese now uses the Latin-based Vietnamese alphabet.
English words of Chinese origin include tea from Hokkien, and dim sum and kumquat from Cantonese.