Standard Chinese phonology


The phonology of Standard Chinese is historically derived from the Beijing dialect of Mandarin. However, pronunciation varies widely among speakers, who may introduce elements of their local varieties. Television and radio announcers are chosen for their ability to affect a standard accent. The sound system has not only segments (vowels and consonants) but also tones, and each syllable carries one. In addition to the four main tones, there is a neutral tone that appears on weak syllables.
This article uses the International Phonetic Alphabet to compare the phonetic values corresponding to syllables romanized with pinyin.

Consonants

The sounds shown in parentheses are sometimes not analyzed as separate phonemes; for more on these, see below. Excluding these, and excluding the glides [j], [ɥ], and [w], there are 19 consonant phonemes in the inventory.
Between pairs of plosives or affricates having the same place and manner of articulation, the primary distinction is not voiced vs. voiceless, but unaspirated vs. aspirated. The unaspirated plosives and affricates may, however, become voiced in weak syllables. In pinyin, an unaspirated/aspirated pair such as /p/ and /pʰ/ is represented with b and p respectively.
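The pinyin convention described above can be sketched as a small lookup table. This is an illustrative sketch, not part of any standard library: the names ASPIRATION_PAIRS, ipa_for_initial, and is_aspirated are hypothetical, and the IPA values are broad transcriptions.

```python
# Each entry: (pinyin unaspirated, pinyin aspirated, IPA unaspirated, IPA aspirated).
# Broad transcriptions; the table name and layout are assumptions for illustration.
ASPIRATION_PAIRS = [
    ("b", "p", "p", "pʰ"),
    ("d", "t", "t", "tʰ"),
    ("g", "k", "k", "kʰ"),
    ("z", "c", "ts", "tsʰ"),
    ("zh", "ch", "ʈʂ", "ʈʂʰ"),
    ("j", "q", "tɕ", "tɕʰ"),
]

def ipa_for_initial(letter):
    """Return the broad IPA value of a pinyin plosive or affricate initial."""
    for plain, aspirated, ipa_plain, ipa_aspirated in ASPIRATION_PAIRS:
        if letter == plain:
            return ipa_plain
        if letter == aspirated:
            return ipa_aspirated
    return None  # not one of the paired initials

def is_aspirated(letter):
    """True if the pinyin letter spells the aspirated member of a pair."""
    return any(letter == aspirated for _, aspirated, _, _ in ASPIRATION_PAIRS)
```

Note that the pinyin letters b, d, and g stand for voiceless unaspirated sounds here, not for voiced ones as in English spelling.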
More details about the individual consonant sounds are given in the following table.
All of the consonants may occur as the initial sound of a syllable, with the exception of /ŋ/. Excepting the rhotic coda, the only consonants that can appear in syllable coda position are /n/ and /ŋ/. Final /n/ and /ŋ/ may be pronounced without complete oral closure, resulting in a syllable that in fact ends with a long nasalized vowel. See also below.

Denti-alveolar and retroflex series

The consonants listed in the first table above as denti-alveolar are sometimes described as alveolars, and sometimes as dentals. The affricates /ts/ and /tsʰ/ and the fricative /s/ are particularly often described as dentals; these are generally pronounced with the tongue on the lower teeth.
The retroflex consonants are actually apical rather than subapical, and so are considered by some authors not to be truly retroflex; they may be more accurately called post-alveolar. Some speakers not from Beijing may lack the retroflexes in their native dialects, and may thus replace them with dentals.

Alveolo-palatal series

The alveolo-palatal consonants have standard pronunciations of [tɕ, tɕʰ, ɕ]. Some speakers realize them as palatalized dentals [tsʲ], [tsʰʲ], [sʲ]; this is claimed to be especially common among children and women, although officially it is regarded as substandard and as a feature specific to the Beijing dialect.
In phonological analysis, it is often assumed that, when not followed by one of the high front vowels [i] or [y], the alveolo-palatals consist of a consonant followed by a palatal glide. That is, syllables represented in pinyin as beginning ji-, qi-, xi-, ju-, qu-, xu- (followed by a vowel) are taken to begin /tɕj/, /tɕʰj/, /ɕj/, /tɕɥ/, /tɕʰɥ/, /ɕɥ/. The actual pronunciations are more like [tɕ], [tɕʰ], [ɕ], [tɕʷ], [tɕʰʷ], [ɕʷ]. This is consistent with the general observation that medial glides are realized as palatalization and/or labialization of the preceding consonant.
On the above analysis, the alveolo-palatals are in complementary distribution with the dentals, with the velars, and with the retroflexes: none of these three series can occur before high front vowels or palatal glides, whereas the alveolo-palatals occur only in that environment. Therefore, linguists often prefer to classify the alveolo-palatals not as independent phonemes, but as allophones of one of the other three series. The existence of the above-mentioned dental variants inclines some to identify the alveolo-palatals with the dentals, but identification with any of the three series is possible. The Yale and Wade–Giles systems mostly treat the alveolo-palatals as allophones of the retroflexes; Tongyong Pinyin mostly treats them as allophones of the dentals; and Mainland Chinese Braille treats them as allophones of the velars. In pinyin and bopomofo, however, they are represented as a separate series.
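The distributional claim can be checked mechanically over pinyin spellings. The following is a minimal sketch under stated assumptions: syllables are given as toneless (initial, final) pairs, the function names are hypothetical, and two pinyin spelling conventions are built in (ü is written u after j, q, x, and a bare final i after dental or retroflex sibilants spells the syllabic-consonant nucleus rather than the vowel [i]).

```python
ALVEOLO_PALATALS = {"j", "q", "x"}
DENTALS = {"z", "c", "s"}
RETROFLEXES = {"zh", "ch", "sh", "r"}
VELARS = {"g", "k", "h"}

def begins_high_front(initial, final):
    """True if the final starts with a high front vowel or a palatal glide."""
    if final.startswith("ü"):
        return True
    if initial in ALVEOLO_PALATALS and final.startswith("u"):
        return True  # pinyin writes ü as plain u after j, q, x
    if final.startswith("i"):
        # Bare "i" after z, c, s, zh, ch, sh, r is not the vowel [i].
        return not (final == "i" and initial in DENTALS | RETROFLEXES)
    return False

def respects_distribution(initial, final):
    """Check one initial+final pair against the complementary distribution."""
    high_front = begins_high_front(initial, final)
    if initial in ALVEOLO_PALATALS:
        return high_front          # j, q, x need a high front environment
    if initial in DENTALS | RETROFLEXES | VELARS:
        return not high_front      # the other three series exclude it
    return True                    # other initials are unconstrained here
```

For example, respects_distribution("j", "ia") and respects_distribution("z", "i") both hold, while the non-occurring combinations ("g", "ia") and ("j", "a") are rejected.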
The alveolo-palatals arose historically from a merger of the dentals and velars before high front vowels and glides. Previously, some instances of the modern alveolo-palatals were instead velars, and others were dentals; distinguishing these two sources is known as the distinction between "sharp" and "round" sounds (尖团音). The change took place in the last two or three centuries at different times in different areas. This explains why some European transcriptions of Chinese names contain velar or dental consonants such as k, ts, or s where an alveolo-palatal might be expected in modern Chinese. Examples are Peking for Beijing, Chungking for Chongqing, Fukien for Fujian, Tientsin for Tianjin, and Sinkiang for Xinjiang. The complementary distribution with the retroflex series arose when syllables that had a retroflex consonant followed by a medial glide lost the medial glide.

Zero onset

A full syllable such as ai, in which the vowel is not preceded by any of the standard initial consonants or glides, is said to have a null initial or zero onset. This may be realized as a consonant sound: [ʔ] and [ɣ] are possibilities, as are [ŋ] and [ɦ] in some non-standard varieties. San Duanmu has suggested that such an onset be regarded as a special phoneme, or as an instance of the phoneme /ŋ/, although it can also be treated as no phoneme. By contrast, in the case of the particle a, which is a weak onset-less syllable, linking occurs with the previous syllable.
When a stressed vowel-initial Chinese syllable follows a consonant-final syllable, the consonant does not directly link with the vowel. Instead, the zero onset seems to intervene between them, in any of the realizations described above. However, in connected speech none of these output forms is natural. When the words are spoken together, the most natural pronunciation has no nasal closure and no version of the zero onset; the vowel is nasalized instead.

Glides

The glides [j], [ɥ], and [w] sound respectively like the y in English yes, the u in French huit, and the w in English we. The glides are commonly analyzed not as independent phonemes, but as consonantal allophones of the high vowels /i/, /y/, and /u/. This is possible because there is no ambiguity in interpreting a sequence like yao/-iao as /iau/, and potentially problematic sequences do not occur.
The glides may occur in initial position in a syllable. This occurs with [ɥ] in the syllables written yu, yuan, yue, and yun in pinyin; with [j] in other syllables written with initial y in pinyin; and with [w] in syllables written with initial w in pinyin. When a glide is followed by the vowel of which that glide is considered an allophone, the glide may be regarded as epenthetic, and not as a separate realization of the phoneme. Hence the syllable yi, pronounced [ji], may be analyzed as consisting of the single phoneme /i/, and similarly yin may be analyzed as /in/, yu as /y/, and wu as /u/. It is also possible to hear both pronunciations from the same speaker, even in the same conversation. For example, one may hear the number "one" as either [ji] or [i].
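The pinyin spelling of initial glides described above can be sketched as a short function. This is an illustration only: the function name initial_glide is hypothetical, and the input is assumed to be a toneless pinyin syllable.

```python
def initial_glide(syllable):
    """Return the initial glide implied by a toneless pinyin syllable, or None.

    Syllables beginning with "yu" (yu, yue, yuan, yun) have the labio-palatal
    glide; other y- syllables have the palatal glide; w- syllables have the
    labio-velar glide.
    """
    s = syllable.lower()
    if s.startswith("yu"):
        return "ɥ"
    if s.startswith("y"):
        return "j"
    if s.startswith("w"):
        return "w"
    return None  # no glide initial, e.g. a zero-onset syllable such as "ai"
```

Checking the "yu" prefix first matters, because those syllables would otherwise fall into the general y- case.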
The glides can also occur in medial position, that is, after the initial consonant but before the main vowel. Here they are represented in pinyin as vowels: for example, the i in bie represents [j], and the u in huo represents [w]. There are some restrictions on the possible consonant-glide combinations: [w] does not occur after labials; [j] does not occur after retroflexes and velars; and [ɥ] occurs medially only in lüe and nüe and after alveolo-palatals. A consonant-glide combination at the start of a syllable is articulated as a single sound: the glide is not in fact pronounced after the consonant, but is realized as palatalization, labialization, or both, of the consonant.
The glides [j] and [w] are also found as the final element in some syllables. These are commonly analyzed as diphthongs rather than vowel-glide sequences. For example, the final glide in the syllable ai is treated as the second element of a diphthong.

Syllabic consonants

The syllables written in pinyin as zi, ci, si, zhi, chi, shi, ri may be described as a sibilant consonant followed by a syllabic consonant:
  • [z̩], a laminal denti-alveolar voiced continuant, in zi, ci, si;
  • [ʐ̩], an apical retroflex voiced continuant, in zhi, chi, shi, ri.
Alternatively, the nucleus may be described not as a syllabic consonant, but as a vowel:
  • [ɨ], similar to Russian ы and the vowel in American "roses", in zi, ci, si, zhi, chi, shi, ri.
Phonologically, these syllables may be analyzed as having their own vowel phoneme, /ɨ/. However, it is possible to merge this with the phoneme /i/, since the two are in complementary distribution, provided that the alveolo-palatal series is either left un-merged, or is merged with the velars rather than the retroflex or alveolar series.
Another approach is to regard the syllables assigned above to /ɨ/ as having an empty nuclear slot, i.e. as not containing a vowel phoneme at all. This is more consistent with the syllabic consonant description of these syllables, and is consistent with the view that phonological representations are minimal. When this is the case, sometimes the phoneme is described as shifting from voiceless to voiced.
Syllabic consonants may also arise as a result of weak syllable reduction; see below. Syllabic nasal consonants are also heard in certain interjections; pronunciations of such words include [m̩], [n̩], [ŋ̩], [hm̩], [hŋ̩].

Vowels

Standard Chinese can be analyzed as having between two and six vowel phonemes. /i, u, y/ are high vowels, /ə/ is mid, and /a/ is low.
The precise realization of each vowel depends on its phonetic environment. In particular, the mid vowel /ə/ has two broad allophones, [e] and [o]. These sounds can be treated as a single underlying phoneme because they are in complementary distribution. The mid vowel phoneme may also be treated as an under-specified vowel, attracting features either from the adjacent sounds or from default rules resulting in [ə].
Transcriptions of the vowels' allophones differ somewhat between sources. More details about the individual vowel allophones are given in the following table.
Phoneme | Allophone | Description | Example | Pinyin | Wade–Giles | Gwoyeu Romatzyh
Depends on analysis | [i] | Like English ee as in bee | 比/bǐ | i | i | i
Depends on analysis | [u] | Like English oo as in boo | 不/bù | u | u | u
Depends on analysis | [ʊ] | Like English oo in took | 空/kōng | o | u | o
Depends on analysis | [y] | Like French u or German ü | 女/nǚ | ü, u | ü | iu
/ə/ | [e] | Somewhat like English ey as in prey | 别/bié | e, ê | e, eh | e
/ə/ | [o] | Somewhat like southern British English awe or Scottish English oh | 火/huǒ | o | o | o
/ə/ | [ɤ] | Pronounced as a sequence. | 和/hé | e | ê, o | e
/ə/ | [ə] | Schwa, like English a as in about. | 很/hěn | e | ê, u | e
/a/ | [a] | Like English a as in palm | 巴/bā | a | a | a
/a/ | [ɛ] | Like English e as in then | 边/biān | a | e, a | a

Zhuyin represents vowels differently from normal romanisation schemes, and as such is not displayed in the above table.
The vowel nuclei may be preceded by a glide, and may be followed by a coda. The various combinations of glide, vowel, and coda have different surface manifestations, as shown in the tables below. Any of the three positions may be empty, i.e. occupied by a null meta-phoneme.
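The three-slot structure described above (glide, vowel nucleus, coda, each of which may be empty) can be sketched as a small data type. This is an illustrative sketch, not a standard representation: the class name Final, its methods, and the use of "∅" for an empty slot are assumptions made for the example.

```python
from typing import NamedTuple, Optional

class Final(NamedTuple):
    """A syllable final with three optional slots; None marks an empty slot."""
    glide: Optional[str] = None
    nucleus: Optional[str] = None
    coda: Optional[str] = None

    def slots(self):
        """Return the three slots, showing empty positions as '∅'."""
        return tuple("∅" if slot is None else slot for slot in self)

    def phonemic(self):
        """Concatenate the filled slots into a phonemic string."""
        return "/" + "".join(slot for slot in self if slot is not None) + "/"

# Pinyin -ian, as in 边 biān: glide j, low vowel a, nasal coda n.
ian = Final(glide="j", nucleus="a", coda="n")
# Pinyin a, as in 巴 bā: only the nucleus is filled.
a = Final(nucleus="a")
```

Here ian.phonemic() gives "/jan/" and a.slots() gives ("∅", "a", "∅"); the surface vowel quality of each combination is a separate matter of allophony and is not modeled.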