Austroasiatic languages


The Austroasiatic languages are a large language family spoken throughout Mainland Southeast Asia, South Asia and East Asia. These languages are natively spoken by the majority of the population in Vietnam and Cambodia, and by minority populations scattered throughout parts of Thailand, Laos, India, Myanmar, Malaysia, Bangladesh, Nepal, and southern China. Approximately 117 million people speak an Austroasiatic language, of whom more than two-thirds are Vietnamese speakers. Of the Austroasiatic languages, only Vietnamese, Khmer, and Mon have lengthy, established presences in the historical record. Only two are presently considered to be the national languages of sovereign states: Vietnamese in Vietnam, and Khmer in Cambodia. The Mon language is a recognized indigenous language in Myanmar and Thailand, while the Wa language is a "recognized national language" in the de facto autonomous Wa State within Myanmar. Santali is one of the 22 scheduled languages of India. The remainder of the family's languages are spoken by minority groups and have no official status.
Ethnologue identifies 168 Austroasiatic languages. These form thirteen established families that have traditionally been grouped into two divisions, Mon–Khmer and Munda. However, one recent classification posits three groups, while another has abandoned Mon–Khmer as a taxon altogether, making it synonymous with the larger family.
Scholars generally place the homeland of the ancestral language in southern China or the Mekong River valley. Sidwell proposes that the locus of Proto-Austroasiatic was in the Red River Delta area around 4,000–4,500 years before present. Genetic and linguistic research in 2015 about ancient people in East Asia suggests an origin and homeland of Austroasiatic in today's southern China or even further north.

Etymology

The name Austroasiatic was coined by Wilhelm Schmidt based on auster, the Latin word for "South", and "Asia". Despite the literal meaning of its name, only three Austroasiatic branches are actually spoken in South Asia: Khasic, Munda, and Nicobarese.

Typology

Regarding word structure, Austroasiatic languages are well known for having an iambic "sesquisyllabic" pattern, with basic nouns and verbs consisting of an initial, unstressed, reduced minor syllable followed by a stressed, full syllable. This reduction of presyllables has led to a variety of phonological shapes of the same original Proto-Austroasiatic prefixes, such as the causative prefix, ranging from CVC syllables to consonant clusters to single consonants among the modern languages. As for word formation, most Austroasiatic languages have a variety of derivational prefixes, and many have infixes, but suffixes are almost entirely absent outside Munda, apart from a few specialized exceptions in other Austroasiatic branches.
The Austroasiatic languages are further characterized as having unusually large vowel inventories and employing some sort of pitch register contrast, either between modal voice and breathy voice or between modal voice and creaky voice. Languages in the Pearic branch and some in the Vietic branch can have a three- or even four-way voicing contrast.
However, some Austroasiatic languages have lost the register contrast by evolving more diphthongs or, in a few cases such as Vietnamese, through tonogenesis. Vietnamese has been so heavily influenced by Chinese that its original Austroasiatic phonological quality is obscured and now resembles that of South Chinese languages, whereas Khmer, which had more influence from Sanskrit, has retained a more typically Austroasiatic structure.

Proto-language

Much work has been done on the reconstruction of Proto-Mon–Khmer in Harry L. Shorto's Mon–Khmer Comparative Dictionary. Little work has been done on the Munda languages, which are poorly documented. With the demotion of Mon–Khmer from a primary branch, Proto-Mon–Khmer becomes synonymous with Proto-Austroasiatic. Paul Sidwell has reconstructed the consonant inventory of Proto-Mon–Khmer. His reconstruction is identical to earlier ones except for one consonant, which is better preserved in the Katuic languages, in which Sidwell has specialized.

Internal classification

Linguists traditionally recognize two primary divisions of Austroasiatic: the Mon–Khmer languages of Southeast Asia, Northeast India, and the Nicobar Islands, and the Munda languages of East and Central India and parts of Bangladesh and Nepal. However, no evidence for this classification has ever been published.
Each of the thirteen established families is accepted as a valid clade. By contrast, the relationships between these families within Austroasiatic are debated. In addition to the traditional classification, two recent proposals are given, neither of which accepts traditional "Mon–Khmer" as a valid unit. However, little of the data used for the competing classifications has ever been published, so they cannot be evaluated by peer review.
In addition, there are suggestions that additional branches of Austroasiatic might be preserved in substrata of Acehnese in Sumatra, the Chamic languages of Vietnam, and the Land Dayak languages of Borneo.

Diffloth (1974)

Diffloth's widely cited original classification, now abandoned by Diffloth himself, is used in Encyclopædia Britannica and, except for the breakup of Southern Mon–Khmer, in Ethnologue.
  • Austro‑Asiatic
  • * Munda
  • ** North Munda
  • *** Korku
  • *** Kherwarian
  • ** South Munda
  • *** Kharia–Juang
  • *** Koraput Munda
  • * Mon–Khmer
  • ** Eastern Mon–Khmer
  • *** Khmer
  • *** Pearic
  • *** Bahnaric
  • *** Katuic
  • *** Vietic
  • ** Northern Mon–Khmer
  • *** Khasi
  • *** Palaungic
  • *** Khmuic
  • ** Southern Mon–Khmer
  • *** Mon
  • *** Aslian
  • *** Nicobarese

Peiros (2004)

Peiros's classification is lexicostatistic, that is, based on percentages of shared vocabulary. This means that languages can appear to be more distantly related than they actually are due to language contact. Indeed, when Sidwell replicated Peiros's study with languages known well enough to account for loans, he did not find the internal structure below.
  • Austro‑Asiatic
  • * Nicobarese
  • * Munda–Khmer
  • ** Munda
  • ** Mon–Khmer
  • *** Khasi
  • *** Nuclear Mon–Khmer
  • **** Mangic
  • **** Vietic
  • **** Northern Mon–Khmer
  • ***** Palaungic
  • ***** Khmuic
  • **** Central Mon–Khmer
  • ***** Khmer dialects
  • ***** Pearic
  • ***** Asli-Bahnaric
  • ****** Aslian
  • ****** Mon–Bahnaric
  • ******* Monic
  • ******* Katu–Bahnaric
  • ******** Katuic
  • ******** Bahnaric
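
The lexicostatistic method behind Peiros's classification can be illustrated with a minimal Python sketch. It is not Peiros's or Sidwell's actual procedure: the language names, meanings, and cognate-set labels below are invented placeholders, and the helper names `shared_percentage` and `pairwise_table` are hypothetical. The sketch only shows how a percentage of shared vocabulary is computed for each pair of languages once every word has been assigned to a cognate set.

```python
from itertools import combinations

# Hypothetical toy data (not real Austroasiatic data): for each language,
# the cognate-set label assigned to its word for each meaning.
# None marks a meaning with no recorded word.
COGNATE_SETS = {
    "LangA": {"hand": 1, "water": 1, "fish": 1, "eye": 1},
    "LangB": {"hand": 1, "water": 1, "fish": 2, "eye": 1},
    "LangC": {"hand": 2, "water": 1, "fish": 3, "eye": None},
}


def shared_percentage(a, b):
    """Percentage of meanings attested in both languages whose words
    belong to the same cognate set."""
    meanings = [m for m in a if a[m] is not None and b.get(m) is not None]
    if not meanings:
        return 0.0
    shared = sum(1 for m in meanings if a[m] == b[m])
    return 100.0 * shared / len(meanings)


def pairwise_table(data):
    """Shared-vocabulary percentage for every unordered pair of languages."""
    return {
        (x, y): shared_percentage(data[x], data[y])
        for x, y in combinations(sorted(data), 2)
    }


if __name__ == "__main__":
    for pair, pct in pairwise_table(COGNATE_SETS).items():
        print(pair, round(pct, 1))
```

The sketch counts any word placed in a shared cognate set, so an undetected loanword inflates the score of the languages in contact. This is why the comparison is only reliable for languages known well enough for loans to be identified and excluded beforehand.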

Diffloth (2005)

Diffloth compares reconstructions of various clades and attempts to classify them based on shared innovations, though, as with other classifications, the evidence has not been published. In detail, the classification is as follows:
  • Austro‑Asiatic
  • * Munda languages
  • ** Koraput: 7 languages
  • ** Core Munda languages
  • *** Kharian–Juang: 2 languages
  • *** North Munda languages
  • **** Korku
  • **** Kherwarian: 12 languages
  • * Khasi–Khmuic languages
  • ** Khasian: 3 languages of north eastern India and adjacent region of Bangladesh
  • ** Palaungo-Khmuic languages
  • *** Khmuic: 13 languages of Laos and Thailand
  • *** Palaungo-Pakanic languages
  • **** Pakanic or Palyu: 4 or 5 languages of southern China and Vietnam
  • **** Palaungic: 21 languages of Burma, southern China, and Thailand
  • * Nuclear Mon–Khmer languages
  • ** Khmero-Vietic languages
  • *** Vieto-Katuic languages ?
  • **** Vietic: 10 languages of Vietnam and Laos, including Muong and Vietnamese, which has the most speakers of any Austroasiatic language.
  • **** Katuic: 19 languages of Laos, Vietnam, and Thailand.
  • *** Khmero-Bahnaric languages
  • **** Bahnaric: 40 languages of Vietnam, Laos, and Cambodia.
  • **** Khmeric languages
  • ***** The Khmer dialects of Cambodia, Thailand, and Vietnam.
  • ***** Pearic: 6 languages of Cambodia.
  • ** Nico-Monic languages
  • *** Nicobarese: 6 languages of the Nicobar Islands, a territory of India.
  • *** Asli-Monic languages
  • **** Aslian: 19 languages of peninsular Malaysia and Thailand.
  • **** Monic: 2 languages, the Mon language of Burma and the Nyahkur language of Thailand.

Sidwell (2009–2015)

Sidwell, in a lexicostatistical comparison of 36 languages that are well known enough to exclude loanwords, finds little evidence for internal branching. He did, however, find an area of increased contact between the Bahnaric and Katuic languages: languages of all branches apart from the geographically distant Munda and Nicobarese show greater similarity to Bahnaric and Katuic the closer they are to those branches, without any noticeable innovations common to Bahnaric and Katuic.
He therefore takes the conservative view that the thirteen branches of Austroasiatic should be treated as equidistant on current evidence. Sidwell & Blench discuss this proposal in more detail, and note that there is good evidence for a Khasi–Palaungic node, which could also possibly be closely related to Khmuic.
If this were the case, Sidwell & Blench suggest that Khasic may have been an early offshoot of Palaungic that had spread westward. Sidwell & Blench suggest Shompen as an additional branch, and believe that a Vieto-Katuic connection is worth investigating. In general, however, the family is thought to have diversified too quickly for a deeply nested structure to have developed, since Proto-Austroasiatic speakers are believed by Sidwell to have radiated out from the central Mekong river valley relatively quickly.
Subsequently, Sidwell proposed that Nicobarese subgroups with Aslian, just as Khasian and Palaungic subgroup with each other.
A subsequent computational phylogenetic analysis suggests that Austroasiatic branches may have a loosely nested structure rather than a completely rake-like structure, with an east–west division occurring possibly as early as 7,000 years before present. However, Sidwell still considers the subbranching dubious.
Integrating computational phylogenetic linguistics with recent archaeological findings, Paul Sidwell further expanded his Mekong riverine hypothesis by proposing that Austroasiatic had ultimately expanded into Indochina from the Lingnan area of southern China, with the subsequent Mekong riverine dispersal taking place after the initial arrival of Neolithic farmers from southern China.
Sidwell tentatively suggests that Austroasiatic may have begun to split up 5,000 years B.P. during the Neolithic transition era of mainland Southeast Asia, with all the major branches of Austroasiatic formed by 4,000 B.P. Austroasiatic would have had two possible dispersal routes from the western periphery of the Pearl River watershed of Lingnan, which would have been either a coastal route down the coast of Vietnam, or downstream through the Mekong River via Yunnan. Both the reconstructed lexicon of Proto-Austroasiatic and the archaeological record clearly show that early Austroasiatic speakers around 4,000 B.P. cultivated rice and millet, kept livestock such as dogs, pigs, and chickens, and thrived mostly in estuarine rather than coastal environments.
At 4,500 B.P., this "Neolithic package" suddenly arrived in Indochina from the Lingnan area without cereal grains and displaced the earlier pre-Neolithic hunter-gatherer cultures, with grain husks found in northern Indochina by 4,100 B.P. and in southern Indochina by 3,800 B.P. However, Sidwell found that iron is not reconstructable in Proto-Austroasiatic, since each Austroasiatic branch has different terms for iron that had been borrowed relatively lately from Tai, Chinese, Tibetan, Malay, and other languages.
During the Iron Age about 2,500 B.P., relatively young Austroasiatic branches in Indochina such as Vietic, Katuic, Pearic, and Khmer were formed, while the more internally diverse Bahnaric branch underwent more extensive internal diversification. By the Iron Age, all of the Austroasiatic branches were more or less in their present-day locations, with most of the diversification within Austroasiatic taking place during the Iron Age.
Paul Sidwell considers the Austroasiatic language family to have rapidly diversified around 4,000 years B.P. during the arrival of rice agriculture in Indochina, but notes that the origin of Proto-Austroasiatic itself is older than that date. The lexicon of Proto-Austroasiatic can be divided into an early and late stratum. The early stratum consists of basic lexicon including body parts, animal names, natural features, and pronouns, while the names of cultural items form part of the later stratum.
Roger Blench suggests that vocabulary related to aquatic subsistence strategies can be reconstructed for Proto-Austroasiatic. Blench finds widespread Austroasiatic roots for 'river, valley', 'boat', 'fish', 'catfish sp.', 'eel', 'prawn', 'shrimp', 'crab', 'tortoise', 'turtle', 'otter', 'crocodile', 'heron, fishing bird', and 'fish trap'. Archaeological evidence for the presence of agriculture in northern Indochina dates back to only about 4,000 years ago, with agriculture ultimately being introduced from further north in the Yangtze valley, where it has been dated to 6,000 B.P.
Sidwell proposes that the locus of Proto-Austroasiatic was in the Red River Delta area about 4,000–4,500 years before present, instead of the Middle Mekong as he had previously proposed. Austroasiatic dispersed along coastal maritime routes and also upstream through river valleys. Khmuic, Palaungic, and Khasic resulted from a westward dispersal that ultimately came from the Red River valley. Based on their current distributions, about half of all Austroasiatic branches can be traced to coastal maritime dispersals.
Hence, this points to a relatively late riverine dispersal of Austroasiatic as compared to Sino-Tibetan, whose speakers had a distinct non-riverine culture. In addition to living an aquatic-based lifestyle, early Austroasiatic speakers would have also had access to livestock, crops, and newer types of watercraft. As early Austroasiatic speakers dispersed rapidly via waterways, they would have encountered speakers of older language families who were already settled in the area, such as Sino-Tibetan.