Centum and satem languages

Languages of the Indo-European family are classified as either centum languages or satem languages according to how the dorsal consonants of the reconstructed Proto-Indo-European language (PIE) developed. An example of the different developments is provided by the words for "hundred" found in the early attested Indo-European languages. In centum languages, they typically began with a /k/ sound, but in satem languages, they often began with /s/.
The table below shows the traditional reconstruction of the PIE dorsal consonants, with three series, but according to some more recent theories there may actually have been only two series, or three series with different pronunciations from those traditionally ascribed. In centum languages, the palatovelars, which included the initial consonant of the "hundred" root, merged with the plain velars. In satem languages, they remained distinct, and the labiovelars merged with the plain velars.

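Centum treatment	Voiceless	Voiced	Voiced aspirated	Series	Satem treatment
Kept distinct in centum languages	*kʷ	*gʷ	*gʷʰ	labiovelars	Merged in satem languages
Merged in centum languages	*k	*g	*gʰ	plain velars	Merged in satem languages
Merged in centum languages	*ḱ	*ǵ	*ǵʰ	palatovelars	Assibilated in satem languages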
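
The mergers in the table above can be sketched as a small lookup. This is a toy model for illustration only: the function name `merge_dorsal`, the series labels, and the generic label "assibilated" for the satem outcome of the palatovelars are assumptions made for this example, not standard notation.

```python
# Toy model of the traditional three-series reconstruction (illustrative only).
SERIES = {
    "labiovelar": ["*kʷ", "*gʷ", "*gʷʰ"],
    "plain velar": ["*k", "*g", "*gʰ"],
    "palatovelar": ["*ḱ", "*ǵ", "*ǵʰ"],
}


def merge_dorsal(series: str, group: str) -> str:
    """Return the category a PIE dorsal series ends up in for a language group.

    `group` is "centum" or "satem"; any other value raises ValueError.
    """
    if series not in SERIES:
        raise ValueError(f"unknown series: {series!r}")
    if group == "centum":
        # Palatovelars merge with plain velars; labiovelars stay distinct.
        return "plain velar" if series == "palatovelar" else series
    if group == "satem":
        # Labiovelars merge with plain velars; palatovelars are assibilated.
        if series == "labiovelar":
            return "plain velar"
        if series == "palatovelar":
            return "assibilated"
        return series
    raise ValueError(f"unknown group: {group!r}")


if __name__ == "__main__":
    for name in SERIES:
        print(name, "->", merge_dorsal(name, "centum"), "/", merge_dorsal(name, "satem"))
```

The model deliberately ignores the later, branch-internal changes discussed further on, which do not affect the classification.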

The centum–satem division forms an isogloss in synchronic descriptions of Indo-European languages. It is no longer thought that the PIE language split first into centum and satem branches from which all the centum and all the satem languages, respectively, would have derived. Such a division is made particularly unlikely by the discovery that while the satem group lies generally to the east and the centum group to the west, the most eastward of the known IE language branches, Tocharian, is centum.

Centum languages

The centum languages of the Indo-European family are the "western" branches: Hellenic, Celtic, Italic and Germanic. They merged PIE palatovelars and plain velars, yielding plain velars only, but retained the labiovelars as a distinct set.
The Anatolian branch probably falls outside the centum–satem division; for instance, the Luwian language indicates that all three dorsal consonant rows survived separately in Proto-Anatolian.
The centumisation observed in Hittite is therefore assumed to have occurred only after the breakup of Proto-Anatolian into separate languages. However, Craig Melchert proposes that Proto-Anatolian is indeed a centum language.
While Tocharian is generally regarded as a centum language, it is a special case, as it has merged all three of the PIE dorsal series into a single phoneme, *k. According to some scholars, that complicates the classification of Tocharian within the centum–satem model. However, as Tocharian has replaced some PIE labiovelars with the labiovelar-like, non-original sequence *ku, it has been proposed that labiovelars remained distinct in Proto-Tocharian, which would place Tocharian in the centum group.
In the centum languages, PIE roots reconstructed with palatovelars developed into forms with plain velars. For example, in the PIE numeral *ḱm̥tóm 'hundred', the initial palatovelar *ḱ became a plain velar /k/, as in Latin centum, Greek (he)katon, Welsh cant, Tocharian B kante. In the Germanic languages, the /k/ developed regularly by Grimm's law to become /h/, as in Old English hund.
Centum languages also retained the distinction between the PIE labiovelar row and the plain velars. Historically, it was unclear whether the labiovelar row represented an innovation by a process of labialisation, or whether it was inherited from the parent language; current mainstream opinion favours the latter possibility. Labiovelars as single phonemes, as opposed to biphonemes, are attested in Greek, Italic, Germanic and Celtic. The boukólos rule, however, states that a labiovelar reduces to a plain velar when it occurs next to *u or *w.
The centum–satem division refers to the development of the dorsal series of sounds only at the time of the earliest separation of PIE into the proto-languages of its individual daughter branches; it does not apply to any later analogous developments within any branch. For example, the palatalization of Latin /k/ to /tʃ/ or /ts/ in some Romance languages is satem-like, as is the merger of *kʷ with *k in the Gaelic languages; such later changes do not affect the classification of the languages as centum.
Linguist Wolfgang P. Schmid argued that some proto-languages, such as Proto-Baltic, were initially centum but gradually became satem through exposure to satem languages.
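
The boukólos rule as stated above can be sketched as a rewrite over a list of segments. This is a minimal illustration under stated assumptions: the function name `apply_boukolos` is hypothetical, words are assumed to be pre-segmented into a list of strings, and "next to" is read as "immediately before or after".

```python
# Map each labiovelar to the plain velar it reduces to (illustrative sketch).
DELABIALIZED = {"kʷ": "k", "gʷ": "g", "gʷʰ": "gʰ"}
TRIGGERS = {"u", "w"}


def apply_boukolos(segments):
    """Delabialize any labiovelar that stands directly next to u or w."""
    result = []
    for i, seg in enumerate(segments):
        prev_seg = segments[i - 1] if i > 0 else None
        next_seg = segments[i + 1] if i + 1 < len(segments) else None
        if seg in DELABIALIZED and (prev_seg in TRIGGERS or next_seg in TRIGGERS):
            result.append(DELABIALIZED[seg])
        else:
            result.append(seg)
    return result


if __name__ == "__main__":
    # Simplified segmentation of the form behind Greek boukólos (assumed here).
    print(apply_boukolos(["gʷ", "o", "u", "kʷ", "o", "l", "o", "s"]))
```

In the sample word, only the second labiovelar is adjacent to u, so only it is reduced; the initial one is left unchanged.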

Satem languages

The satem languages belong to the Eastern sub-families, especially Indo-Iranian and Balto-Slavic, with Indo-Iranian being the major Asian branch and Balto-Slavic the major Eurasian branch of the satem group. They lost the labial element of PIE labiovelars and merged them with plain velars, but the palatovelars remained distinct and typically came to be realised as sibilants. That set of developments, particularly the assibilation of palatovelars, is referred to as satemisation.
In the satem languages, the reflexes of the presumed PIE palatovelars are typically fricative or affricate consonants, articulated further forward in the mouth. For example, in the PIE root *ḱm̥tóm, "hundred", the initial palatovelar normally became a sibilant [s] or [ʃ], as in Avestan satəm, Persian sad, Sanskrit śatam, sto in all modern Slavic languages, Old Church Slavonic sъto, Latvian simts, Lithuanian šimtas. Another example is the Slavic prefix sъ-, which appears in Latin, a centum language, as co-; conjoin is cognate with Russian soyuz. An [s] is found for PIE *ḱ in such languages as Latvian, Avestan, Russian and Armenian, but Lithuanian and Sanskrit have [ʃ] (written š in Lithuanian and ś in Sanskrit transcription). For more reflexes, see the phonetic correspondences section below; note also the effect of the ruki sound law.
"Incomplete satemisation" may also be evidenced by remnants of labial elements from labiovelars in Balto-Slavic, including Lithuanian ungurys "eel" < *angʷi- and dygus "pointy" < *dʰeigʷ-. A few examples are also claimed in Indo-Iranian, such as Sanskrit guru "heavy" < *gʷer- and kulam "herd" < *kʷel-, but they may instead be secondary developments, as in the case of kuru "make" < *kʷer-, in which it is clear that the ku- group arose in post-Rigvedic language. It is also asserted that in Sanskrit and Balto-Slavic, in some environments, resonant consonants (denoted /R/) become /iR/ after plain velars but /uR/ after labiovelars.
Some linguists argue that the Albanian and Armenian branches are also to be classified as satem, whereas other linguists argue that they show evidence of separate treatment of all three dorsal consonant rows and so may not have merged the labiovelars with the plain velars, unlike the canonical satem branches.
Assibilation of velars in certain phonetic environments is a common phenomenon in language development. Consequently, it is sometimes hard to establish firmly which languages were part of the original satem diffusion and which were affected by secondary assibilation later. While extensive documentation of Latin and Old Swedish, for example, shows that the assibilation found in French and Swedish was a later development, there are not enough records of the extinct Dacian and Thracian languages to settle conclusively when their satem-like features originated.
In Armenian, some assert that /kʷ/ is distinguishable from /k/ before front vowels. Martin Macak asserts that the merger of *kʷ and *k occurred "within the history of Proto-Armenian itself".
In Albanian, the three original dorsal rows have remained distinguishable before historic front vowels. Labiovelars are for the most part differentiated from all other Indo-European velar series before front vowels, but they merge with the "pure" velars elsewhere. The palatal velar series, consisting of PIE *ḱ and the merged *ǵ and *ǵʰ, usually developed into th and dh, but was depalatalized to merge with the back velars when in contact with sonorants. Because the original PIE tripartite distinction between dorsals is preserved in such reflexes, Demiraj argues that Albanian is to be considered, like Luwian, neither centum nor satem, although it has a "satem-like" realization of the palatal dorsals in most cases. Thus PIE *ḱ, *kʷ and *k become th, s, and q, respectively.
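
The classification logic running through the two sections above (which series merged with the plain velars) can be sketched as a simple comparison of reflexes. This is an illustrative sketch only: the function name `classify_reflexes` and the output labels are assumptions for this example, and real classification also depends on when a merger occurred, which the sketch does not model.

```python
def classify_reflexes(palatovelar: str, plain: str, labiovelar: str) -> str:
    """Label a language by which PIE dorsal series share a reflex.

    Each argument is the reflex of one voiceless stop (*ḱ, *k, *kʷ).
    """
    distinct = len({palatovelar, plain, labiovelar})
    if distinct == 1:
        return "all three merged"
    if distinct == 3:
        return "three-way distinction kept"
    if palatovelar == plain:
        return "centum-like"
    if labiovelar == plain:
        return "satem-like"
    return "other"


if __name__ == "__main__":
    # Albanian reflexes as given in the text: *ḱ > th, *k > q, *kʷ > s.
    print(classify_reflexes("th", "q", "s"))
    # Tocharian as described in the text: all three series merged into k.
    print(classify_reflexes("k", "k", "k"))
```

With the Albanian reflexes cited above the sketch reports a preserved three-way distinction, and with a single merged reflex, as described for Tocharian, it reports a total merger.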

History of the concept

Schleicher's single guttural series

August Schleicher, an early Indo-Europeanist, in Part I, "Phonology", of his major work, the 1871 Compendium of Comparative Grammar of the Indogermanic Language, published a table of original momentane Laute, or "stops", which has only a single velar series, *k, *g, *gʰ, under the name of Gutturalen. He identifies four palatals but hypothesises that they came from the gutturals along with the nasal *ń and the spirant *ç.

Brugmann's labialized and unlabialized language groups

Karl Brugmann, in his 1886 work Grundriß der vergleichenden Grammatik der indogermanischen Sprachen, promotes the palatals to the original language, recognising two rows of Explosivae, or "stops", the palatal and the velar, each of which was simplified to three articulations even in the same work. Brugmann also notices among die velaren Verschlusslaute, "the velar stops", a major contrast between reflexes of the same words in different daughter languages. In some, the velar is marked with a "u-articulation", which he terms a "labialization", in accordance with the prevailing theory that the labiovelars were velars labialised by combination with a u at some later time and were not among the original consonants. He thus divides languages into "the language group with labialization" and "the language group without labialization", which basically correspond to what would later be termed the centum and satem groups:
The doubt introduced in that passage suggests he already suspected the "afterclap" u was not that but was part of an original sound.