Gemination


In phonetics and phonology, gemination, or consonant lengthening, is an articulation of a consonant for a longer period of time than that of a singleton consonant. It is distinct from stress. Gemination is represented in many writing systems by a doubled letter and is often perceived as a doubling of the consonant. Some phonological theories use 'doubling' as a synonym for gemination, while others describe two distinct phenomena.
Gemination can happen for various phonological or morphological reasons. The most common reason across languages is consonant assimilation: when two different consonants sit next to each other, the first one can assimilate into the next one, effectively doubling the second consonant. For instance, when the negation prefix in- is added to a word beginning with a consonant, its n can assimilate into it, thus doubling it: legal → illegal instead of inlegal; the l consonant at the beginning of 'legal' is geminated by assimilation of the in- prefix's n.
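The assimilation of the prefix in- can be sketched in a few lines of Python. This is only an illustration at the level of spelling: the function name add_in_prefix is hypothetical, and the set of assimilating consonants (l, r, m) is an assumption that extends the legal → illegal example with the parallel cases irregular and immortal.

```python
# Hypothetical helper, not a full model of Latin or English morphology.
# Assumption: the n of in- assimilates completely before l, r and m.
ASSIMILATING = {"l", "r", "m"}

def add_in_prefix(stem: str) -> str:
    """Attach the negation prefix in-, assimilating its n where assumed."""
    if not stem:
        return "in"
    first = stem[0].lower()
    if first in ASSIMILATING:
        # The n takes on the identity of the next consonant, doubling it.
        return "i" + first + stem
    return "in" + stem

print(add_in_prefix("legal"))    # illegal
print(add_in_prefix("regular"))  # irregular
print(add_in_prefix("active"))   # inactive
```

The sketch works on letters only; whether the doubled letter is actually pronounced long depends on the language.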
Consonant length is a distinctive feature in certain languages, such as Japanese. Other languages, such as Modern Greek, do not have word-internal phonemic consonant geminates.
Consonant gemination and vowel length are independent in languages like Arabic, Japanese, Hungarian, Malayalam, and Finnish; however, in languages like Italian, Norwegian, and Swedish, vowel length and consonant length are interdependent. For example, in Norwegian and Swedish, a geminated consonant is always preceded by a short vowel, while an ungeminated consonant is preceded by a long vowel. In Italian, a geminate is always preceded by a short vowel, but a long vowel precedes a short consonant only if the vowel is stressed.

Phonetics

Lengthened fricatives, nasals, laterals, approximants and trills are simply prolonged. In lengthened stops, the obstruction of the airway is prolonged, which delays release, and the closure is lengthened. That is, is pronounced, not *. In affricates, it is also the closure that is lengthened, not the fricative release. That is, is pronounced, not *.
In terms of consonant duration, Berber and Finnish are reported to have a 3-to-1 ratio, compared with around 2-to-1 in Japanese, Italian, and Turkish.

Phonology

Gemination of consonants is distinctive in some languages and then is subject to various phonological constraints that depend on the language.
In some languages, like Italian, Swedish, Faroese, Icelandic, and Luganda, consonant length and vowel length depend on each other. A short vowel within a stressed syllable almost always precedes a long consonant or a consonant cluster, and a long vowel must be followed by a short consonant. In Classical Arabic, a long vowel was lengthened even more before permanently-geminate consonants.
In other languages, such as Finnish, consonant length and vowel length are independent of each other. In Finnish, both are phonemic; taka 'back', takka 'fireplace' and taakka 'burden' are different, unrelated words. Finnish consonant length is also affected by consonant gradation. Another important phenomenon is sandhi, which produces long consonants at word boundaries when there is an archiphonemic glottal stop, as in ota se, pronounced otas se 'take it!'.
In addition, in some Finnish compound words, if the initial word ends in an e, the initial consonant of the following word is geminated: jätesäkki 'trash bag', tervetuloa 'welcome'. In certain cases, a v after a u is geminated by most people: ruuvi 'screw', vauva 'baby'. In the Tampere dialect, if a word receives gemination of v after u, the u is often deleted, and lauantai 'Saturday', for example, receives a medial v, which can in turn lead to deletion of u.
Distinctive consonant length is usually restricted to certain consonants and environments. There are very few languages that have initial consonant length; among those that do are Pattani Malay, Chuukese, Moroccan Arabic, a few Romance languages such as Sicilian and Neapolitan, as well as many High Alemannic German dialects, such as that of Thurgovia. Some African languages, such as Setswana and Luganda, also have initial consonant length: it is very common in Luganda and indicates certain grammatical features. In colloquial Finnish and Italian, long consonants occur in specific instances as sandhi phenomena.
The difference between singleton and geminate consonants varies within and across languages. Sonorants show more distinct geminate-to-singleton ratios while sibilants have less distinct ratios. The bilabial and alveolar geminates are generally longer than velar ones.
The reverse of gemination reduces a long consonant to a short one, which is called degemination. It is a pattern in Baltic-Finnic consonant gradation that the strong grade form of the word is degeminated into a weak grade form of the word: taakka > taakan. As a historical restructuring at the phonemic level, word-internal long consonants degeminated in Western Romance languages: e.g. Spanish /ˈboka/ 'mouth' vs. Italian /ˈbokka/, both of which evolved from Latin /ˈbukka/.

Examples

Afroasiatic languages

Arabic

Written Arabic indicates gemination with a diacritic shaped like a lowercase Greek omega or a rounded Latin w, called the shadda (شَدَّة): ّ. Written above the consonant that is to be doubled, the shadda is often used to disambiguate words that differ only in the doubling of a consonant where the word intended is not clear from the context. For example, in Arabic, Form I verbs and Form II verbs differ only in the doubling of the middle consonant of the triliteral root in the latter form, e.g., درس is a Form I verb meaning 'to study', whereas درّس is the corresponding Form II verb, with the middle consonant doubled, meaning 'to teach'.
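In digital text the shadda is encoded as a separate combining character (U+0651, ARABIC SHADDA) that follows the letter it modifies. The following Python sketch, in which the function name geminated_letters is a hypothetical choice, lists the letters of a word that carry a shadda. It assumes the text is fully pointed; as noted above, the diacritic is often omitted in practice.

```python
import unicodedata

SHADDA = "\u0651"  # ARABIC SHADDA, a combining mark

def geminated_letters(word: str) -> list[str]:
    """Return the base letters in `word` that carry a shadda."""
    result = []
    base = None
    for ch in word:
        if ch == SHADDA:
            if base is not None:
                result.append(base)
        elif unicodedata.combining(ch) == 0:
            # Not a combining mark, so this is a new base letter.
            base = ch
        # Other diacritics (short vowels etc.) are skipped.
    return result

form_1 = "\u062f\u0631\u0633"        # d-r-s without shadda (Form I)
form_2 = "\u062f\u0631\u0651\u0633"  # d-r-s with shadda on r (Form II)
print(geminated_letters(form_1))  # []
print(geminated_letters(form_2))  # the single letter r
```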

Berber

In Berber, each consonant has a geminate counterpart, and gemination is lexically contrastive. The distinction between single and geminate consonants is attested in medial position as well as in absolute initial and final positions.
  • 'say'
  • 'those in question'
  • 'earth, soil'
  • 'loss'
  • 'mouth'
  • 'mother'
  • 'hyena'
  • 'he was quiet'
  • 'pond, lake, oasis'
  • 'brown buzzard, hawk'
In addition to lexical geminates, Berber also has phonologically-derived and morphologically-derived geminates. Phonological alternations can surface by concatenation or by complete assimilation. Morphological alternations include imperfective gemination, with some Berber verbs forming their imperfective stem by geminating one consonant in their perfective stem, as well as quantity alternations between singular and plural forms.

Hebrew

In Biblical Hebrew, all consonants except the gutturals and resh can receive a dagesh ḥazak, a dot placed inside the letter that marks it as functionally geminated (doubled). This follows the morphological and grammatical rules of gemination.
In Modern Hebrew:
  • in writing, the same rules of gemination apply, although the resultant dagesh ḥazak dots are only visible in pointed texts where diacritics are written: in texts for children or new immigrants to Israel, or in specialized texts like dictionaries and poetry. Like all Hebrew diacritics, they are still functionally present even when they are hidden in unpointed texts. As a result, they can still influence pronunciation.
  • in speech, vocal gemination itself is generally not pronounced. However, the rules of gemination, which determine when a dagesh ḥazak is placed inside a letter which is functionally doubled, still influence pronunciation in one respect: if a dagesh ḥazak is present in bet, kaf, and pe, it turns a fricative sound into a plosive sound; in all other letters, it has no influence on the letter's pronunciation.
Example: according to the rules of the dagesh ḥazak, the definite article ha "the" causes the next letter to be functionally geminated: in hakol, the kaf is geminated, and is thus pronounced k. On the other hand, the preposition be "in, with, by" does not cause the next letter to be geminated, so in bekhol "in all, by all", the kaf is not geminated, and is thus pronounced kh. This illustrates how the rules of gemination influence pronunciation in Modern Hebrew. The difference between the geminated and the ungeminated kaf would not be visible in a regular, unpointed text, but the letters would still be pronounced differently.

Austronesian languages

Austronesian languages in the Philippines, Micronesia, and Sulawesi are known to have geminate consonants.

Kavalan

The Formosan language Kavalan makes use of gemination to mark intensity, as in sukaw 'bad' vs. sukkaw 'very bad'.

Malay dialects

Word-initial gemination occurs in various Malay dialects, particularly those found on the east coast of the Malay Peninsula such as Kelantan-Pattani Malay and Terengganu Malay. Gemination in these dialects of Malay occurs for various purposes such as:
  • To form a shortened free variant of a word or phrase so that:
      • buwi > 'give'
      • ke darat > 'to/at/from the shore'
  • A replacement of reduplication for its various uses in Standard Malay so that:
      • budak-budak > 'children'
      • layang-layang > 'kite'

Tuvaluan

The Polynesian language Tuvaluan allows for word-initial geminates, such as mmala 'overcooked'.

Indo-European languages

English

In English phonology, consonant length is not distinctive within root words. For instance, baggage is pronounced, not. However, phonetic gemination does occur marginally.
Gemination is found across words and across morphemes when the last consonant in a given word and the first consonant in the following word are the same fricative, nasal, or stop.
For instance:
  • b: subbasement
  • d: midday
  • f: life force
  • g: egg girl
  • k: bookkeeper
  • l: wholly
  • m: calm man or roommate or prime minister
  • n: evenness
  • p: lamppost
  • r: interregnum or fire road
  • s: misspell or this saddle
  • sh: fish shop
  • t: cat tail
  • th: both thighs
  • v: live voter
  • z: pays zero
With affricates, however, this does not occur. For instance:
  • orange juice
In most instances, the absence of this doubling does not affect the meaning, though it may confuse the listener momentarily. The following minimal pairs represent examples where the doubling does affect the meaning in most accents:
  • ten nails versus ten ales
  • this sin versus this inn
  • five valleys versus five alleys
  • his zone versus his own
  • mead day versus me-day
  • unnamed versus unaimed
  • forerunner versus foreigner
Note that wherever a doubled r appears in the examples above, non-rhotic dialects of English do not have the gemination, but rather lengthen the preceding vowel.
In some dialects gemination is also found for some words when the suffix -ly follows a root ending in /l/, as in:
  • solely
but not
  • usually
In some varieties of Welsh English, the process takes place indiscriminately between vowels, e.g. in money, but it also applies with graphemic duplication, e.g. butter.

French

In French, gemination is usually not phonologically relevant and therefore does not allow words to be distinguished. It mostly corresponds to an accent of insistence or to hypercorrection, in which speakers "correct" their pronunciation, despite the usual phonology, to bring it closer to a realization that they imagine to be more correct. Thus, the word illusion is sometimes pronounced with a geminate l under the influence of the spelling.
However, gemination is contrastive in a few cases. Some words, such as netteté and verrerie, are generally pronounced with a silent e following the double consonant, resulting in a pronunciation that reflects the gemination. Statements such as elle a dit and elle l'a dit can commonly be distinguished by gemination. In a more sustained pronunciation, gemination distinguishes the conditional from the imperfect: courrais 'would run' vs. courais 'ran'; or the indicative from the subjunctive: croyons 'we believe' vs. croyions 'we believed'.

Greek

In Ancient Greek, consonant length was distinctive, e.g., μέλω 'I am of interest' vs. μέλλω 'I am going to'. The distinction has been lost in the standard and most other varieties, with the exception of Cypriot, some varieties of the southeastern Aegean, and Italy.

Hindustani

Gemination is common in both Hindi and Urdu. It does not occur after long vowels and is found in words of both Indic and Arabic origin, but not in those of Persian origin. In Urdu, gemination is represented by the shadda diacritic, which is usually omitted in writing and mainly written to resolve ambiguity. In Hindi, gemination is represented by doubling the geminated consonant, with the two copies joined by the virama diacritic.
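Transliteration | Hindi | Urdu | Meaning | Etymology
pattā | पत्ता | پَتَّہ | 'leaf' | Sanskrit
abbā | अब्बा | اَبّا | 'father' | Arabic
dajjāl | दज्जाल | دَجّال | 'anti-christ' | Arabic
ḍabbā | डब्बा | ڈَبَّہ | 'box' | Sanskrit
jannat | जन्नत | جَنَّت | 'heaven' | Arabic
gaddā | गद्दा | گَدّا | 'mattress' | Sanskrit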
Aspirated consonants
Gemination of aspirated consonants in Hindi is formed by combining the corresponding non-aspirated consonant with its aspirated counterpart. In vocalised Urdu, the shadda is placed on the unaspirated consonant followed by the short vowel diacritic, followed by the do-cashmī hē, which aspirates the preceding consonant. There are few examples where an aspirated consonant is truly doubled.
Transliteration | Hindi | Urdu | Meaning
patthar | पत्थर | پَتَّھر | 'stone'
katthā | कत्था | کَتَّھا | brown spread on
gaḍḍhā | गड्ढा | گڑھا | 'pit'
makkhī | मक्खी | مَکِّھی | 'fly'

Italian

Italian is notable among the Romance languages for its extensive geminate consonants. In Standard Italian, word-internal geminates are usually written with two consonants, and geminates are distinctive. For example, bevve 'he/she drank' contrasts with beve 'he/she drinks'. Tonic syllables are bimoraic and are therefore composed of either a long vowel in an open syllable or a short vowel in a closed syllable. In varieties with post-vocalic weakening of some consonants, geminates are not affected.
Double or long consonants occur not only within words but also at word boundaries, where they are pronounced but not necessarily written: chi + sa = chissà, and vado a casa. All consonants except /z/ can be geminated. This word-initial gemination is triggered either lexically by the item preceding the lengthening consonant, or by any word-final stressed vowel.

Latin

In Latin, consonant length was distinctive, as in anus 'old woman' vs. annus 'year'. Vowel length was also distinctive in Latin until about the fourth century, and was often reflected in the orthography with an apex. Geminates inherited from Latin still exist in Italian, in which anno and ano contrast with regard to /nn/ and /n/ as in Latin. Gemination has been almost completely lost in French and completely lost in Romanian. In West Iberian languages, former Latin geminate consonants often evolved into new phonemes, including some instances of nasal vowels in Portuguese and Old Galician as well as most cases of /ɲ/ and /ʎ/ in Spanish, but, with the possible exception of /r/ and /ɾ/ in Spanish, phonetic length of consonants and vowels is no longer distinctive.

Nepali

In Nepali, all consonants have geminate counterparts except for. Geminates occur only medially. Examples:
  • समान – 'equal' ; सम्मान – 'honour'
  • सता – 'disturb!' ; सत्ता – 'authority'
  • पका – 'cook!' ; पक्का – 'certain'

Norwegian

In Norwegian, gemination is indicated in writing by double consonants. Gemination often differentiates between unrelated words. As in Italian, Norwegian uses short vowels before doubled consonants and long vowels before single consonants. There are qualitative differences between short and long vowels:
  • måte / måtte – 'method' / 'must'
  • lete / lette – 'to search' / 'to take off'
  • sine / sinne – 'theirs' / 'anger'

Polish

A specific feature of Polish is the almost exclusive occurrence of true gemination. Doubled letters are pronounced with rearticulation, as two separate sounds with a short pause; this applies to both consonants and vowels. However, geminates may also be pronounced as single long sounds if this does not change the meaning. Geminates are typically 1.5 to 3 times longer than single sounds. In rearticulated geminates, each sound has the same length as a single one. Vowels before or after geminates do not differ in length from ordinary ones.
Examples:
  • wanna – 'bathtub'
  • Anna
  • horror – 'horror'
  • hobby or – 'hobby'
Consonant length is distinctive and sometimes is necessary to distinguish words:
  • rodziny – 'families'; rodzinny – 'familial'
  • saki – 'sacks, bags'; ssaki – 'mammals'
  • leki – 'medicines'; lekki – 'light, lightweight'
Double consonants are common on morpheme borders where the initial or final sound of the suffix is the same as the final or initial sound of the stem, after devoicing. Examples:
  • przedtem – 'before, previously'; from przed + tem
  • oddać – 'give back'; from od + dać
  • bagienny – 'swampy'; from bagno + ny
  • najjaśniejszy – 'brightest'; from naj + jaśniejszy

Punjabi

Punjabi is written in two scripts, namely Gurmukhi and Shahmukhi. Both scripts indicate gemination through the use of diacritics. In Gurmukhi the diacritic is called the addak (ੱ); it is written before the geminated consonant and is mandatory. In contrast, the shadda, which represents gemination in the Shahmukhi script and is written above the geminated consonant, is not necessarily written, retaining the tradition of the original Arabic script and Persian language, where diacritics are usually omitted from writing except to resolve ambiguity. In the case of aspirated consonants in the Shahmukhi script, the shadda remains on the consonant, not on the do-cashmī he.
Gemination is especially characteristic of Punjabi compared to other Indo-Aryan languages like Hindi-Urdu, where, instead of consonant lengthening, the preceding vowel tends to be lengthened. Consonant length is distinctive in Punjabi.

Russian

In Russian, consonant length may occur in several situations.
Minimal pairs exist, such as подержать 'to hold' vs. поддержать 'to support', and their conjugations, or длина 'length' vs. длинна 'long' (adj. f.).
  • Word formation or conjugation: длина > длинный. This occurs when two adjacent morphemes have the same consonant and is comparable to the situation of Polish described above.
  • Assimilation. The spelling usually reflects the unassimilated consonants, but they are pronounced as a single long consonant, as in высший.

Spanish

There are phonetic geminate consonants in Caribbean Spanish due to the assimilation of /l/ and /ɾ/ in syllabic coda to the following consonant. Examples of Cuban Spanish:

Luganda

Luganda is unusual in that gemination can occur word-initially, as well as word-medially. For example, kkapa 'cat', jjajja 'grandfather' and nnyabo 'madam' all begin with geminate consonants.
There are three consonants that cannot be geminated: y, w and l. Whenever morphological rules would geminate these consonants, y and w are prefixed with gg, and l changes to dd. For example:
  • -ye 'army' > ggye 'an army'
  • -yinja 'stone' > jjinja 'a stone' ; jj is usually spelt ggy
  • -wanga 'nation' > ggwanga 'a nation'
  • -lagala 'medicine' > ddagala 'medicine'

Japanese

In Japanese, consonant length is distinctive. Gemination in the syllabary is represented with the sokuon, a small tsu: っ for hiragana in native words and ッ for katakana in foreign words. For example, 来た means 'came; arrived', while 切った means 'cut; sliced'. With the influx of gairaigo into Modern Japanese, voiced consonants have become able to geminate as well: バグ means 'bug', and バッグ means 'bag'. The distinction between voiceless gemination and voiced gemination is visible in pairs of words such as キット and キッド. In addition, in some variants of colloquial Modern Japanese, gemination may be applied to some adjectives and adverbs in order to add emphasis: すごい contrasts with すっごい; 思い切り contrasts with 思いっ切り.
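The way the sokuon signals gemination can be illustrated with a small romanization sketch in Python. The kana table below is a deliberately tiny, hypothetical subset covering only the kana readings of the examples above (きた for 来た, きった for 切った, and the katakana loanwords); a real converter would need the full syllabary and would raise no KeyError on unknown characters.

```python
# Illustrative subset only; not a complete kana-to-romaji table.
ROMAJI = {
    "き": "ki", "た": "ta", "す": "su", "ご": "go", "い": "i",
    "バ": "ba", "グ": "gu", "キ": "ki", "ト": "to", "ド": "do",
}
SOKUON = {"っ", "ッ"}  # small tsu in hiragana and katakana

def romanize(kana: str) -> str:
    """Romanize kana, doubling the consonant that follows a sokuon."""
    out = []
    pending = False
    for ch in kana:
        if ch in SOKUON:
            pending = True
            continue
        syllable = ROMAJI[ch]
        if pending:
            # The sokuon lengthens the first consonant of the next syllable.
            out.append(syllable[0])
            pending = False
        out.append(syllable)
    return "".join(out)

print(romanize("きた"))    # kita
print(romanize("きった"))  # kitta
print(romanize("バグ"))    # bagu
print(romanize("バッグ"))  # baggu
```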

Turkic languages

Turkish

In Turkish gemination is indicated by two identical letters as in most languages that have phonemic gemination.
  • anne "mother"
  • hürriyet "freedom"
Loanwords originally ending with a phonemic geminated consonant are always written and pronounced without the final gemination that they have in Arabic.
  • hac
  • hat
However, gemination is restored when the word takes a suffix.
  • hac becomes hacca when it takes the suffix "-a"
  • hat becomes hattın when it takes the suffix "-ın"
Gemination also occurs when a suffix starting with a consonant comes after a word that ends with the same consonant.
  • el + -ler = eller
  • at + -tık = attık

Dravidian languages

Malayalam

In Malayalam, compounding is phonologically conditioned by a process called sandhi, and gemination occurs at word boundaries. Gemination sandhi is called dvitva sandhi or 'doubling sandhi'.
Consider the following example:
  • മേശ + പെട്ടി – മേശപ്പെട്ടി
Gemination also occurs in a single morpheme like കള്ളം which has a different meaning from കളം.

Tamil

In Tamil, "otru" can occur when two words combine in a certain meaning. This otru is generally one of க், ச், த் and ப், and occurs when one of these four consonants is the first letter of the second word. For example:
  • மர + கட்டை – மரக்கட்டை
  • மர + சால் – மரச்சால்
  • மர + தடி – மரத்தடி
  • மர + பெட்டி – மரப்பெட்டி
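The doubling shown in these four compounds can be sketched as a simple string operation. In Unicode, a bare Tamil consonant such as க் is the consonant letter followed by the pulli (U+0BCD), so the otru is the first letter of the second word plus a pulli. The function name join_with_otru is hypothetical, and the sketch deliberately ignores the semantic condition mentioned above (the otru appears only when the words combine "in a certain meaning"); it applies the doubling whenever the second word begins with one of the four consonants.

```python
PULLI = "\u0bcd"  # TAMIL SIGN VIRAMA (pulli), removes the inherent vowel
TRIGGERS = {
    "\u0b95",  # ka
    "\u0b9a",  # ca
    "\u0ba4",  # ta
    "\u0baa",  # pa
}

def join_with_otru(first: str, second: str) -> str:
    """Join two words, inserting an otru if the second starts with a trigger."""
    if second and second[0] in TRIGGERS:
        # Insert a vowelless copy of the initial consonant of the second word.
        return first + second[0] + PULLI + second
    return first + second

print(join_with_otru("மர", "கட்டை"))  # மரக்கட்டை
```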
Gemination also occurs in nouns that end in டு and று when they are part of a grammatical case. For example:
  • வீடு + வரி => வீட்டுவரி
  • சேறு + வயல் => சேற்றுவயல்
Gemination also occurs in a single morpheme like கள்ளம் which has a different meaning from களம். More examples where words without and with gemination consonants have different meanings:
  • கறை - காற்றை
  • தடை - தட்டை
  • தொலை - தொல்லை

Uralic languages

Sámi languages

Many Sámi languages have gemination as a phonetic feature. The Proto-Sami language had as many as four different lengths, although there is only one living language where this is attested: certain dialects of Ume Sámi. Most varieties have merged them to two or three contrastive degrees of length.

Estonian

Estonian has three phonemic lengths; however, the third length is a suprasegmental feature, which is as much tonal patterning as a length distinction. It is traceable to allophony caused by now-deleted suffixes, for example half-long linna < *linnan 'of the city' vs. overlong linna < *linnaan < *linnahen 'to the city'.

Finnish

Consonant length is phonemic in Finnish, for example takka 'fireplace' and taka 'back'. Consonant gemination occurs with simple consonants and between syllables in the pattern -vowel-sonorant-stop-stop-vowel but not generally in codas or with longer syllables. Sandhi often produces geminates.
Both consonant and vowel gemination are phonemic, and both occur independently, e.g. Mali, maali, malli, maallinen.
In Standard Finnish, consonant gemination of h exists only in interjections, new loan words and in the playful word hihhuli, with its origins in the 19th century, and derivatives of that word.
In many Finnish dialects there are also the following types of special gemination in connection with long vowels: the southwestern special gemination, with lengthening of stops and shortening of the long vowel, of the type leipää > leippä; the common gemination, with lengthening of all consonants in short, stressed syllables, of the type putoaa > puttoo, and its extension; and the eastern dialectal special gemination, which is the same as the common gemination but also applies to unstressed syllables and certain clusters, of the types lehmiä > lehmmii and maksetaan > maksettaan.

Wagiman

In Wagiman, an indigenous Australian language, consonant length in stops is the primary phonetic feature that differentiates fortis and lenis stops. Wagiman does not have phonetic voice. Word-initial and word-final stops never contrast for length.

Writing

In written language, consonant length is often indicated by writing a consonant twice, but can also be indicated with a special symbol, such as the shadda in Arabic, the dagesh in Classical Hebrew, or the sokuon in Japanese.
In the International Phonetic Alphabet, long consonants are normally written using the triangular colon ː, e.g. penne, though doubled letters are also used.
  • Catalan uses the raised dot to distinguish a geminated l from a palatal ll. Thus, paral·lel and Ramon Llull.
  • Estonian uses b, d, g for short consonants, and p, t, k and pp, tt, kk are used for long consonants.
  • Hungarian digraphs and trigraphs are geminated by doubling the first letter only, thus the geminate form of sz is ssz, and that of dzs is ddzs.
  • The only digraph in Ganda, ny is doubled in the same way: nny.
  • In Italian, geminated instances of the sound cluster /kw/ are always indicated by writing cq, except in the words soqquadro and beqquadro, where the letter q is doubled. The gemination of the sounds /ɲ/, /ʃ/ and /ʎ/ (spelled gn, sc(i) and gl(i), respectively) is not indicated because these consonants are always geminated when occurring between vowels. The sounds /ts/ and /dz/ (both spelled z) are also always geminated when occurring between vowels, yet their gemination is sometimes shown, redundantly, by doubling the z, as in pizza.
  • In Japanese, non-nasal gemination is denoted by placing the "small" variant of the syllable Tsu between two syllables, where the end syllable must begin with a consonant. For nasal gemination, precede the syllable with the letter for mora N. The script of these symbols must match with the surrounding syllables.
  • In Swedish and Norwegian, the general rule is that a geminated consonant is written double, unless succeeded by another consonant. Hence hall, but halt. In Swedish, this does not apply to morphological changes. Exceptions are some words ending in -m, thus hem and stam, but lamm 'lamb', to distinguish the word from lam, as well as adjectives in -nn, so tunn 'thin' but tunt 'thinly' (while Norwegian has a rule always prohibiting two "m"s at the end of a word).
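The Hungarian convention in the list above (and the parallel treatment of Ganda ny) amounts to repeating only the first letter of a multi-letter grapheme. A minimal Python sketch, with the hypothetical function name geminate_grapheme, could look like this; it assumes the caller has already split the word into graphemes, which is not always trivial in real text.

```python
def geminate_grapheme(grapheme: str) -> str:
    """Write the geminate of a letter, digraph or trigraph.

    Only the first letter is doubled, so sz becomes ssz rather than szsz.
    """
    if not grapheme:
        return grapheme
    return grapheme[0] + grapheme

print(geminate_grapheme("sz"))   # ssz
print(geminate_grapheme("dzs"))  # ddzs
print(geminate_grapheme("ny"))   # nny
print(geminate_grapheme("t"))    # tt
```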

Double letters that are not long consonants

Doubled orthographic consonants do not always indicate a long phonetic consonant.