Digraph (orthography)
A digraph or digram is a pair of characters used in the orthography of a language to write either a single phoneme, or a sequence of phonemes that does not correspond to the normal values of the two characters combined.
Some digraphs represent phonemes that cannot be represented with a single character in the writing system of a language, like ch in Spanish chico and ocho. Other digraphs represent phonemes that can also be represented by single characters. A digraph that shares its pronunciation with a single character may be a relic from an earlier period of the language when the digraph had a different pronunciation, or may represent a distinction that is made only in certain dialects, like the English wh. Some such digraphs are used for purely etymological reasons, like ph in French.
In some orthographies, a digraph is considered to constitute a letter, which means that it has its own place in the alphabet and cannot be separated into its constituent graphemes for purposes of sorting, abbreviating, or hyphenating words. Digraphs are used in some romanization schemes, e.g. zh as a romanization of Russian ж.
The capitalization of digraphs can vary, e.g. sz in Polish is capitalized Sz, and kj in Norwegian is capitalized Kj, while ij in Dutch is capitalized IJ and word-initial dt in Irish is capitalized dT.
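The variation in capitalization can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the Dutch digraph ij is uppercased as a unit while other languages capitalize only the first letter; the table `FULLY_CAPITALIZED`, the language codes and the function `capitalize_word` are hypothetical names, not part of any library.

```python
# Hypothetical table: per language code, the digraphs whose capitalized
# form uppercases BOTH letters (assumption: only Dutch "ij" is listed).
FULLY_CAPITALIZED = {"nl": ("ij",)}

def capitalize_word(word, lang):
    """Capitalize a word, uppercasing a whole initial digraph where listed."""
    for digraph in FULLY_CAPITALIZED.get(lang, ()):
        if word.lower().startswith(digraph):
            return digraph.upper() + word[len(digraph):]
    # Default behaviour: only the first character is uppercased.
    return word[:1].upper() + word[1:]

print(capitalize_word("ijsland", "nl"))
print(capitalize_word("szkoła", "pl"))
```

A table keyed by language keeps the exceptional cases explicit, so adding another orthography's rule would not require changing the function itself.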
Digraphs may also develop into ligatures, but the two concepts are distinct: a digraph's essential feature is its sound, while a ligature is visual, graphically fusing two characters into one, e.g. when o and e become œ, as in French cœur "heart".
Homogeneous digraph
Digraphs may consist of two different characters or two instances of the same character. In the latter case, they are generally called double letters.
Doubled vowel letters are commonly used to indicate a long vowel sound. This is the case in Finnish and Estonian, for instance, where uu represents a longer version of the vowel denoted by u, ää represents a longer version of the vowel denoted by ä, and so on. In Middle English, the sequences ee and oo were used in a similar way, to represent lengthened "e" and "o" sounds respectively; both spellings have been retained in modern English orthography, but the Great Vowel Shift and other historical sound changes mean that the modern pronunciations are quite different from the original ones.
Doubled consonant letters can also be used to indicate a long or geminated consonant sound. In Italian, for example, consonants written double are pronounced longer than single ones. This was the original use of doubled consonant letters in Old English, but during the Middle English and Early Modern English period, phonemic consonant length was lost and a spelling convention developed in which a doubled consonant serves to indicate that a preceding vowel is to be pronounced short. In modern English, for example, the pp of tapping differentiates the first vowel sound from that of taping. In rare cases, doubled consonant letters represent a true geminate consonant in modern English; this may occur when two instances of the same consonant come from different morphemes, for example in unnatural or in cattail.
In some cases, the sound represented by a doubled consonant letter is distinguished in some other way than length from the sound of the corresponding single consonant letter:
- In Welsh and Greenlandic, ll stands for a voiceless lateral consonant, while in Spanish and Catalan it stands for a palatal consonant.
- In several languages of western Europe, including English, French, Portuguese and Catalan, the digraph ss is used between vowels to represent the voiceless sibilant, since an s alone between vowels normally represents the voiced sibilant.
- In Spanish, Portuguese, Catalan and Basque, rr is used between vowels for the alveolar trill, since an r alone between vowels represents an alveolar flap.
- In Spanish, the digraph nn formerly indicated /ɲ/; it developed into the letter ñ.
- In Basque, double consonant letters generally mark palatalized versions of the single consonant letter, as in dd, ll, tt. However, rr is a trill that contrasts with the single-letter flap, as in Spanish, and the palatal version of n is written ñ.
Pan-dialectical digraphs
Some languages have a unified orthography with digraphs that represent distinct pronunciations in different dialects. For example, in Breton there is a digraph zh that represents [z] in most dialects, but [h] in Vannetais. Similarly, the Saintongeais dialect of French has a digraph jh that represents [h] in words that correspond to [ʒ] in standard French. Similarly, Catalan has a digraph ix that represents [ʃ] in Eastern Catalan, but [jʃ] or [js] in Western Catalan–Valencian.
Split digraphs
The pair of letters representing a phoneme is not always adjacent. This is the case with English silent e. For example, the sequence a_e has the sound /eɪ/ in English cake. This is the result of three historical sound changes: cake was originally /kakə/, the open syllable /ka/ came to be pronounced with a long vowel /kaː/, and later the final schwa dropped off, leaving /kaːk/. Later still, the vowel /aː/ became /eɪ/. There are six such digraphs in English: a_e, e_e, i_e, o_e, u_e and y_e.
However, alphabets may also be designed with discontinuous digraphs. In the Tatar Cyrillic alphabet, for example, the letter ю is used to write both /ju/ and /jy/. Usually the difference is evident from the rest of the word, but when it is not, the sequence ю...ь is used for /jy/, as in юнь 'cheap'.
The Indic alphabets are distinctive for their discontinuous vowels, such as Thai เ...อ in เกอ. Technically, however, they may be considered diacritics, not full letters; whether they are digraphs is thus a matter of definition.
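As a rough illustration of the English silent-e pattern described above, the following Python sketch flags words whose spelling ends in a vowel letter, a single consonant letter and a final e. It is an orthographic heuristic only: the pattern and the function name `find_split_digraph` are hypothetical, and the check cannot tell whether the final e is actually silent (a word such as recipe would be flagged incorrectly).

```python
import re

# Assumed pattern: one vowel letter, exactly one consonant letter, word-final "e".
# [^aeiouy\W\d_] matches any letter that is not one of the listed vowel letters.
_SPLIT_DIGRAPH = re.compile(r"([aeiouy])[^aeiouy\W\d_]e$")

def find_split_digraph(word):
    """Return a label such as 'a_e' if the spelling fits the pattern, else None."""
    match = _SPLIT_DIGRAPH.search(word.lower())
    return f"{match.group(1)}_e" if match else None

for sample in ("cake", "theme", "cat", "tree"):
    print(sample, find_split_digraph(sample))
```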
Ambiguous letter sequences
Some letter pairs are not digraphs but might be interpreted as digraphs because of compounding, e.g. sh in hogshead and oo in cooperate. In English, such sequences are usually unmarked and must therefore be memorized, or more likely deduced, as exceptions. Some authors, however, indicate the boundary either by breaking up the apparent digraph with a hyphen, as in hogs-head, co-operate, or, in the case of a vowel hiatus, with a diaeresis, as in coöperate. When such a sequence occurs in names such as Clapham, Townshend, and Hartshorne, it is never marked in any way. Positional alternative glyphs may help to disambiguate in certain cases: when round s was used as a final variant of long ſ, the English digraph for /ʃ/ would always be ſh.
Similar ambiguity also occurs frequently in German, where it is also unmarked and left to the reader to deduce.
In Romansh, a hyphen is used to distinguish s-ch from sch.
In Dutch, a diaeresis is frequently used to separate vowel letters that would otherwise be read as a digraph.
In romanization of Japanese, the constituent sounds are usually indicated by digraphs, but some are indicated by a single letter, and some with a trigraph. The case of ambiguity is the syllabic ん, which is written as n, except before vowels or y where it is followed by an apostrophe as n’. For example, the given name じゅんいちろう is romanized as Jun’ichirō, so that it is parsed as "Ju-n-i-chi-rou", rather than as "Ju-ni-chi-rou". A similar use of the apostrophe is seen in pinyin where 嫦娥 is written Chang'e because the g belongs to the final of the first syllable, not to the initial of the second syllable. Without the apostrophe, Change would be understood as the syllable chan followed by the syllable ge.
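The apostrophe rule for syllabic n can be sketched in Python as follows. This is a minimal illustration, not a full romanization routine: it assumes the word has already been split into romanized units, with the syllabic ん given as a standalone "n", and it does not produce macrons. The function name `join_morae` and the input format are hypothetical.

```python
VOWELS_OR_Y = set("aiueoy")

def join_morae(morae):
    """Join romanized units, adding an apostrophe after a syllabic 'n'
    whenever the next unit starts with a vowel letter or y."""
    pieces = []
    for index, mora in enumerate(morae):
        pieces.append(mora)
        following = morae[index + 1] if index + 1 < len(morae) else ""
        if mora == "n" and following and following[0] in VOWELS_OR_Y:
            pieces.append("'")
    return "".join(pieces)

# The two parses discussed above yield different spellings.
print(join_morae(["ju", "n", "i", "chi", "ro", "u"]))
print(join_morae(["ju", "ni", "chi", "ro", "u"]))
```

Without the apostrophe, both inputs would collapse to the same string, which is exactly the ambiguity the convention is meant to avoid.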
In alphabetization
In some languages, certain digraphs and trigraphs are counted as distinct letters in themselves, and assigned to a specific place in the alphabet, separate from that of the sequence of characters that composes them, for purposes of orthography and collation:
- In Gaj's Latin alphabet used to write Serbo-Croatian, the digraphs dž, lj and nj, which correspond to the single Cyrillic letters џ, љ and њ, are treated as distinct letters.
- In the Czech and Slovak alphabets, ch is treated as a distinct letter, coming after h in the alphabet. Also, in the Slovak alphabet the relatively rare digraphs dz and dž are treated as distinct letters.
- In the Danish and Norwegian alphabets, the former digraph aa, where it appears in older names, is sorted as if it were the letter å, which replaced it.
- In the Norwegian alphabet, there are several digraphs and letter combinations representing an isolated sound.
- In the Dutch alphabet, the digraph ij is sometimes written as a ligature and may be sorted with y; however, regardless of where it is used, when a Dutch word starting with ij is capitalized, the entire digraph is capitalized. Other Dutch digraphs are never treated as single letters.
- In Hungarian, the digraphs cs, dz, gy, ly, ny, sz, ty, zs, and the trigraph dzs, have their own places in the alphabet.
- In Spanish, the digraphs ch and ll were formerly treated as distinct letters, but are now split into their constituent letters.
- In Welsh, the alphabet includes the digraphs ch, dd, ff, ng, ll, ph, rh, th. However, mh, nh and ngh, which represent mutated voiceless consonants, are not treated as distinct letters.
- In romanizations of several Slavic languages written in the Cyrillic script, letters like ш, ж and ю may be written as sh, zh and yu; however, some romanization schemes use a letter with a diacritic instead of a digraph.
- In Maltese, two digraphs are used: ie, which comes right after i, and għ, which comes right after g.
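Treating a digraph as a letter for collation can be sketched in Python as below. The example assumes a simplified Czech-style alphabet in which ch is a single letter placed after h; letters with diacritics are deliberately omitted, and the names `ALPHABET`, `tokenize` and `sort_key` are hypothetical. A production system would normally rely on a locale-aware collation library instead.

```python
# Simplified alphabet (assumption): plain Latin letters plus "ch" after "h".
ALPHABET = list("abcdefgh") + ["ch"] + list("ijklmnopqrstuvwxyz")
RANK = {letter: position for position, letter in enumerate(ALPHABET)}
MAX_LEN = max(len(letter) for letter in ALPHABET)

def tokenize(word):
    """Split a word into alphabet letters, preferring the longest match."""
    tokens = []
    i = 0
    while i < len(word):
        for size in range(MAX_LEN, 0, -1):
            chunk = word[i:i + size]
            if chunk in RANK:
                tokens.append(chunk)
                i += len(chunk)
                break
        else:
            raise ValueError(f"unsupported character in {word!r}")
    return tokens

def sort_key(word):
    """Map a word to a list of alphabet positions for use with sorted()."""
    return [RANK[token] for token in tokenize(word.lower())]

words = ["chata", "cena", "hrad", "ikona"]
print(sorted(words))                # plain code-point order
print(sorted(words, key=sort_key))  # "ch" sorted as a letter after "h"
```

The greedy longest-match step treats every c followed by h as the digraph, so a sequence that merely looks like a digraph across a compound boundary would be misread; handling such cases would need dictionary information that this sketch does not have.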
Examples
Latin script
English
English has both homogeneous digraphs and heterogeneous digraphs. Those of the latter type include the following:
- sc normally represents /s/ or /ʃ/ before e or i.
- ng can represent /ŋ/ as in thing.
- ch usually corresponds to /tʃ/, to /k/ when used as an etymological digraph in words of Greek origin, and less commonly to /ʃ/ in words of French origin.
- ck corresponds to /k/ as in check.
- gh represents /ɡ/ at the beginning of words, and represents /f/ or is silent at the end of words.
- ph represents /f/, as in siphon.
- rh represents English /r/ in words of Greek origin, such as rhythm.
- sh represents /ʃ/, as in sheep.
- ti usually represents /ʃ/ word-medially before a vowel, as in education.
- th usually corresponds to /θ/ in thin or /ð/ in then. See also Pronunciation of English th.
- wh represents /hw/ in some conservative dialects; /w/ in other dialects; and /h/ in a few words in which it is followed by o, such as who and whole. See also Phonological history of wh.
- zh represents /ʒ/ in words transliterated from Slavic languages, and in American dictionary pronunciation spelling.
- ci usually appears as /ʃ/ before vowels, as in facial and artificial. Otherwise it is /si/ as in fancier and icier or /sɪ/ as in acid and rancid.
- wr represents /r/. Originally, it stood for a labialized sound, while r without w was non-labialized, but the distinction has been lost in most dialects, the two sounds merging into a single alveolar approximant, allophonically labialized at the start of syllables, as in red. See also rhotic consonant.
- qu usually represents /kw/; q is conventionally followed by u and a vowel letter as in quick, with some exceptions.