Orthographic depth

The orthographic depth of an alphabetic orthography indicates the degree to which a written language deviates from simple one-to-one letter–phoneme correspondence. It depends on how easy it is to predict the pronunciation of a word based on its spelling: shallow orthographies are easy to pronounce based on the written word, and deep orthographies are difficult to pronounce based on how they are written.
In shallow orthographies, the spelling-sound correspondence is direct: given the rules of pronunciation, a reader can pronounce a word correctly from its spelling. That is to say, shallow orthographies, also called phonemic orthographies, have a one-to-one relationship between their graphemes and phonemes, and the spelling of words is very consistent. Examples include Japanese kana, Hindi, Lao, Spanish, Finnish, Turkish, Georgian, Latin, Italian, Serbo-Croatian, Ukrainian, and Welsh.
In contrast, in deep orthographies, the relationship is less direct, and the reader must learn the arbitrary or unusual pronunciations of irregular words. Deep orthographies are writing systems that do not have a consistent one-to-one correspondence between sounds and the letters or characters that represent them. Instead, spellings tend to reflect etymology and/or historic pronunciation. Examples include English, Danish, Swedish, Faroese, Chinese, Tibetan, Mongolian, Thai, Khmer, Burmese, Lao, French, and Franco-Provençal.
Orthographies such as those of German, Hungarian, Portuguese, modern Greek, Icelandic, Korean, Tamil, and Russian are considered to be of intermediate depth as they include many morphophonemic features.

By language

Korean represents an unusual hybrid: each phoneme in the language is represented by a letter, but the letters are packaged into "square" units of two to four phonemes, each square representing a syllable. Korean has very complex phonological variation rules, especially regarding the consonants rather than the vowels, in contrast to English. For example, the actual pronunciation of the Korean word 훗일 differs from what the standard pronunciations of its component letters would suggest. Among the consonants of the Korean language, only one is always pronounced exactly as it is written.
Italian, whose orthography is shallow overall, offers clear examples of differential directionality in depth. Even in a very shallow orthographic system, spelling-to-pronunciation and pronunciation-to-spelling may not be equally clear. There are two major imperfect matches of vowels to letters: in stressed syllables, e can represent either open [ɛ] or closed [e], and o stands for either open [ɔ] or closed [o]. According to the orthographic principles used for the language, 'sect', for example, with open [ɛ] can be spelled only setta, and 'summit' with closed [e] can be spelled only vetta: a listener who can hear the word can spell it. But since the letter e is assigned to represent both [ɛ] and [e], there is no principled way to know whether to pronounce the written words setta and vetta with [ɛ] or [e]; the spelling does not present the information needed for accurate pronunciation. A second lacuna in Italian's shallow orthography is that, although stress position in words is only very partially predictable, it is normally not indicated in writing. For purposes of spelling, it makes no difference which syllable is stressed in the place names Arsoli and Carsoli, but the spellings offer no clue that they are ARsoli and CarSOli.
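The asymmetry between the two directions can be made concrete with a small sketch. The table below is a toy inventory of Italian stressed vowels only (the e and o rows follow the description above; the a, i and u rows are added for completeness), and the function name `ambiguity` is hypothetical. It is an illustration of the idea, not an established measure of orthographic depth.

```python
from collections import defaultdict

# Toy (letter, phoneme) pairs for Italian stressed vowels.
# Assumption: this ignores consonants, stress marks and unstressed vowels.
PAIRS = [
    ("a", "a"),
    ("e", "ɛ"),  # open e, as in setta
    ("e", "e"),  # closed e, as in vetta
    ("i", "i"),
    ("o", "ɔ"),  # open o
    ("o", "o"),  # closed o
    ("u", "u"),
]


def ambiguity(pairs):
    """Return two dicts: letter -> possible phonemes, phoneme -> possible letters."""
    reading = defaultdict(set)   # spelling-to-pronunciation direction
    spelling = defaultdict(set)  # pronunciation-to-spelling direction
    for letter, phoneme in pairs:
        reading[letter].add(phoneme)
        spelling[phoneme].add(letter)
    return dict(reading), dict(spelling)


reading, spelling = ambiguity(PAIRS)

# Letters with more than one possible sound (ambiguous to read).
ambiguous_letters = sorted(l for l, ps in reading.items() if len(ps) > 1)
# Sounds with more than one possible letter (ambiguous to spell).
ambiguous_phonemes = sorted(p for p, ls in spelling.items() if len(ls) > 1)

print(ambiguous_letters)   # ['e', 'o']
print(ambiguous_phonemes)  # []
```

In this toy table every phoneme maps to exactly one letter, so spelling is unambiguous, while the letters e and o each map to two phonemes, so reading is not.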
English has a deep orthography in which many letters and letter combinations have multiple possible sounds. This makes it among the most difficult languages in the world to learn to read. For example, the digraph ea is pronounced differently in words like beat and head, with no visual indication of which sound is intended in which word.
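One way to put a number on this kind of inconsistency is the Shannon entropy of the sounds a grapheme can take. The sketch below is only an illustration: the word list is hand-picked (beat and head come from the text above; meat, bread, great and steak are added here), the phoneme labels are broad transcriptions, and `entropy_bits` is a hypothetical helper name. A real estimate would need a pronunciation dictionary and word frequencies.

```python
import math
from collections import Counter

# Hand-picked words containing the digraph "ea" and the vowel it stands for.
# Assumption: six words chosen for illustration, not a representative sample.
EA_WORDS = {
    "beat": "iː",
    "meat": "iː",
    "head": "ɛ",
    "bread": "ɛ",
    "great": "eɪ",
    "steak": "eɪ",
}


def entropy_bits(pronunciations):
    """Shannon entropy (in bits) of a collection of pronunciations."""
    counts = Counter(pronunciations)
    total = sum(counts.values())
    # Sum of p * log2(1/p); a fully consistent grapheme gives 0.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


ea_entropy = entropy_bits(EA_WORDS.values())
consistent_entropy = entropy_bits(["iː", "iː", "iː"])

print(round(ea_entropy, 3))  # 1.585
print(consistent_entropy)    # 0.0
```

With three equally frequent pronunciations, the toy list gives log2(3) bits of uncertainty for ea, whereas a grapheme that always has the same sound gives zero.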

Orthographic depth hypothesis

According to the orthographic depth hypothesis, shallow orthographies can more easily support a word recognition process that involves the language's phonology. In contrast, deep orthographies encourage a reader to process printed words by referring to their morphology via the printed word's visual-orthographic structure. For languages with relatively deep orthographies, such as English, French, and unvocalised Arabic or Hebrew, new readers have much more difficulty learning to decode words. As a result, children learn to read more slowly. For languages with relatively shallow orthographies, such as Italian and Finnish, new readers have few problems learning to decode words. As a result, children learn to read relatively quickly.
Van den Bosch et al. consider orthographic depth to be the composition of at least two separate components. One of these relates to the complexity of the relations between the elements at the graphemic level to those at the phonemic level, i.e., how difficult it is to convert graphemic strings to phonemic strings. The second component is related to the diversity at the graphemic level, and to the complexity of determining the graphemic elements of a word, i.e., how to align a phonemic transcription to its spelling counterpart.
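The second component, determining the graphemic elements of a word, can be illustrated with a minimal sketch. This is not the method of Van den Bosch et al.; it is a simple greedy longest-match segmenter, and both the function name `segment` and the small inventory of multi-letter graphemes are assumptions made for the example.

```python
# Hypothetical inventory of multi-letter English graphemes (illustration only).
MULTI_LETTER = {"sh", "ch", "th", "ea", "ee", "igh"}


def segment(word, inventory=MULTI_LETTER):
    """Split a spelling into graphemes by greedy longest match."""
    max_len = max(len(g) for g in inventory)
    graphemes = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, down to two letters.
        for size in range(min(max_len, len(word) - i), 1, -1):
            chunk = word[i:i + size]
            if chunk in inventory:
                graphemes.append(chunk)
                i += size
                break
        else:
            # No multi-letter grapheme starts here: take a single letter.
            graphemes.append(word[i])
            i += 1
    return graphemes


print(segment("sheath"))  # ['sh', 'ea', 'th']
print(segment("night"))   # ['n', 'igh', 't']
print(segment("mishap"))  # ['m', 'i', 'sh', 'a', 'p']
```

The last call shows why this step is not trivial: in mishap the letters s and h belong to separate graphemes, but a purely greedy segmenter groups them as sh. Only after the graphemes are identified correctly can the first component, converting graphemes to phonemes, be applied.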
In 2021, Xavier Marjou used an artificial neural network to rank 17 orthographies according to their level of transparency. Among the tested orthographies, Chinese and French, followed by English and Russian, are the most opaque regarding writing, and English, followed by Dutch, is the most opaque regarding reading; Esperanto, Arabic, Finnish, Korean, Serbo-Croatian and Turkish are very shallow both to read and to write; Italian is shallow to read and very shallow to write; Breton, German, Portuguese and Spanish are shallow to read and to write.

Relationship to reading acquisition

Children who learn to read in languages with shallow orthographies, such as Spanish, learn to read faster and at a younger age. People with dyslexia may read more slowly in languages with shallow orthographies, but their word accuracy is as good as that of non-dyslexic readers of the same language.
Children who learn to read in languages with opaque orthographies, such as English, learn to read at an older age. People with dyslexia both read more slowly in languages with opaque orthographies and make more mistakes.