English orthography


English orthography comprises the set of rules used when writing the English language, allowing readers and writers to associate written graphemes with the sounds of spoken English, as well as other features of the language. English's orthography includes norms for spelling, hyphenation, capitalisation, word breaks, emphasis, and punctuation.
As with the orthographies of most other world languages, written English is broadly standardised. This standardisation began to develop when movable type spread to England in the late 15th century. However, unlike in most languages, there are multiple ways to spell nearly every phoneme, and most letters also represent multiple pronunciations depending on their position in a word and the context.
This is partly because English has borrowed a large number of words from many other languages throughout its history without any successful attempt at complete spelling reform, and partly because of accidents of history: some of the earliest mass-produced English publications were typeset by highly trained, multilingual printing compositors, who occasionally used a spelling pattern more typical of another language. For example, the word ghost was spelled gost in Middle English, until the Flemish spelling pattern was unintentionally substituted and happened to be accepted. Most of the spelling conventions in Modern English were derived from the phonemic spelling of a variety of Middle English, and generally do not reflect the sound changes that have occurred since the late 15th century.
Despite the various English dialects spoken from country to country and within different regions of the same country, there are only slight regional variations in English orthography, the two most recognised variations being British and American spelling, and its overall uniformity helps facilitate international communication. On the other hand, it also adds to the discrepancy between the way English is written and spoken in any given location.

Function of letters

Phonemic representation

Letters in English orthography positioned at one location within a specific word usually represent a particular phoneme. For example, at consists of two letters, ⟨a⟩ and ⟨t⟩, which represent /æ/ and /t/, respectively.
Sequences of letters may perform this role as well as single letters. Thus, in thrash, the digraph ⟨th⟩ represents /θ/. In hatch, the trigraph ⟨tch⟩ represents /tʃ/.
Less commonly, a single letter can represent multiple successive sounds. The most common example is ⟨x⟩, which normally represents the consonant cluster /ks/, as in tax.
The same letter or letter sequence may be pronounced differently when occurring in different positions within a word. For instance, ⟨gh⟩ represents /f/ at the end of some words (tough) but not in others (plough). At the beginning of syllables, ⟨gh⟩ is pronounced /ɡ/, as in ghost. Conversely, ⟨gh⟩ is never pronounced /f/ in syllable onsets other than in inflected forms, and is almost never pronounced /ɡ/ in syllable codas.
Some words contain silent letters, which do not represent any sound in modern English pronunciation. Examples include the ⟨l⟩ in talk, half, calf, etc., the ⟨w⟩ in two and sword, the ⟨gh⟩ mentioned above in numerous words such as though, daughter, night, and brought, and the commonly encountered silent ⟨e⟩.

Word origin

Another type of spelling characteristic is related to word origin. For example, when representing a vowel, ⟨y⟩ represents the sound /ɪ/ in some words borrowed from Greek, whereas the letter usually representing this sound in non-Greek words is ⟨i⟩. Thus, myth is of Greek origin, while pith is a Germanic word. However, a large number of Germanic words have ⟨y⟩ in word-final position.
Some other examples are ⟨ph⟩ pronounced /f/, and ⟨ch⟩ pronounced /k/. The use of these spellings for these sounds often marks words that have been borrowed from Greek.
Some researchers, such as Brengelman, have suggested that, in addition to marking word origin, these spellings indicate a more formal level of style or register in a given text. Rollings, however, finds this point exaggerated, since there are many exceptions where a word with one of these spellings, such as ⟨ph⟩ for /f/ in telephone, could occur in an informal text.

Homophone differentiation

Spelling may also help to distinguish homophones in written language, and thus resolve ambiguities that would otherwise arise. However, in most cases the reason for the difference is historical, and it was not introduced to resolve ambiguity.
Examples:
  • heir and air are pronounced identically in most dialects, but spelled differently.
  • pain and pane are both pronounced /peɪn/ but have two different spellings of the vowel /eɪ/. This arose because the two words were originally pronounced differently: pain used to be pronounced as /pain/, with a diphthong, and pane as /paːn/, but the diphthong /ai/ merged with the long vowel /aː/ in pane, making pain and pane homophones. Later /aː/ became a diphthong, /eɪ/.
  • break and brake are both pronounced /breɪk/, but spelled differently.
Nevertheless, many homophones remain that are unresolved by spelling.

Marking sound changes in other letters

Some letters in English provide information about the pronunciation of other letters in the word. Rollings uses the term "markers" for such letters. Letters may mark different types of information.
The letter ⟨e⟩ often marks an altered pronunciation of a preceding vowel. In the pair mat and mate, the ⟨a⟩ of mat has the value /æ/, whereas the ⟨a⟩ of mate is marked by the ⟨e⟩ as having the value /eɪ/. In this context, the ⟨e⟩ is not pronounced, and is referred to as a "silent e".
Similarly, the ⟨e⟩ in once indicates that the preceding ⟨c⟩ is pronounced /s/, rather than the more common value of ⟨c⟩ in word-final position, /k/, as in attic.
A single letter may even fill multiple pronunciation-marking roles simultaneously. For example, in the word ace, ⟨e⟩ marks not only the change of ⟨a⟩ from /æ/ to /eɪ/, but also of ⟨c⟩ from /k/ to /s/. In the word vague, ⟨e⟩ marks the long ⟨a⟩ sound, but ⟨u⟩ keeps the ⟨g⟩ hard rather than soft.
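The marking role of a following letter can be illustrated with a small sketch in Python. It assumes the common rule of thumb that ⟨c⟩ has the value /s/ before ⟨e⟩, ⟨i⟩, or ⟨y⟩ and /k/ elsewhere; the text above mentions only ⟨e⟩ as a marker, so the other two letters, and the function name c_values, are assumptions made for illustration. The sketch ignores digraphs such as ⟨ch⟩ and the many lexical exceptions.

```python
# Letters assumed to mark a preceding <c> as "soft" (/s/).
# Only <e> is taken from the text; <i> and <y> are an added assumption.
SOFTENING_LETTERS = set("eiy")


def c_values(word: str) -> list[str]:
    """Return the guessed value ('s' or 'k') of each <c> in the word, in order."""
    word = word.lower()
    values = []
    for i, letter in enumerate(word):
        if letter != "c":
            continue
        # The marker is the letter immediately after <c>, if there is one.
        following = word[i + 1] if i + 1 < len(word) else ""
        values.append("s" if following in SOFTENING_LETTERS else "k")
    return values


for example in ("once", "attic", "ace", "accent"):
    print(example, c_values(example))
```

In accent, the first ⟨c⟩ is followed by another ⟨c⟩ and so keeps the value /k/, while the second is marked by the following ⟨e⟩ as /s/.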
Doubled consonants usually indicate that the preceding vowel is pronounced short. For example, the doubled ⟨tt⟩ in batted indicates that the ⟨a⟩ is pronounced /æ/, while the single ⟨t⟩ of bated gives /eɪ/. Doubled consonants indicate lengthening or gemination of the consonant sound itself only when they come from different morphemes, as with the ⟨nn⟩ in unnamed.

Multiple functionality

A given letter may have dual functions. For example, ⟨u⟩ in statue has a sound-representing function (representing the sound /u/) and a pronunciation-marking function (marking the ⟨t⟩ as having the value /tʃ/ rather than /t/).

Underlying representation

Like many other alphabetic orthographies, English spelling does not represent non-contrastive phonetic sounds.
Although the letter ⟨t⟩ is pronounced by most speakers with aspiration, [tʰ], at the beginning of words, this is never indicated in the spelling, and, indeed, this phonetic detail is probably not noticeable to the average native speaker not trained in phonetics.
However, unlike some orthographies, English orthography often represents a very abstract underlying representation of English words.
In these cases, a given morpheme has a fixed spelling even though it is pronounced differently in different words. An example is the past tense suffix -⟨ed⟩, which may be pronounced variously as /t/, /d/, or /ɪd/ (as in dipped, boomed, and looted). As it happens, these different pronunciations of -⟨ed⟩ can be predicted by a few phonological rules, but that is not the reason why its spelling is fixed.
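The phonological rules in question are usually stated in terms of the final sound of the stem: /ɪd/ after /t/ or /d/, /t/ after other voiceless consonants, and /d/ elsewhere. The following Python sketch encodes that textbook statement; the function name past_tense_allomorph and the simplified phoneme inventory are assumptions for illustration, not part of any standard library.

```python
# Voiceless consonants that can end an English stem, other than /t/,
# which is handled separately. The inventory is a simplified assumption.
VOICELESS = {"p", "k", "f", "θ", "s", "ʃ", "tʃ"}


def past_tense_allomorph(final_phoneme: str) -> str:
    """Pick the surface form of -ed from the stem's last phoneme (IPA string)."""
    if final_phoneme in {"t", "d"}:
        return "ɪd"  # a vowel is inserted to separate two alveolar stops
    if final_phoneme in VOICELESS:
        return "t"   # the suffix devoices after a voiceless consonant
    return "d"       # default form after vowels and voiced consonants


# Stem spellings paired with their final phoneme (supplied by hand).
examples = {"dip": "p", "boom": "m", "loot": "t", "play": "eɪ"}
for stem, last in examples.items():
    print(f"{stem} + -ed -> /{past_tense_allomorph(last)}/")
```

All three outcomes are written with the same spelling, which is the point made above: the orthography records the morpheme, not its surface form.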
Another example involves the vowel differences in several related words. For instance, photographer is derived from photograph by adding the derivational suffix -⟨er⟩. When this suffix is added, the vowel pronunciations change, largely owing to the moveable stress:
Spelling        Pronunciation
photograph      /ˈfoʊtəɡræf/ or /ˈfoʊtəɡrɑːf/
photographer    /fəˈtɒɡrəfər/
photographical  /ˌfoʊtəˈɡræfɪkəl/

Other examples of this type involve the -⟨ity⟩ suffix, as in divine and divinity or sane and sanity (see also trisyllabic laxing).
Another example includes words like mean and meant, where ⟨ea⟩ is pronounced differently in the two related words. Thus, again, the orthography uses only a single spelling that corresponds to the single morphemic form rather than to the surface phonological form.
English orthography does not always provide an underlying representation; sometimes it provides an intermediate representation between the underlying form and the surface pronunciation. This is the case with the spelling of the regular plural morpheme, which is written as either -⟨s⟩ (as in hat, hats) or -⟨es⟩ (as in glass, glasses). Here, the spelling -⟨s⟩ is pronounced either /s/ or /z/, depending on the environment, while -⟨es⟩ is usually pronounced /ɪz/. Thus, there are two different spellings that correspond to the single underlying representation |z| of the plural suffix and the three surface forms. The spelling indicates the insertion of /ɪ/ before the /z/ in the spelling -⟨es⟩, but does not indicate the devoiced /s/ distinctly from the unaffected /z/ in the spelling -⟨s⟩.
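The intermediate status of the plural spelling can be made concrete with a short Python sketch. It derives the surface form from the underlying |z| in two steps (vowel insertion after sibilants, devoicing after voiceless consonants) and pairs each result with its typical spelling. The function name plural_surface and the phoneme sets are assumptions for illustration, and the spelling column is a simplification: stems already written with a final ⟨e⟩, such as horse, add only -⟨s⟩.

```python
# Simplified phoneme classes (assumptions for this sketch).
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}
VOICELESS = {"p", "t", "k", "f", "θ"}  # voiceless non-sibilants


def plural_surface(final_phoneme: str) -> tuple[str, str]:
    """Return (surface pronunciation, typical spelling) of the plural suffix."""
    if final_phoneme in SIBILANTS:
        # Vowel insertion: the spelling shows this step as -es.
        return "ɪz", "-es"
    if final_phoneme in VOICELESS:
        # Devoicing: the spelling does NOT show this step; it is still -s.
        return "s", "-s"
    # Underlying |z| surfaces unchanged.
    return "z", "-s"


# Stem spellings paired with their final phoneme (supplied by hand).
for stem, last in {"glass": "s", "hat": "t", "tail": "l"}.items():
    surface, spelling = plural_surface(last)
    print(f"{stem}: /{surface}/ written {spelling}")
```

Two of the three surface forms share the spelling -⟨s⟩, so the orthography records the vowel insertion but not the devoicing.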
The abstract representation of words as indicated by the orthography can be considered advantageous since it makes etymological relationships more apparent to English readers. This makes writing English more complex, but arguably makes reading English more efficient. However, very abstract underlying representations, such as that of Chomsky & Halle or of underspecification theories, are sometimes considered too abstract to accurately reflect the communicative competence of native speakers. Followers of these arguments believe the less abstract surface forms are more "psychologically real" and thus more useful in terms of pedagogy.

Diacritics

Some English words can be written with diacritics; these are mostly loanwords, usually from French. As vocabulary becomes naturalised, there is an increasing tendency to omit the accent marks, even in formal writing. For example, rôle and hôtel originally had accents when they were borrowed into English, but now the accents are almost never used. The words were originally considered foreign, and some people considered that English alternatives were preferable, but today their foreign origin is largely forgotten. Words most likely to retain the accent are those atypical of English morphology and therefore still perceived as slightly foreign. For example, café and pâté both have a pronounced final ⟨e⟩, which would otherwise be silent under the normal English pronunciation rules. Moreover, in pâté, the acute accent is helpful to distinguish it from pate.
Further examples of words sometimes retaining diacritics when used in English are: ångström (partly because its symbol is ⟨Å⟩), appliqué, attaché, blasé, bric-à-brac, Brötchen, cliché, crème, crêpe, fiancé, flambé, jalapeño, naïve, naïveté, papier-mâché, passé, piñata, protégé, résumé, risqué, and voilà. Italics, with appropriate accents, are generally applied to foreign terms that are uncommonly used in or have not been assimilated into English: for example, belles-lettres and crème brûlée.
It was formerly common in American English to use a diaeresis to indicate a hiatus, e.g. coöperate, daïs, and reëlect. The New Yorker and Technology Review magazines still use it for this purpose, even though general use has become much rarer. Instead, modern orthography generally prefers no mark or a hyphen for a hiatus between two morphemes in a compound word. By contrast, use of diaereses in monomorphemic loanwords such as naïve and Noël remains relatively common.
In poetry and performance arts, accent marks are occasionally used to indicate typically unstressed syllables that should be stressed when read for dramatic or prosodic effect. This is frequently seen with the -ed suffix in archaic and pseudoarchaic writing, e.g. cursèd indicates that the ⟨e⟩ should be fully pronounced. The grave accent indicates that an ordinarily silent or elided syllable is pronounced.