Standard Arabic phonology
While many languages have numerous dialects that differ in phonology, contemporary spoken Arabic is more properly described as a continuum of varieties. This article deals primarily with Modern Standard Arabic, which is the standard variety shared by educated speakers throughout Arabic-speaking regions. MSA is used in writing in formal print media and orally in newscasts, speeches and formal declarations of numerous types.
Modern Standard Arabic has 28 consonant phonemes and 6 vowel phonemes, with four "emphatic" consonants that contrast with their non-emphatic counterparts. Some of these phonemes have coalesced in the various modern dialects, while new phonemes have been introduced through borrowing or phonemic splits. A "phonemic quality of length" applies to consonants as well as vowels.
History
Of the 29 Proto-Semitic consonants, only one has been lost:, which merged with, while became . Various other consonants have also changed sound, but have remained distinct. An original lenited to, and – consistently attested in pre-Islamic Greek transcription of Arabic languages – became palatalized to or by the time of the Quran, and to,, or after early Muslim conquests and in MSA.
Its emphatic counterpart was considered by Arabs to be the most unusual sound in Arabic. For most modern dialects, it has become an emphatic stop with loss of the laterality or, with complete loss of any pharyngealization or velarization,. The classical pronunciation with pharyngealization still occurs in the Mehri language, and the similar sound without velarization,, exists in other Modern South Arabian languages.
Other changes may also have happened; Classical Arabic pronunciation is not thoroughly recorded and different reconstructions of the sound system of Proto-Semitic propose different phonetic values. One example is the emphatic consonants, which are pharyngealized in modern pronunciations but may have been velarized in the eighth century and glottalized in Proto-Semitic.
Reduction of and between vowels occurs in a number of circumstances and is responsible for much of the complexity of third-weak verbs. Early Akkadian transcriptions of Arabic names show that this reduction had not yet occurred as of the early part of the 1st millennium BC.
The Classical Arabic language as recorded was a poetic koine that reflected a consciously archaizing dialect, chosen based on the tribes of the western part of the Arabian Peninsula, who spoke the most conservative variants of Arabic. Even at the time of Muhammad and before, other dialects existed with many more changes, including the loss of most glottal stops, the loss of case endings, the reduction of the diphthongs and into monophthongs, etc. Most of these changes are present in most or all modern varieties of Arabic.
An interesting feature of the writing system of the Quran is that it contains certain features of Muhammad's native dialect of Mecca, corrected through diacritics into the forms of standard Classical Arabic. Among these features visible under the corrections are the loss of the glottal stop and a differing development of the reduction of certain final sequences containing : Evidently, the final became as in the Classical language, but final became a different sound, possibly . This is the apparent source of the alif maqṣūrah 'restricted alif' where a final is reconstructed: a letter that would normally indicate or some similar high-vowel sound, but is taken in this context to be a logical variant of alif and represent the sound.
Historical development
Arabic phonology has evolved over centuries, influenced by language contact and historical expansion. Classical Arabic phonological features have shifted in modern dialects, partly due to the spread of Arabic through conquest and trade. These changes have resulted in both the preservation of classical features and significant innovations across dialects.
Literary Arabic
The "colloquial" spoken dialects of Arabic are learned at home and constitute the native languages of Arabic speakers. "Formal" Modern Standard Arabic is learned at school; although many speakers have a native-like command of the language, it is technically not the native language of any speakers. Both varieties can be both written and spoken, although the colloquial varieties are rarely written down and the formal variety is spoken mostly in formal circumstances, e.g., in radio and TV broadcasts, formal lectures, parliamentary discussions and to some extent between speakers of different colloquial dialects.
Even when the literary language is spoken, it is normally only spoken in its standard form when reading a prepared text out loud or communicating between speakers of different colloquial dialects. When speaking extemporaneously, speakers tend to deviate somewhat from the strict literary language in the direction of the colloquial varieties. There is a continuous range of "in-between" spoken varieties: from nearly orthodox Modern Standard Arabic, to a form that still uses MSA grammar and vocabulary but with colloquial influence, to a form of the colloquial language that imports a number of words and grammatical constructions from MSA, to a form that is close to pure colloquial but with the "rough edges" smoothed out, to forms that are purely colloquial.
The particular variant used depends on the social class and education level of the speakers involved and the level of formality of the speech situation. Often it will vary within a single encounter, e.g., moving from nearly pure MSA to a more mixed language in the process of a radio interview, as the interviewee becomes more comfortable with the interviewer. This type of variation is characteristic of the diglossia that exists throughout the Arabic-speaking world.
Although Modern Standard Arabic is a unitary language, its pronunciation varies somewhat from country to country and from region to region within a country. The variation in individual "accents" of MSA speakers tends to mirror corresponding variations in the colloquial speech of the speakers in question, but with the distinguishing characteristics moderated somewhat. It is important in descriptions of "Arabic" phonology to distinguish between pronunciation of a given colloquial dialect and the pronunciation of MSA by these same speakers.
Although they are related, they are not the same. For example, the phoneme that derives from Standard Arabic has many different pronunciations in the modern spoken varieties, e.g.,. Speakers whose native variety has either or will often use the same pronunciation when speaking MSA. Even speakers from Cairo, whose native Egyptian Arabic has, normally use when speaking MSA.
For another example, many colloquial varieties are known for a type of vowel harmony in which the presence of an "emphatic consonant" triggers backed allophones of nearby vowels. In many spoken varieties, the backed or "emphatic" vowel allophones spread a fair distance in both directions from the triggering consonant. In some varieties, most notably Egyptian Arabic, the "emphatic" allophones spread throughout the entire word, usually including prefixes and suffixes, even at a distance of several syllables from the triggering consonant.
Speakers of colloquial varieties with this vowel harmony tend to introduce it into their MSA pronunciation as well, but usually with a lesser degree of spreading than in the colloquial varieties. For example, speakers of colloquial varieties with extremely long-distance harmony may allow a moderate, but not extreme, amount of spreading of the harmonic allophones in their MSA speech, while speakers of colloquial varieties with moderate-distance harmony may only harmonize immediately adjacent vowels in MSA.
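The difference in spreading distance described above can be pictured with a toy model. The sketch below is illustrative only: the uppercase letters standing for emphatic consonants, the function name `backed_vowels`, and the `max_distance` parameter are assumptions of this example, not a notation or analysis taken from the literature.

```python
# Toy model of "emphasis spread": which vowels receive backed allophones.
# Assumed notation (hypothetical): uppercase S, D, T, Z = emphatic consonants,
# lowercase a, i, u = vowels, every other character = a plain consonant.

EMPHATICS = set("SDTZ")
VOWELS = set("aiu")


def backed_vowels(word, max_distance=None):
    """Return the indices of vowels within reach of an emphatic consonant.

    max_distance is the largest allowed index difference between a vowel
    and the nearest emphatic (1 = immediately adjacent).
    None means the whole word is affected (the long-distance pattern).
    """
    triggers = [i for i, seg in enumerate(word) if seg in EMPHATICS]
    if not triggers:
        return []  # no emphatic consonant, so nothing is backed
    backed = []
    for i, seg in enumerate(word):
        if seg not in VOWELS:
            continue
        nearest = min(abs(i - t) for t in triggers)
        if max_distance is None or nearest <= max_distance:
            backed.append(i)
    return backed


word = "maSabitu"  # made-up segment string, not a real Arabic word
print(backed_vowels(word, max_distance=1))  # adjacent vowels only
print(backed_vowels(word, max_distance=3))  # moderate spreading
print(backed_vowels(word))                  # whole-word spreading
```

Under this model, a speaker with whole-word harmony in the colloquial variety but moderate harmony in MSA would correspond to calling the function with `None` in one case and a small `max_distance` in the other.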
Vowels
Modern Standard Arabic has six vowel phonemes forming three pairs of corresponding short and long vowels. Many spoken varieties also include and. Modern Standard Arabic has two diphthongs. Allophony in different dialects of Arabic can occur and is partially conditioned by neighboring consonants within the same word. The following are some general rules:
- * retracted to in the environment of a neighboring, or an emphatic consonant :,,,,, and in a few regional standard pronunciations also and ;
- * only in Iraq and the Persian Gulf: before a word boundary;
- * advanced to in the environment of most consonants:
- ** labial consonants,
- ** plain coronal consonants with the exception of : namely,,,,,,,, and
- ** glottal consonants
- **, and ;
- * Across North Africa and West Asia, the allophones and may be realized differently, either as, or both as ;
- * In northwestern Africa, the open front vowel is raised to or.
- * Across North Africa and West Asia, may be realized as before or adjacent to emphatic consonants and,,. can also have different realizations, i.e.. Sometimes with one value for each vowel in both short and long lengths or two different values for each short and long lengths. They can be distinct phonemes in loanwords for a number of speakers.
- * In Egypt, close vowels have different values; short initial or medial:, ← instead of. and completely become and respectively in some other particular dialects. Unstressed final long are most often shortened or reduced: →, →, → .
The final heavy syllable of a root is stressed.
The short vowels are all possible allophones of across different dialects; e.g., قُلْت is pronounced or or, since the difference between the short mid vowels and is never phonemic, and they are mostly found in complementary distribution, except for a number of speakers where they can be phonemic but only in foreign words.
The short vowels are all possible allophones of across different dialects; e.g., مِن is pronounced or or since the difference between the short mid vowels and is never phonemic, and they are mostly found in complementary distribution, except for a number of speakers where they can be phonemic but only in foreign words.
The long mid vowels and appear to be phonemic in most varieties of Arabic except in general Maghrebi Arabic, where they merge with and. For example, لون is generally pronounced in Mashriqi dialects but in most Maghrebi Arabic. The long mid vowels can be used in Modern Standard Arabic in dialectal words or in some stable loanwords or foreign names, as in روما and شيك .
Foreign words often have a liberal sprinkling of long vowels, as vowels tend to be written as long vowels in foreign loans, under the influence of European-language orthographies which write down every vowel with a letter. The long mid vowels and are always rendered with the letters ي and و, respectively, accompanied by a preceding hamzah sitting above and below an alif respectively word-initially. In general, the pronunciation of loanwords is highly dependent on the speaker's native variety.
Consonants
Even in the most formal contexts, pronunciation of Arabic depends on the speaker's background, even if the number and phonetic character of most of the 28 consonants have a broad degree of regularity among Arabic-speaking regions. Arabic is particularly rich in uvular, pharyngeal, and pharyngealized sounds.
Note: the table and notes below discuss the phonology of Modern Standard Arabic among Arabic speakers and not regional dialects.
Long consonants are pronounced exactly like short consonants, but last longer. In Arabic, they are called mushaddadah. Between a long consonant and a pause, an epenthetic occurs, but this is only common across regions in West Asia.
Phonotactics
Standard Arabic syllables come in only five forms:
- CV
- CVV
- CVC
- CVVC
- CVCC
Super-heavy syllables are usually not allowed except word-finally. The exception is CVV- before a geminate, which creates a non-final CVVC- syllable; such syllables are found in the active participles of geminate Form I verbs, like in, . In the pausal form, a final geminate behaves as a single consonant; only when another word follows, or when the word is fully vocalized, does the geminate surface, with its two halves belonging to separate syllables. E.g.:,,,, and .
Loanwords can break some phonotactic rules like allowing initial consonant clusters like in استاد "stadium" and فلورنسا "Florence" or allowing CVVC syllables non-finally without geminates like in روسيا "Russia" and سوريا "Syria", which can be modified to to fit the phonotactics better.
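The five permitted syllable shapes can be checked mechanically. The following is a minimal sketch, assuming a simplified romanization in which every segment is one character and a long vowel is written as a doubled vowel (so /baːb/ becomes `baab`); the names `shape` and `is_permitted` are hypothetical helpers introduced for this example.

```python
# Permitted Standard Arabic syllable shapes, as listed in this section.
ALLOWED_SHAPES = {"CV", "CVV", "CVC", "CVVC", "CVCC"}
VOWELS = set("aiu")


def shape(syllable):
    """Map a romanized syllable to its C/V template.

    Assumed input convention (hypothetical): one character per segment,
    long vowels written as a doubled vowel, e.g. "baab" for /baːb/.
    """
    return "".join("V" if ch in VOWELS else "C" for ch in syllable)


def is_permitted(syllable):
    """True if the syllable matches one of the five listed shapes."""
    return shape(syllable) in ALLOWED_SHAPES


for syl in ["wa", "laa", "min", "baab", "bint", "sta"]:
    print(syl, shape(syl), is_permitted(syl))
```

The last item, `sta`, begins with a consonant cluster and is therefore rejected, which mirrors the statement that initial clusters appear only in loanwords.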
Word stress
Stress in Modern Standard Arabic is generally consistent and standardized across the Arab world. Exceptions exist only in a limited set of scenarios, detailed below. Broadly speaking, stress is most likely to fall on the second-to-last syllable, but frequently occurs in the final and third-to-last as well.
Arabic syllables can be categorized as light, heavy, and superheavy. This refers to the arrangement of vowels and consonants within the syllable.
With "C" representing a consonant, "V" representing a vowel, and "VV" representing a long vowel:
- Light:
- * A syllable containing a short vowel, such as وَ /wa/.
- Heavy:
- * A syllable containing a long vowel, such as لَا /laː/.
- * A syllable containing a short vowel followed by one consonant, such as مِن /min/.
- Super-heavy:
- * A syllable containing a long vowel followed by one consonant, such as بَاب /baːb/.
- * A syllable containing a short vowel followed by two consonants, such as بِنْت /bint/, or a long vowel followed by a geminate consonant, such as مادّ /maːdd/.
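The weight categories above depend only on what follows the initial consonant. As a minimal sketch, assuming syllables are already given as C/V templates (with "VV" for a long vowel) and using a hypothetical function name `syllable_weight`, the classification can be written as:

```python
def syllable_weight(template):
    """Classify a C/V template as 'light', 'heavy', or 'superheavy'.

    The rime is everything after the initial consonant:
      V          -> light       (short vowel)
      VV or VC   -> heavy       (long vowel, or short vowel + consonant)
      anything longer -> superheavy (VVC, VCC, VVCC)
    """
    if not template.startswith("CV") or set(template) - {"C", "V"}:
        raise ValueError(f"unexpected syllable template: {template!r}")
    rime = template[1:]
    if len(rime) == 1:
        return "light"
    if len(rime) == 2:
        return "heavy"
    return "superheavy"


# Templates of the examples cited above: wa, laa, min, baab, bint, maadd.
for t in ["CV", "CVV", "CVC", "CVVC", "CVCC", "CVVCC"]:
    print(t, syllable_weight(t))
```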
The following description of Arabic stress patterns is adapted from the description provided by Halpern.
- Disregard all attached prepositions. If this leaves only one syllable, stress it.
- If the final syllable is superheavy, stress it.
- Otherwise, if the word has only two syllables, stress the first one.
- Otherwise, if the second-to-last syllable is heavy, stress it.
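The rules listed above can be read as an ordered series of checks. The sketch below implements only those listed rules and nothing more: it assumes the word has already been split into C/V syllable templates with attached prepositions removed, it reads "heavy" literally as excluding superheavy, and it returns `None` when none of the listed rules applies. The function names are hypothetical.

```python
def weight(template):
    """Weight from the rime (template minus its initial consonant)."""
    rime = template[1:]
    return {1: "light", 2: "heavy"}.get(len(rime), "superheavy")


def stressed_syllable(templates):
    """Return the 0-based index of the stressed syllable.

    templates: list of C/V templates, e.g. ["CV", "CVVC"], with any
    attached prepositions already stripped (assumed done by the caller).
    Returns None if none of the rules listed above decides the case.
    """
    if not templates:
        raise ValueError("a word needs at least one syllable")
    if len(templates) == 1:
        return 0                      # only one syllable: stress it
    if weight(templates[-1]) == "superheavy":
        return len(templates) - 1     # superheavy final syllable
    if len(templates) == 2:
        return 0                      # two syllables: stress the first
    if weight(templates[-2]) == "heavy":
        return len(templates) - 2     # heavy second-to-last syllable
    return None                       # not covered by the listed rules


print(stressed_syllable(["CVCC"]))             # single syllable
print(stressed_syllable(["CV", "CVVC"]))       # final superheavy
print(stressed_syllable(["CV", "CVC"]))        # two syllables
print(stressed_syllable(["CV", "CVC", "CVC"])) # heavy second-to-last
print(stressed_syllable(["CV", "CV", "CV"]))   # falls outside the list
```

Because the checks are ordered, a superheavy final syllable takes precedence over the two-syllable rule, which matches the "otherwise" wording of the list.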
Local variations of Modern Standard Arabic
Spoken varieties differ from Classical Arabic and Modern Standard Arabic not only in grammar but also in pronunciation. This variation might affect the way Modern Standard Arabic is spoken in each country or region.
Some examples of variation:
Consonants
The standard pronunciation of ⟨ج⟩ in MSA varies regionally, most prominently in the Arabian Peninsula, parts of the Levant, Iraq, north-central Algeria, and southern Egypt, it is also considered as the predominant pronunciation of Literary Arabic outside the Arab world and the pronunciation mostly used in Arabic loanwords across other languages, and in Morocco, Tunisia, Libya, southern Algeria, most of the Levant, eastern Arabian Peninsula. Other pronunciations include in Egypt, in coastal Yemen, and Oman, as well as in Sudan and hinterland Yemen.
In Modern Standard Arabic, is used as a marginal phoneme to pronounce some dialectal and loan words. On the other hand, it is considered a native phoneme or allophone in most modern Arabic dialects, mostly as a variant of ق or as a variant of ج. It is also considered a separate foreign phoneme that appears only in loanwords, as in most urban Levantine dialects where ق is and ج is.
The dental ض was historically, a value it retains among older speakers in a few isolated dialects.
Mergers and mispronunciations
Regional modern dialects may influence the way Modern Standard Arabic is spoken, which sometimes causes mergers or mispronunciations in consonants:
- Speakers that merge ض and ظ to ظ in their respective dialects usually mispronounce ض as ظ when speaking Modern Standard Arabic instead of the standard, e.g. ضار 'harmful' is pronounced instead of.
- The voiced emphatic dental fricative ظ is sometimes mispronounced as a voiced emphatic alveolar fricative depending on the speaker in Egypt, Sudan and Lebanon, e.g. محافظة 'governorate' is pronounced instead of.
- Speakers that lack the interdentals ث and ذ in their respective dialects may merge them with ت and د or س and ز, respectively.
- Some speakers especially in Lebanon and Egypt might pronounce the standard uvular ق as a plain velar ك.
- A number of speakers in Yemen and among Bedu pronounce the uvular ق as a velar when speaking Modern Standard Arabic, e.g. لقد قلت لهم 'I told them' is pronounced instead of.
The foreign phonemes,,, etc. are not necessarily pronounced by all Arabic speakers, but they can be pronounced by some speakers especially in foreign proper nouns and loanwords. and are usually transcribed with their own letters پ and ڤ but as these letters are not part of Standard Arabic, they are simply written with ب and ف. The use of both sounds may be considered marginal, and Arabs may pronounce the words interchangeably; for example, both نوفمبر and نوڤمبر "November" and both كاپريس and كابريس "caprice" can be used.
is a possible loanword phoneme, as in the word or, though a number of varieties instead break up the and sounds with an epenthetic vowel. Egyptian Arabic treats as two consonants and inserts, as or, when it occurs before or after another consonant. is found as normal in Iraqi Arabic and Gulf Arabic dialects. Normally the combination تش is used to transliterate the. e.g. تشاد "Chad".
Vowels
- Development of highly distinctive allophones of and, with highly fronted, or in non-emphatic contexts, and retracted in emphatic contexts. The more extreme distinctions are characteristic of sedentary varieties, while Bedouin and conservative Arabian-peninsula varieties have much closer allophones. In some of the sedentary varieties, the allophones are gradually splitting into new phonemes under the influence of loanwords, where the allophone closest in sound to the source-language vowel often appears regardless of the presence or absence of nearby emphatic consonants.
- Spread of "emphasis", visible in the backing of phonemic. In conservative varieties of the Arabian Peninsula, only adjacent to emphatic consonants is affected, while in Cairo, an emphatic consonant anywhere in a word tends to trigger emphatic allophones throughout the entire word. Dialects of the Levant are somewhere in between. Moroccan Arabic is unusual in that and have clear emphatic allophones as well.
- The diphthongs and have monophthongized into and in most of the Mashriqi dialects; these mid vowels may also be present in loanwords when speaking MSA, such as ملبورن, سكرتير, روما and دكتور.
- Loss of final short vowels, and shortening of final long vowels. This triggered the loss of most Classical Arabic case and mood distinctions.
- Change of nisba suffix ' > ': the nisba suffix ' as in عَرَبِيّ is usually mispronounced ' by many speakers.
- Shortening of final long ' > ': حُبِّي is usually pronounced with a short final by many speakers.
Distribution
The most frequent consonant phoneme is, the rarest is. The frequency distribution of the 28 consonant phonemes, based on the 2,967 triliteral roots listed by Wehr, is:
| Phoneme | Frequency | Phoneme | Frequency |
| | 24% | | 18% |
| | 17% | | 17% |
| | 17% | | 16% |
| | 14% | | 13% |
| | 13% | | 13% |
| | 13% | | 12% |
| | 12% | | 11% |
| | 10% | | 9% |
| | 8% | | 8% |
| | 8% | | 8% |
| | 7% | | 7% |
| | 6% | | 5% |
| | 5% | | 3% |
| | 3% | | 1% |
This distribution does not necessarily reflect the actual frequency of occurrence of the phonemes in speech, since pronouns, prepositions and suffixes are not taken into account, and the roots themselves will occur with varying frequency. In particular, occurs in several extremely common affixes despite being fifth from last on Wehr's list. The list does give, however, an idea of which phonemes are more marginal than others. Note that the five least frequent letters are among the six letters added to those inherited from the Phoenician alphabet, namely,,,,, and.
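A quick arithmetic check on the table: the 28 percentages add up to far more than 100%. One reading that would explain this (an assumption here, since the definition of "frequency" is not stated) is that each figure is the share of roots that contain the phoneme, so that a single triliteral root can count toward as many as three rows. The sketch below only sums the published figures and converts two of them to approximate root counts under that assumption.

```python
# Percentages transcribed from the table above, row by row (left, right).
frequencies = [24, 18, 17, 17, 17, 16, 14, 13, 13, 13, 13, 12, 12, 11,
               10, 9, 8, 8, 8, 8, 7, 7, 6, 5, 5, 3, 3, 1]
ROOTS = 2967  # number of triliteral roots cited for the list

assert len(frequencies) == 28  # one value per consonant phoneme

total = sum(frequencies)
print(total)  # 298, i.e. close to 3 x 100%

# Approximate root counts, ASSUMING each percentage is a share of roots.
most_frequent = round(ROOTS * frequencies[0] / 100)
rarest = round(ROOTS * frequencies[-1] / 100)
print(most_frequent, rarest)  # 712 30
```

A total near 300% is what one would expect if nearly every root contributes three distinct consonants, with rounding accounting for the small shortfall; this is consistent with the assumed reading but does not prove it.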