Urdu alphabet


The Urdu alphabet is the right-to-left alphabet used for writing Urdu. It is a modification of the Persian alphabet, which itself is derived from the Arabic script. It has official and national status in Pakistan, and is official in certain regions in India. The Urdu alphabet has up to 39 or 40 distinct letters with no distinct letter cases and is typically written in the calligraphic Nastaʿlīq script, whereas Arabic is more commonly written in the Naskh style.
Usually, bare transliterations of Urdu into the Latin alphabet omit many phonemic elements that have no equivalent in English or other languages commonly written in the Latin script.

History

The standard Urdu script is a modified version of the Perso-Arabic script and has its origins in 13th-century Iran. It is also related to Shahmukhi, used for the Punjabi language varieties in Punjab, Pakistan. Its development is closely tied to that of the Nastaliq style of the Perso-Arabic script. During the Mughal era, Nastaliq became the common script for writing the Hindustani language, especially Urdu.
Despite the invention of the Urdu typewriter in 1911, Urdu newspapers continued to publish prints of handwritten scripts by calligraphers known as katibs or khush-navees until the late 1980s. The Pakistani national newspaper Daily Jang was the first Urdu newspaper to use Nastaʿlīq computer-based composition. There are efforts under way to develop more sophisticated and user-friendly Urdu support on computers and the internet. Nowadays, nearly all Urdu newspapers, magazines, journals, and periodicals are composed on computers with Urdu software programs.
Outside the Indian subcontinent, the Urdu script is also used by Pakistan's large diaspora, including in the United Kingdom, the United Arab Emirates, the United States, Canada, Saudi Arabia and other places.

Nastaliq

Urdu is written in the Nastaliq style. The Nastaliq calligraphic writing style began as a Persian mixture of the Naskh and Ta'liq scripts. After the Muslim conquest of the Indian subcontinent, Nastaliq became the preferred writing style for Urdu. It is the dominant style in Pakistan and many Urdu writers elsewhere in the world use it. Nastaʿlīq is more cursive and flowing than its Naskh counterpart.
In the Arabic alphabet, and many others derived from it, letters are regarded as having two or three general forms each, based on their position in the word. But the Nastaliq style in which Urdu is written uses more than three general forms for many letters, even in simple non-decorative documents.

Alphabet

The Urdu script is an abjad script derived from the modern Persian script, which is itself a derivative of the Arabic script. As an abjad, the Urdu script only shows consonants and long vowels; short vowels can only be inferred from the consonants' relation to each other. While this type of script is convenient for Semitic languages like Arabic and Hebrew, in which consonant roots carry the core meaning, Urdu is an Indo-European language, which requires more precision in vowel pronunciation and hence more memorisation. The number of letters in the Urdu alphabet is somewhat ambiguous and debated.
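The effect of leaving short vowels unwritten can be sketched in Python. The function name `strip_diacritics` is hypothetical, and treating every Unicode nonspacing mark (category `Mn`) as an optional vowel sign is a simplifying assumption rather than a rule of Urdu orthography.

```python
import unicodedata

def strip_diacritics(text: str) -> str:
    """Drop nonspacing marks (zabar, zer, pesh and similar) from text."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")

# "ab" fully pointed: alif, zabar (U+064E), be
pointed = "\u0627\u064E\u0628"

# What is normally written: alif and be only; the reader supplies the vowel
unpointed = strip_diacritics(pointed)
print(pointed, unpointed)
```

Precomposed letters such as alif-madd are left untouched by this sketch, because the text is not decomposed before filtering.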

Letter names and phonemes


Additional characters and variations

Arabic Tāʼ marbūṭah

Tāʼ marbūṭah is sometimes considered the 40th letter of the Urdu alphabet, though it is rarely used except in certain loanwords from Arabic. Tāʼ marbūṭah is regarded as a form of tā, the Arabic version of Urdu tē, but it is not pronounced as such, and when it is replaced with an Urdu letter in naturalised loanwords, the replacement is usually gol hē.

Table


Hamza in Nastaliq

Hamza can be difficult to recognise in Urdu handwriting and fonts designed to replicate it, closely resembling two dots above as featured in Té and Qaf, whereas in Arabic and Geometric fonts it is more distinct and closely resembles the western form of the numeral 2.

Digraphs

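Digraph    Transcription    IPA
بھ    bh    [bʱ]
پھ    ph    [pʰ]
تھ    th    [t̪ʰ]
ٹھ    ṭh    [ʈʰ]
جھ    jh    [d͡ʒʱ]
چھ    chh    [t͡ʃʰ]
دھ    dh    [d̪ʱ]
ڈھ    ḍh    [ɖʱ]
رھ    rh    [rʱ]
ڑھ    ṛh    [ɽʱ]
کھ    kh    [kʰ]
گھ    gh    [ɡʱ]
لھ    lh    [lʱ]
مھ    mh    [mʱ]
نھ    nh    [nʱ]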

A separate do-chashmi-he letter, ھ, exists to denote a /ʰ/ or a /ʱ/. This letter is mainly used as part of the digraphs listed above.
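As a rough illustration of how these digraphs are formed, the Python sketch below appends do-chashmi he (U+06BE) to a base consonant. The `BASES` mapping and the `digraph` helper are hypothetical names introduced here, and the code points are taken from the Unicode Arabic block rather than from this article.

```python
DO_CHASHMI_HE = "\u06BE"  # ھ

# Base consonants keyed by the transcription of the resulting digraph.
BASES = {
    "bh": "\u0628",   # ب
    "ph": "\u067E",   # پ
    "th": "\u062A",   # ت
    "ṭh": "\u0679",   # ٹ
    "jh": "\u062C",   # ج
    "chh": "\u0686",  # چ
    "dh": "\u062F",   # د
    "ḍh": "\u0688",   # ڈ
    "rh": "\u0631",   # ر
    "ṛh": "\u0691",   # ڑ
    "kh": "\u06A9",   # ک
    "gh": "\u06AF",   # گ
    "lh": "\u0644",   # ل
    "mh": "\u0645",   # م
    "nh": "\u0646",   # ن
}

def digraph(transcription: str) -> str:
    """Return the base consonant followed by do-chashmi he."""
    return BASES[transcription] + DO_CHASHMI_HE

for name in BASES:
    print(name, digraph(name))
```

Each digraph is stored as two code points; the joined shape seen in print comes from the font, not from a separate precomposed character.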

Differences from the Persian alphabet

Urdu has more letters added to the Perso-Arabic base to represent sounds not present in Persian, which already has additional letters added to the Arabic base itself to represent sounds not present in Arabic. The letters added are shown in the table below:
Letter    IPA
ٹ    /ʈ/
ڈ    /ɖ/
ڑ    /ɽ/
ں    /◌̃/
ے    /ɛː/ or /eː/
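For readers working with Urdu text on a computer, the added letters can be looked up by their Unicode names. The sketch below uses Python's standard `unicodedata` module; the `URDU_ADDITIONS` mapping and the `describe` helper are hypothetical names, and the pairing of code points with sounds is an assumption based on the Unicode Arabic block and the IPA values listed above.

```python
import unicodedata

# Letters added for Urdu, paired with the sound each one represents.
URDU_ADDITIONS = {
    "\u0679": "/ʈ/",             # ٹ
    "\u0688": "/ɖ/",             # ڈ
    "\u0691": "/ɽ/",             # ڑ
    "\u06BA": "nasalization",    # ں
    "\u06D2": "/ɛː/ or /eː/",    # ے
}

def describe(ch: str) -> str:
    """Format a character as its code point and Unicode name."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

for letter, sound in URDU_ADDITIONS.items():
    print(describe(letter), sound)
```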

Retroflex letters

Older forms of the script used four dots (ٿ ڐ ڙ) over three Arabic letters (ت د ر) to represent retroflex consonants. In handwriting those dots were often written as a small vertical line attached to a small triangle. Subsequently, this shape became identical to a small letter t̤oʼē. It is commonly and erroneously assumed that ṭāʾ itself was used to indicate retroflex consonants because, as an emphatic alveolar consonant, it was thought by Arabic scribes to approximate the Hindustani retroflexes. In modern Urdu, ط, called to'e, is always pronounced as a dental, not a retroflex.

Vowels

The Urdu language has ten vowels and ten nasalized vowels. Each vowel has four forms depending on its position: initial, middle, final and isolated. As in its parent Arabic alphabet, Urdu vowels are represented using a combination of digraphs and diacritics. Alif, Waw, Ye, He and their variants are used to represent vowels.

Vowel chart

Urdu does not have standalone vowel letters. Short vowels are represented by optional diacritics upon the preceding consonant or a placeholder consonant if the syllable begins with the vowel, and long vowels by consonants alif, ain, ye, and wa'o as matres lectionis, with disambiguating diacritics, some of which are optional, whereas some are not. Urdu does not have short vowels at the end of words. This is a table of Urdu vowels:

Alif

Alif is the first letter of the Urdu alphabet, and it is used exclusively as a vowel. At the beginning of a word, alif can be used to represent any of the short vowels: ab, ism, Urdū. For long ā at the beginning of words alif-mad is used: āp, but a plain alif in the middle and at the end: bhāgnā.
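The relationship between plain alif and alif-madd can be checked with Unicode normalization. This is a minimal Python sketch; the helper `initial_long_a` is a hypothetical name, and the code points are taken from the Unicode Arabic block rather than from this article.

```python
import unicodedata

ALIF = "\u0627"         # ا
ALIF_MADD = "\u0622"    # آ
MADDA_ABOVE = "\u0653"  # combining madda sign

def initial_long_a(rest: str) -> str:
    """Spell a word that begins with long ā by prefixing alif-madd."""
    return ALIF_MADD + rest

# āp: alif-madd followed by pe (U+067E)
ap = initial_long_a("\u067E")

# Canonical decomposition splits alif-madd into alif plus the madda mark
decomposed = unicodedata.normalize("NFD", ALIF_MADD)
print(ap, [f"U+{ord(c):04X}" for c in decomposed])
```

Text may therefore contain either the single precomposed letter or the two-character sequence; normalizing to one form before comparing strings avoids mismatches.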

Wāʾo

Wāʾo is used to render the vowels "ū", "o", "u" and "ō", and it is also used to render the labiodental approximant, /ʋ/. Only when preceded by the consonant k͟hē can wāʾo render the "u" sound or be left unpronounced altogether. The latter is known as the silent wāʾo, and is only present in words loaned from Persian. When written with pesh, it is usually pronounced as "u" or "ū", for example "umeed" and "khushbū". When written with an ulta pesh, it is pronounced as "o" or "ō", as in "mohtāj" and "jāgō".

Ye

Ye is divided into two variants: choṭī ye and baṛī ye.
Choṭī ye is written in all forms exactly as in Persian. It is used for the long vowel "ī" and the consonant "y".
Baṛī ye is used to render the vowels "e" and "ai". Baṛī ye is distinguishable in writing from choṭī ye only when it comes at the end of a word/ligature. Additionally, Baṛī ye is never used to begin a word/ligature, unlike choṭī ye.
Letter's name    Final form    Middle form    Initial form    Isolated form
Choṭī ye
Baṛī ye

The two he's

He is divided into two variants: gol he and do-cašmi he.
Gol he is written round and zigzagged, and can impart the "h" sound anywhere in a word. Additionally, at the end of a word, it can be used to render the long "a" or the "e" vowels, which also alters its form slightly.
Do-cašmi he is written as in the Arabic Naskh style and is used to form the aspirated consonants and to write Arabic words.

Ayn

Ayn in its initial and final position is silent in pronunciation and is replaced by the sound of its preceding or succeeding vowel.

Nun Ghunnah

Vowel nasalization is represented by nun ghunna written after the non-nasalized vowel. In middle form nun ghunna is written just like nun and is differentiated by a diacritic called maghnoona or ulta jazm, which is a superscript V symbol above the letter.
Examples:
Form    Urdu    Transcription
Orthography
End form
Middle form
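The positional rule for nun ghunna described in this section can be sketched in Python. The function `nasal_sign` and its parameters are hypothetical, and the use of U+0658 for the maghnoona diacritic is an assumption based on the Unicode character named ARABIC MARK NOON GHUNNA, not something stated in this article.

```python
NUN = "\u0646"          # ن
NUN_GHUNNA = "\u06BA"   # ں, the dotless form used at the end of a word
GHUNNA_MARK = "\u0658"  # assumed code point for the maghnoona diacritic

def nasal_sign(position: str, with_diacritic: bool = True) -> str:
    """Return the characters that mark nasalization at a given position."""
    if position == "end":
        return NUN_GHUNNA
    if position == "middle":
        # Written like an ordinary nun; the diacritic is what sets it apart
        return NUN + GHUNNA_MARK if with_diacritic else NUN
    raise ValueError("position must be 'end' or 'middle'")

print(nasal_sign("end"), nasal_sign("middle"))
```

Without the diacritic, a middle-form nun ghunna is indistinguishable in writing from a plain nun, which is why the sketch exposes `with_diacritic` as a separate choice.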