Windows-1256


Windows-1256 is a code page used under Microsoft Windows to write Arabic and other languages that use Arabic script, such as Persian and Urdu.
This code page is compatible with neither ISO/IEC 8859-6 nor the MacArabic encoding.
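The incompatibility with ISO/IEC 8859-6 can be observed by decoding the same bytes under both encodings. The following is a minimal sketch that assumes Python's built-in `cp1256` and `iso8859_6` codecs; the two sample byte values are chosen only for illustration.

```python
# Decode identical bytes with Python's built-in codecs for both encodings.
shared = b"\xc7"      # a byte on which the two encodings agree
divergent = b"\xd8"   # a byte on which they differ

alef_1256 = shared.decode("cp1256")
alef_iso = shared.decode("iso8859_6")
tah_1256 = divergent.decode("cp1256")
zah_iso = divergent.decode("iso8859_6")

print(f"0xC7: cp1256=U+{ord(alef_1256):04X}  iso8859_6=U+{ord(alef_iso):04X}")
print(f"0xD8: cp1256=U+{ord(tah_1256):04X}  iso8859_6=U+{ord(zah_iso):04X}")
```

Because some bytes decode identically and others do not, text decoded with the wrong one of the two encodings may be only partly garbled, which can make the mistake harder to notice.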
Windows-1256 encodes each abstract letter of the basic Arabic alphabet once, rather than every concrete visual form (isolated, initial, medial, final or ligatured). The Arabic letters in the C0-FF range are in Arabic alphabetical order, but some Latin characters are interspersed among them. These are Windows-1252 Latin characters used for French, a language with historical relevance in former French colonies in North Africa such as Morocco and Algeria. This allows French and Arabic text to be intermixed in Windows-1256 without any need for code-page switching.
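As an illustration of mixing French and Arabic in one byte stream, the sketch below encodes a short sample string with Python's built-in `cp1256` codec. The sample text is an arbitrary choice, written with escapes to avoid bidirectional display issues.

```python
# "café" followed by a space and the Arabic word "salam" (seen, lam, alef, meem).
text = "caf\u00e9 \u0633\u0644\u0627\u0645"

encoded = text.encode("cp1256")

# Every character, Latin or Arabic, occupies exactly one byte.
print(len(text), len(encoded))
print(encoded.hex(" "))

# Decoding restores the original string without any code-page switch.
assert encoded.decode("cp1256") == text
```

The accented letter and the Arabic letters all fall in the upper half of the byte range, which is why no switching mechanism is required.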
IBM uses code page 1256 for Windows-1256.
Unicode is preferred over Windows-1256 in modern applications, especially on the Internet, where UTF-8 is the dominant encoding for web pages, including Arabic ones. As of October 2022, less than 0.03% of all web pages used Windows-1256. Although the encoding is used mostly for Arabic and is the second-most popular encoding for that language, it accounts for only 1.6% of the Arabic text on the web.
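Legacy Windows-1256 data can be converted to UTF-8 by decoding to Unicode and re-encoding. The sketch below assumes Python's built-in `cp1256` codec; the helper name `cp1256_to_utf8` is hypothetical and the input bytes are illustrative.

```python
def cp1256_to_utf8(data: bytes) -> bytes:
    """Transcode Windows-1256 bytes to UTF-8 via an intermediate Unicode string."""
    return data.decode("cp1256").encode("utf-8")


legacy = b"\xd3\xe1\xc7\xe3"  # four Arabic letters, one byte each
converted = cp1256_to_utf8(legacy)

# Arabic letters take two bytes each in UTF-8, so the output is longer.
print(len(legacy), len(converted))
```

The conversion only gives meaningful results when the input really is Windows-1256, so the source encoding must be known or detected beforehand.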

Character set

The original specification left 9 byte values marked as "NOT USED". These bytes were later assigned to the euro sign and to additional letters of the Perso-Arabic script.
The following table shows the extended version of Windows-1256. Each character is shown with its Unicode equivalent and its decimal code.
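Rows of such a table can be regenerated programmatically. The following sketch assumes that Python's built-in `cp1256` codec reflects the extended version; the helper `table_row` and the sample byte values are illustrative choices, not part of any specification.

```python
import unicodedata


def table_row(byte_value: int) -> str:
    """Format one row: decimal code, hex code, Unicode code point, character name."""
    char = bytes([byte_value]).decode("cp1256")
    name = unicodedata.name(char, "<unnamed>")  # control characters have no name
    return f"{byte_value:3d}  0x{byte_value:02X}  U+{ord(char):04X}  {name}"


# A few sample positions from the upper half of the code page.
for value in (0x80, 0x81, 0x8D, 0x8E, 0x90):
    print(table_row(value))
```

Looping over `range(0x80, 0x100)` instead of the sample tuple would list the whole upper half of the code page.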
Here every Arabic letter is shown in isolated form. The actual forms of the letters inside Arabic words are rendered by a combination of software rules and appropriate font support.
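The distinction between abstract letters and positional shapes can be seen from the Unicode side. In the sketch below, which assumes Python's `cp1256` codec and the standard `unicodedata` module, a presentation-form character (the initial form of beh, chosen as an example) cannot be encoded, while its abstract letter can.

```python
import unicodedata

initial_beh = "\ufe91"  # ARABIC LETTER BEH INITIAL FORM (a presentation form)

# The code page has no slot for positional shapes, so encoding fails.
try:
    initial_beh.encode("cp1256")
    encodable = True
except UnicodeEncodeError:
    encodable = False

# Compatibility normalization folds the shape back to the abstract letter.
abstract_beh = unicodedata.normalize("NFKC", initial_beh)
beh_byte = abstract_beh.encode("cp1256")

print(encodable, f"U+{ord(abstract_beh):04X}", beh_byte.hex())
```

The stored byte is the same wherever the letter appears in a word; choosing the initial, medial, final or isolated shape is left to the rendering software and the font.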