ISO 639 macrolanguage


A macrolanguage is a group of mutually intelligible speech varieties, or a dialect continuum, that has no traditional name in common and whose varieties may be considered distinct languages by their speakers. Macrolanguages are used as a book-keeping mechanism for the ISO 639 international standard of language codes: they are established to assist mapping between different sets of ISO language codes. Specifically, there may be a many-to-one correspondence between ISO 639-3, intended to identify all the thousands of languages of the world, and either of two other sets: ISO 639-1, established to identify languages in computer systems, and ISO 639-2, which encodes a few hundred languages for library cataloguing and bibliographic purposes. When such many-to-one ISO 639-2 codes are included in an ISO 639-3 context, they are called "macrolanguages" to distinguish them from the corresponding individual languages of ISO 639-3.
ISO 639-3 is curated by SIL International; ISO 639-2 is curated by the Library of Congress.
The mapping often covers borderline cases in which two language varieties may be considered either strongly divergent dialects of the same language or very closely related languages; it may also encompass situations in which language varieties are considered to be varieties of the same language on ethnic, cultural, and political grounds rather than for linguistic reasons. However, this is not its primary function, and the classification is not evenly applied.
For example, Chinese is a macrolanguage encompassing many languages that are not mutually intelligible, whereas "Standard German", "Bavarian German", and other closely related languages do not form a macrolanguage despite being more mutually intelligible. Similarly, Tajiki is not part of the Persian macrolanguage despite sharing much of its lexicon, and Urdu and Hindi do not form a macrolanguage despite forming a mutually intelligible dialect continuum. All dialects of Hindi are considered separate languages. In essence, ISO 639-2 and ISO 639-3 use different criteria for dividing language varieties into languages: 639-2 relies more on shared writing systems and literature, whereas 639-3 focuses on mutual intelligibility and shared lexicon. The macrolanguages exist within the ISO 639-3 code set to make mapping between the two sets easier.
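The many-to-one correspondence described above can be sketched as a lookup table. The following is a minimal Python illustration, not the official mapping: the dictionary holds only a handful of well-known ISO 639-3 codes, and the names `INDIVIDUAL_TO_MACRO`, `to_macrolanguage`, and `members` are hypothetical.

```python
# Illustrative subset only: ISO 639-3 individual-language codes mapped to the
# macrolanguage code that covers them. This is NOT the full official table.
INDIVIDUAL_TO_MACRO = {
    "nob": "nor",  # Norwegian Bokmål
    "nno": "nor",  # Norwegian Nynorsk
    "twi": "aka",  # Twi
    "fat": "aka",  # Fanti
    "cmn": "zho",  # Mandarin Chinese
    "yue": "zho",  # Yue Chinese (Cantonese)
}


def to_macrolanguage(code: str) -> str:
    """Return the macrolanguage code for `code`, or `code` itself if none is known."""
    return INDIVIDUAL_TO_MACRO.get(code, code)


def members(macro: str) -> list[str]:
    """Return the known individual-language codes under `macro`, sorted."""
    return sorted(c for c, m in INDIVIDUAL_TO_MACRO.items() if m == macro)
```

Collapsing `nob` and `nno` to `nor` is the kind of step a system might take when it must hand ISO 639-3 data to a consumer that only understands ISO 639-2 codes; codes outside the sample dictionary are passed through unchanged.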
Ethnologue has applied macrolanguages since its 16th edition. The most recently registered macrolanguage is Sanskrit, with code san, adopted on 15 December 2023, though it had already existed as an individual language for several years.
There are fifty-nine language codes in ISO 639-2 that are counted as macrolanguages in ISO 639-3. Some of the macrolanguages had no individual language in ISO 639-2, e.g. "ara", but ISO 639-3 recognizes different varieties of Arabic as separate languages under some circumstances. Others, like "nor", had their two individual parts already in 639-2. That means some languages that were considered by ISO 639-2 to be dialects of one language are, in certain contexts, considered in ISO 639-3 to be individual languages themselves. This is an attempt to deal with varieties that may be linguistically distinct from each other but are treated by their speakers as forms of the same language, e.g. in cases of diglossia.
ISO 639-2 also includes codes for collections of languages; these are not the same as macrolanguages. These collections of languages are excluded from ISO 639-3, because they never refer to individual languages. Most such codes are included in ISO 639-5.

Types of macrolanguages

  • elements that have no ISO 639-2 code: 4
  • elements that have no ISO 639-1 code: 29
  • elements that do have ISO 639-1 codes: 34
  • elements whose individual languages have ISO 639-1 codes: 4
      • aka: tw
      • hbs: bs, hr, sr
      • msa: id
      • nor: nb, nn
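The counts in the list above can be cross-checked against each other, and the last item can be read as a small lookup. The sketch below assumes that each sub-item pairs a macrolanguage's ISO 639-3 code with the ISO 639-1 codes of its individual languages; the names `MEMBERS_WITH_ISO_639_1` and `macro_for_iso_639_1` are hypothetical.

```python
from typing import Optional

# Counts as stated in the list above.
NO_ISO_639_2 = 4
NO_ISO_639_1 = 29
HAS_ISO_639_1 = 34

# Every macrolanguage either has or lacks an ISO 639-1 code.
total_macrolanguages = NO_ISO_639_1 + HAS_ISO_639_1
# Those remaining after removing the ones without an ISO 639-2 code.
with_iso_639_2 = total_macrolanguages - NO_ISO_639_2

# Macrolanguage (ISO 639-3 code) -> ISO 639-1 codes of its individual
# languages, taken from the sub-items of the list.
MEMBERS_WITH_ISO_639_1 = {
    "aka": ["tw"],
    "hbs": ["bs", "hr", "sr"],
    "msa": ["id"],
    "nor": ["nb", "nn"],
}


def macro_for_iso_639_1(code: str) -> Optional[str]:
    """Return the macrolanguage whose member has ISO 639-1 `code`, if any."""
    for macro, member_codes in MEMBERS_WITH_ISO_639_1.items():
        if code in member_codes:
            return macro
    return None
```

Under these assumptions the totals come to 63 macrolanguages, 59 of which have an ISO 639-2 code, which agrees with the figure of fifty-nine given earlier.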

List of macrolanguages

This list only includes official data from SIL International.
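| ISO 639-1 | ISO 639-2 | ISO 639-3 | Number of individual languages | Name of macrolanguage |
|-----------|-----------|-----------|--------------------------------|-----------------------|
| ak | aka | aka | 2 | Akan language |
| ar | ara | ara | 28 + retired 2 | Arabic language |
| ay | aym | aym | 2 | Aymara language |
| az | aze | aze | 2 | Azerbaijani language |
|  | bal | bal | 3 | Baluchi language |
|  | bik | bik | 8 + retired 1 | Bikol language |
|  |  | bnc | 5 | Bontok language |
|  | bua | bua | 3 | Buriat language |
|  | chm | chm | 2 | Mari language |
| cr | cre | cre | 6 | Cree language |
|  | del | del | 2 | Delaware language |
|  | den | den | 2 | Slavey language |
|  | din | din | 5 | Dinka language |
|  | doi | doi | 2 | Dogri language |
| et | est | est | 2 | Estonian language |
| fa | fas/per | fas | 2 | Persian language |
| ff | ful | ful | 9 | Fulah language |
|  | gba | gba | 6 + retired 1 | Gbaya language |
|  | gon | gon | 3 + retired 1 | Gondi language |
|  | grb | grb | 5 | Grebo language |
| gn | grn | grn | 5 | Guaraní language |
|  | hai | hai | 2 | Haida language |
|  |  | hbs | 4 | Serbo-Croatian |
|  | hmn | hmn | 25 + retired 1 | Hmong language |
| iu | iku | iku | 2 | Inuktitut language |
| ik | ipk | ipk | 2 | Inupiaq language |
|  | jrb | jrb | 4 + retired 1 | Judeo-Arabic languages |
| kr | kau | kau | 3 | Kanuri language |
|  |  | kln | 9 | Kalenjin languages |
|  | kok | kok | 2 | Konkani language |
| kv | kom | kom | 2 | Komi language |
| kg | kon | kon | 3 | Kongo language |
|  | kpe | kpe | 2 | Kpelle language |
| ku | kur | kur | 3 | Kurdish language |
|  | lah | lah | 7 + retired 1 | Lahnda language |
| lv | lav | lav | 2 | Latvian language |
|  |  | luy | 14 | Luyia language |
|  | man | man | 6 + retired 1 | Manding languages |
| mg | mlg | mlg | 11 + retired 1 | Malagasy language |
| mn | mon | mon | 2 | Mongolian language |
| ms | msa/may | msa | 36 + retired 1 | Malay language |
|  | mwr | mwr | 6 | Marwari language |
| ne | nep | nep | 2 | Nepali language |
| no | nor | nor | 2 | Norwegian language |
| oj | oji | oji | 7 | Ojibwa language |
| or | ori | ori | 2 | Oriya language |
| om | orm | orm | 4 | Oromo language |
| ps | pus | pus | 3 | Pashto language |
| qu | que | que | 43 + retired 1 | Quechua language |
|  | raj | raj | 6 | Rajasthani language |
|  | rom | rom | 7 | Romany language |
| sa | san | san | 2 | Sanskrit language |
| sq | sqi/alb | sqi | 4 | Albanian language |
| sc | srd | srd | 4 | Sardinian language |
| sw | swa | swa | 2 | Swahili language |
|  | syr | syr | 2 | Syriac language |
|  | tmh | tmh | 4 | Tuareg languages |
| uz | uzb | uzb | 2 | Uzbek language |
| yi | yid | yid | 2 | Yiddish language |
|  | zap | zap | 58 + retired 1 | Zapotec language |
| za | zha | zha | 16 + retired 2 | Zhuang languages |
| zh | zho/chi | zho | 19 | Chinese language |
|  | zza | zza | 2 | Zaza language |
| 34 | 59 | 63 | 444 + retired 15 | total codes |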
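The "Number of individual languages" column mixes plain counts with entries of the form "28 + retired 2". A minimal sketch of how such cells might be parsed and summed is given below; the function names `parse_count` and `total_counts` are hypothetical, and the cell format is assumed to be exactly the two shapes seen in the list.

```python
import re

# Matches "19" or "28 + retired 2" (the two cell shapes used in the list).
_COUNT_RE = re.compile(r"^\s*(\d+)(?:\s*\+\s*retired\s+(\d+))?\s*$")


def parse_count(cell: str) -> tuple[int, int]:
    """Split a count cell into (active, retired) individual-language counts."""
    match = _COUNT_RE.match(cell)
    if match is None:
        raise ValueError(f"unrecognized count cell: {cell!r}")
    active = int(match.group(1))
    retired = int(match.group(2) or 0)  # no "retired" part means zero
    return active, retired


def total_counts(cells: list[str]) -> tuple[int, int]:
    """Sum the active and retired counts over several cells."""
    active_total = retired_total = 0
    for cell in cells:
        active, retired = parse_count(cell)
        active_total += active
        retired_total += retired
    return active_total, retired_total
```

Applied to every row of the list, a sum of this kind should reproduce the totals row of 444 individual languages plus 15 retired.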

| ISO 639-1 Code | ISO 639-2 Code | English name of Language | French name of Language | Date Added or Changed | Category of Change | Notes |
|----------------|----------------|--------------------------|-------------------------|-----------------------|--------------------|-------|
|  |  | Serbo-Croatian | serbo-croate | 2000-02-18 | Dep | This code was deprecated in 2000 because there were separate language codes for each individual language represented. It was published in a revision of ISO 639-1, but was never included in ISO 639-2. It is considered a macrolanguage in ISO 639-3. Its deprecated status was reaffirmed by the ISO 639 JAC in 2005. |
| sr | srp | Serbian | serbe | 2008-06-28 | CC | ISO 639-2/B code deprecated in favor of ISO 639-2/T code |
| hr | hrv | Croatian | croate | 2008-06-28 | CC | ISO 639-2/B code deprecated in favor of ISO 639-2/T code |