Unclassified language
An unclassified language is a language whose genetic affiliation to other languages has not been established. Languages can be unclassified for a variety of reasons, mostly due to a lack of reliable data but sometimes due to the confounding influence of language contact, when different layers of a language's vocabulary or morphology point in different directions and it is not clear which layer represents the ancestral form of the language. Some poorly known extinct languages, such as Gutian, are simply unclassifiable, and it is unlikely the situation will ever change.
A supposedly unclassified language may turn out not to be a language at all, or even a distinct dialect, but merely a family, tribal or village name, or an alternative name for a people or language that is classified.
If a language's genetic relationship has not been established after significant documentation of the language and comparison with other languages and families, as in the case of Basque in Europe, it is considered a language isolate – that is, it is classified as a language family of its own. An 'unclassified' language therefore is one which may still turn out to belong to an established family once better data is available or more thorough comparative research is done. Extinct unclassified languages for which little evidence has been preserved are likely to remain in limbo indefinitely, unless lost documents or a surviving speaking population are discovered.
Classification challenges
An example of a language that has caused multiple problems for classification is Mimi of Decorse in Chad. This language is only attested in a single list of words collected ca. 1900. At first it was thought to be a Maban language, because of similarities to Maba, the first Maban language to be described. However, as other languages of the Maban family were described, it became clear that the similarities were solely with Maba itself, and the relationship was too distant for Mimi to be related specifically to Maba and not equally to the other Maban languages. The obvious similarities are therefore now thought to be due to borrowings from Maba, which is the socially dominant language in the area. When such loans are discounted, there is much less data to classify Mimi with, and what does remain is not particularly similar to any other language or language family. Mimi might therefore be a language isolate, or perhaps a member of some other family related to Maban in the proposed but as yet undemonstrated Nilo-Saharan phylum. It would be easier to address the problem with better data, but no one has been able to find speakers of the language again.
It also happens that a language may be unclassified within an established family. That is, it may be obvious that it is, say, a Malayo-Polynesian language, but not clear in which branch of Malayo-Polynesian it belongs. When a family consists of many similar languages with a great degree of confusing contact, a large number of languages may be effectively unclassified in this manner. Families where this is a substantial problem include Malayo-Polynesian, Bantu, Pama–Nyungan, and Arawakan.
Examples by reason
There are hundreds of unclassified languages. Most of them are extinct, although a relatively small number are still spoken; in the following lists, the extinct languages are labeled with a dagger.
Absence of data
Some languages are unclassifiable, not just unclassified: although there may be a record that the language existed, there may not be enough material in it to analyze and classify, especially in the case of now-extinct languages. Unclassifiable languages with an absence of data include:
- Sentinelese – a living presumed language of an uncontacted people; assumed to be Ongan
- Weyto – speculated to have been Agaw
- Nam – data remains undeciphered; probably Sino-Tibetan
- Harappan † – data remains undeciphered
- Cypro-Minoan † – data remains undeciphered
- Lullubi
- Guale –Yamasee
- Himarimã – a living presumed language of an uncontacted people; assumed to be Arawan
- Nagarchal – assumed to have been Dravidian
- Kwisi
- Ancient Cappadocian – possibly Anatolian
- Lycaonian – possibly Anatolian
- Zapotec
- Otomi
- Moksela – possibly one of the Central Maluku languages
- Gomba
- Palumata – perhaps a dialect of the Hukumina language
- Giyug – possibly Wagaydyic
- Gujambal
- Karranga – likely Pama–Nyungan
- Yugul – likely Marran
- Aguano – may be Arawakan
- Alagüilac – may be related to Xinca
- Avoyel
- Balomar – likely a dialect of the Charrúa language
- Flecheiro – assumed to be Katukinan
- Janambre
- Jumanos
- Majena
- Moneton – likely Siouan
- Opelousa
- Pedee – possibly Siouan
- Tremembé
- Truká
- Wakoná
- Wasu
Scarcity of data
- Solano – possibly a language isolate
- Cacán
- Kujargé – possibly Afroasiatic
- Bung – most likely Niger–Congo
- Luo
- Komta
- Wawu
- Kambojan
- Dima-Bottego
- Philistine – perhaps either Afroasiatic or Indo-European
- Iberian
- Minoan
- Eteocretan
- Hattic – probably a language isolate
- Kaskian – possibly related to Hattic
- Kassite – possibly Hurro-Urartian
- Gutian
- Hunnic
- Xiongnu – possibly Para-Yeniseian or an isolate
- Tuoba – possibly Para-Mongolic or an isolate
- Rouran – possibly Para-Mongolic or an isolate
- Beothuk – assumed to have been related to Algonquian languages
- Meroitic – possibly Nilo-Saharan or Afroasiatic
- Guanahatabey – presumed to have been related to Warao
- Macorix – presumed to have been related to Warao
- Pankararú – likely a language isolate
- Ramanos
- Tartessian
- Ligurian – probably Indo-European
- Rutulian
- Elymian – likely Indo-European
- Sicanian
- Eteocypriot
- Tambora – possibly a language isolate
- Karami
- Makolkol
Unrelated to nearby languages and not commonly examined
- Bangime
- Jalaa
- Kwaza
- Xocó – not clear if it was a single language
- Mpre
Basic vocabulary unrelated to other languages
- Bayot
- Laal
Not closely related to other languages and no academic consensus
- Ongota
- Shabo
- Omaio
- Kenaboi
Languages of dubious existence
- Oropom
- Imeraguen
- Nemadi
- Rer Bare
- Wutana
- Trojan
- North Picene
- Quimbaya