Substrate in Romanian
The Romanian language evolved from Vulgar Latin, a legacy of the Roman Empire's presence in the region. Linguists believe it also contains remnants of the native languages spoken there before that time.
The proposed substratal elements in Romanian are mostly lexical items. To determine whether a word comes from the substratum, it is first compared with Latin and with the languages with which Romanian came into contact, and checked as a possible internal formation. If no match is found, it is compared with Albanian vocabulary, Thracian remnants or reconstructed Proto-Indo-European words.
In addition to vocabulary, some other features of Eastern Romance, such as phonological features and elements of grammar, may also come from Paleo-Balkan languages.
Romanian developed from the Common Romanian language, which in turn developed from Vulgar Latin. According to a widely accepted theory, the language formed over a large territory lying both north and south of the Danube, more precisely north of the Jireček Line. Other scholars place the origin of the Romanian language in the Balkan Peninsula, strictly south of the Danube. The Cambridge History of the Romance Languages, published in 2013, concluded that the "historical, archaeological and linguistic data available do not seem adequate" to determine the territory where the development of the Romanian language began.
Lexical items
The study of the substrate involves comparative methods applied to:
- Albanian and its reconstructed ancient precursor, Proto-Albanian, an Indo-European language and the only surviving representative of the Albanoid branch, belonging to the Paleo-Balkan group of antiquity. Albanian varieties are today spoken by approximately 6 million people in the Balkans, primarily in Albania, Kosovo, North Macedonia, Serbia, Montenegro and Greece. Albanian, especially the Tosk dialect, also represents one of the core languages of the Balkan Sprachbund.
- Thraco-Dacian or Thracian, a language that, although almost unattested, has left traces in toponymy and inscriptions.
- Proto-Indo-European, if none of the other languages yields any results.
Comparative methods applied to Albanian
In general, words assumed to belong to the substratum can be placed into two categories:
those related to nature and the natural world
- terrain: ciucă, groapă, mal, măgură, noian;
- bodies of water: bâlc, pârâu;
- flora: brusture, bung, ciump, coacăză, copac, curpen, druete, leurdă, ghimpe, mazăre, mărar, mugure, sâmbure, spânz, strugure, țeapă;
- fauna: balaur, bală, baligă, barză, brad, călbează, căpușă, cioară, cioc, ciut, ghionoaie, măgar, mânz, murg, mușcoi, năpârcă, pupăză, rață, strepede, șopârlă, știră, țap, viezure, vizuină;
- food: abur, brânză, fărâmă, grunz, sarbăd, scrum, urdă, zară;
- clothing: bască, brâu, căciulă, sarică;
- housing: argea, cătun, gard;
- body: buză, ceafă, ciuf, grumaz, gușă, rânză;
- related activities: baci, bâr, buc, grapă, gresie, lete, strungă, țarc, zgardă.
Words possibly of substratum origin, but not generally agreed upon by linguists, are: arichiță, băiat, băl, brâncă, borț, bulz, burduf, burtă, codru, Crăciun, creț, cruța, curma, daltă, dărâma, fluture, lai, mătură, mire, negură, păstaie, scorbură, spuză, stăpân, sterp, stână, traistă.
Comparative methods applied to Thraco-Dacian and/or other Indo-European languages
The comparative method can be extended to other languages of the Indo-European family, including ones from which Romanian could not have borrowed directly or indirectly, in order to reconstruct Thraco-Dacian substratum words. This yields results with varying degrees of probability. Between 80 and 100 words belong to this category.
Substratum words like mal have almost identical correspondents in Albanian (mal), but they can also be related to toponyms like Dacia Maluensis, later renamed Dacia Ripensis by the Romans.
The names of all rivers longer than 500 km, and of half of those between 200 and 500 km long, derive from the pre-Latin substratum, according to linguist and philologist Oliviu Felecan. Similarly, linguist Grigore Brâncuș states that almost the entire major hydronymy has been transmitted from Dacian to Romanian. Other linguists have pointed out that the present Romanian forms of these hydronyms indicate that they were borrowed from Slavs or Hungarians.
Figure: Major rivers of Romania. According to one theory, Romanian has preserved the substrate forms of their names instead of the Latin forms. Other linguists say that the Romanian forms of these river names indicate that they are loanwords, mainly from Slavic and Hungarian.
| Name in Romanian | Proposed etymon | Language of the etymon |
| Dunăre | Donaris | Thracian |
| Mureș | morisjo | Dacian |
| Olt | *ol- | Proto-Indo-European |
| Prut | *pltus | Proto-Indo-European |
| Siret | *ser- | Proto-Indo-European |
| Tisa | Tibisio | Dacian |
| Argeș | *arg- | Thracian |
| Buzău | *bhuǧ- | Thracian |
| Crișul | kres- | Thracian |
| Jiu | Gilpil | Dacian |
| Someș | çam- | Sanskrit |
| Timiș | *ti- | Proto-Indo-European |
| Ampoi | Ampee | Daco-Moesian |
| Bârzava | berzava | Thracian |
| Gilort | sil-arta | Dacian |
| Ibru | *eybhro | Proto-Indo-European |
| Vedea | *ued- | Proto-Indo-European |
| Nera | *ner- | Proto-Indo-European |
| Năruia | *ner- | Dacian |
| Săsar | *ser- | Proto-Indo-European |
| Strei | *sreu | Proto-Indo-European |
Phonetic, morphological and syntactic features
A couple of phonetic changes have been agreed on as substratum influence:
- the post-alveolar fricative ș (/ʃ/) comes from the voiceless fricative s in a soft position, for example Lat. serpens > Rom. șarpe.
- rhotacism of the consonant n, seen only marginally in Romanian, is a general rule in Istro-Romanian and Tosk Albanian for lexical items dating from before the contact with Slavic languages.
Likewise, the morphological and syntactic features attributed to the substratum, identified by comparison with Albanian and other languages of the Balkan Sprachbund, are subject to scholarly debate, since the grammatical structure of the ancient languages of the Balkans, except Greek, is unattested.
A difficult research topic
Numerous language studies and research papers discuss the problems of the substrate in Romanian, considered by some to be the most controversial and difficult part of the study of the Romanian language, since its nature and development could explain the evolution of Latin into Romanian.
Some linguists propose that a number of words presented in the standard literature as borrowings from a Slavic language or from Hungarian may actually have developed from reconstructed words of local Indo-European languages, and that the neighboring languages borrowed them from Romanian. Though the substratum status of many Romanian words is not much disputed, their status as Dacian words is controversial, some more than others, since there are no significant surviving written examples of the Dacian language. Many of the possible pre-Roman lexical items of Romanian have Albanian parallels. If they are in fact substratum words cognate with the Albanian ones, and not loanwords from Albanian, this indicates that the substrate language of Romanian may have been on the same Indo-European branch as Albanian.
Other languages
The Bulgarian Thracologist Vladimir Georgiev developed the theory that the Romanian language has a "Daco-Moesian" language as its substrate, a hypothesised language that, according to him, had a number of features which distinguished it from the Thracian language spoken further south, across the Haemus range.
Some Romanian substratum words are also found in languages other than Romanian, having entered them via Romanian dialects. For example, bryndza is a type of cheese made in Eastern Austria, Poland, the Czech Republic, Slovakia and Ukraine, its name being derived from brânză, the Romanian word for cheese.