Haplogroup I (mtDNA)


Haplogroup I is a human mitochondrial DNA haplogroup. It is believed to have originated about 21,000 years ago, during the Last Glacial Maximum. The haplogroup is unusual in that it is now widely distributed geographically, but is common in only a few small areas of East Africa, West Asia and Europe. It is especially common among the El Molo and Rendille peoples of Kenya, various regions of Iran, the Lemko people of Slovakia, Poland and Ukraine, the island of Krk in Croatia, the department of Finistère in France and some parts of Scotland and Ireland.

Origin

Haplogroup I is a descendant of haplogroup N1a1b and a sibling of haplogroup N1a1b1. It is believed to have arisen somewhere in West Asia between 17,263 and 24,451 years before present, with a coalescence age of 20.1 thousand years. More specifically, a Near Eastern origin has been suggested. The haplogroup has diverged into at least seven distinct clades (branches I1–I7), dated to between 16 and 6.8 thousand years ago. The hypothesis of a Near Eastern origin rests on the observation that all haplogroup I clades, especially those from the Late Glacial period, include mitogenomes from the Near East. The age estimates and dispersal of some subclades are similar to those of major subclades of the mtDNA haplogroups J and T, indicating a possible dispersal of haplogroup I into Europe during the Late Glacial and postglacial periods, several millennia before the European Neolithic. Some subclades show signs of the Neolithic diffusion of agriculture and pastoralism within Europe.
A similar view puts more emphasis on the Persian Gulf region of the Near East.

Distribution

Haplogroup I is found at moderate to low frequencies in East Africa, Europe, West Asia and South Asia. In addition to the seven confirmed clades, the rare basal/paraphyletic clade I* has been observed in three individuals: two from Somalia and one from Iran.

Africa

The highest frequencies of mitochondrial haplogroup I observed so far appear in the Cushitic-speaking El Molo and Rendille in northern Kenya. The clade is also found at comparable frequencies among Soqotrans.
| Population | Location | Language family | N (carriers/sample size) | Frequency |
|---|---|---|---|---|
| Amhara | Ethiopia | Afro-Asiatic > Semitic | 1/120 | 0.83% |
| Egyptians | Egypt | Afro-Asiatic > Semitic | 2/34 | 5.9% |
| Beta Israel | Ethiopia | Afro-Asiatic > Cushitic | 0/29 | 0.00% |
| Dawro Konta | Ethiopia | Afro-Asiatic > Omotic | 0/137 | 0.00% |
| Ethiopia | Ethiopia | Undetermined | 0/77 | 0.00% |
| Ethiopian Jews | Ethiopia | Afro-Asiatic > Cushitic | 0/41 | 0.00% |
| Gurage | Ethiopia | Afro-Asiatic > Semitic | 1/21 | 4.76% |
| Hamer | Ethiopia | Afro-Asiatic > Omotic | 0/11 | 0.00% |
| Ongota | Ethiopia | Afro-Asiatic > Cushitic | 0/19 | 0.00% |
| Oromo | Ethiopia | Afro-Asiatic > Cushitic | 0/33 | 0.00% |
| Tigrai | Ethiopia | Afro-Asiatic > Semitic | 0/44 | 0.00% |
| Daasanach | Kenya | Afro-Asiatic > Cushitic | 0/49 | 0.00% |
| Elmolo | Kenya | Afro-Asiatic > Cushitic | 12/52 | 23.08% |
| Luo | Kenya | Nilo-Saharan | 0/49 | 0.00% |
| Maasai | Kenya | Nilo-Saharan | 0/81 | 0.00% |
| Nairobi | Kenya | Niger-Congo | 0/100 | 0.00% |
| Nyangatom | Kenya | Nilo-Saharan | 1/112 | 0.89% |
| Rendille | Kenya | Afro-Asiatic > Cushitic | 3/17 | 17.65% |
| Samburu | Kenya | Nilo-Saharan | 3/35 | 8.57% |
| Turkana | Kenya | Nilo-Saharan | 0/51 | 0.00% |
| Hutu | Rwanda | Niger-Congo | 0/42 | 0.00% |
| Dinka | Sudan | Nilo-Saharan | 0/46 | 0.00% |
| Sudan | Sudan | Undetermined | 0/102 | 0.00% |
| Burunge | Tanzania | Afro-Asiatic > Cushitic | 1/38 | 2.63% |
| Datoga | Tanzania | Nilo-Saharan | 0/57 | 0.00% |
| Iraqw | Tanzania | Afro-Asiatic > Cushitic | 0/12 | 0.00% |
| Sukuma | Tanzania | Niger-Congo | 0/32 | 0.00% |
| Turu | Tanzania | Niger-Congo | 0/29 | 0.00% |
| Yemeni | Yemen | Afro-Asiatic > Semitic | 0/114 | 0.00% |
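The frequencies in the table above are simply the number of carriers divided by the sample size. The sketch below recomputes three of them and, as an illustration only, attaches a Wilson score interval to show how wide the uncertainty is for small samples such as the Rendille (17 individuals). The interval is not taken from the cited studies, and the function names are hypothetical.

```python
from math import sqrt


def frequency_percent(carriers: int, sample_size: int) -> float:
    """Share of the sample carrying haplogroup I, in percent."""
    return 100.0 * carriers / sample_size


def wilson_interval_percent(carriers: int, sample_size: int, z: float = 1.96):
    """Approximate 95% Wilson score interval for a proportion, in percent.

    This is an illustrative choice of interval, not the method used in
    the studies behind the table.
    """
    p = carriers / sample_size
    denom = 1 + z ** 2 / sample_size
    centre = (p + z ** 2 / (2 * sample_size)) / denom
    half = z * sqrt(p * (1 - p) / sample_size
                    + z ** 2 / (4 * sample_size ** 2)) / denom
    return 100 * (centre - half), 100 * (centre + half)


# Counts copied from the table: (carriers, sample size).
samples = {"Elmolo": (12, 52), "Rendille": (3, 17), "Samburu": (3, 35)}

for name, (carriers, n) in samples.items():
    freq = frequency_percent(carriers, n)
    low, high = wilson_interval_percent(carriers, n)
    print(f"{name}: {freq:.2f}% (interval {low:.1f}%-{high:.1f}%)")
```

The point estimates reproduce the tabulated 23.08%, 17.65% and 8.57%. The intervals are a reminder that a frequency based on a few dozen individuals is a rough estimate.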

Asia

Haplogroup I is present across West Asia and Central Asia, and is also found at trace frequencies in South Asia. Its highest frequency area is perhaps in northern Iran. Terreros 2011 notes that it also has high diversity there and reiterates past studies that have suggested that this may be its place of origin. I* has been found at 4.2% in the Svan population of Georgia (Alfonso-Sánchez MA, Martínez-Bouzas C, Castro A, Peña JA, Fernández-Fernández I, Herrera RJ, de Pancorbo MM, "Sequence polymorphisms of the mtDNA control region in a human isolate: the Georgians from Swanetia"). The table below shows some of the populations where it has been detected.
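| Population | Language family | N (carriers/sample size) | Frequency |
|---|---|---|---|
| Baluch | Indo-European | 0/39 | 0.00% |
| Brahui | Dravidian | 0/38 | 0.00% |
| Caucasus * | Kartvelian | 1/58 | 1.80% |
| Druze | | 11/311 | 3.54% |
| Gilaki | Indo-European | 0/37 | 0.00% |
| Gujarati | Indo-European | 0/34 | 0.00% |
| Hazara | Indo-European | 0/23 | 0.00% |
| Hunza Burusho | Isolate | 2/44 | 4.50% |
| India | | 8/2544 | 0.30% |
| Iran | | 3/31 | 9.70% |
| Iran | | 2/117 | 1.70% |
| Kalash | Indo-European | 0/44 | 0.00% |
| Kurdish | Indo-European | 1/20 | 5.00% |
| Kurdish | Indo-European | 1/32 | 3.10% |
| Lur | Indo-European | 0/17 | 0.00% |
| Makrani | Indo-European | 0/33 | 0.00% |
| Mazandarian | Indo-European | 1/21 | 4.80% |
| Pakistani | Indo-European | 0/100 | 0.00% |
| Pakistan | | 1/145 | 0.69% |
| Parsi | Indo-European | 0/44 | 0.00% |
| Pathan | Indo-European | 1/44 | 2.30% |
| Persian | Indo-European | 1/42 | 2.40% |
| Shugnan | Indo-European | 1/44 | 2.30% |
| Sindhi | Indo-European | 1/23 | 8.70% |
| Turkish | Turkic | 2/40 | 5.00% |
| Turkish * | Turkic | 1/50 | 2.00% |
| Turkmen | Turkic | 0/41 | 0.00% |
| Uzbek | Turkic | 0/42 | 0.00% |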

Europe

Eastern Europe

In Eastern Europe, the frequency of haplogroup I is generally lower than in Western Europe, but it is more consistent between populations, with fewer extreme highs or lows. There are two notable exceptions. Nikitin 2009 found that Lemkos in the Carpathian Mountains have the "highest frequency of haplogroup I in Europe, identical to that of the population of Krk Island in the Adriatic Sea".
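| Population | N (carriers/sample size) | Frequency |
|---|---|---|
| Boyko | 0/20 | 0.00% |
| Hutsul | 0/38 | 0.00% |
| Lemko | 6/53 | 11.32% |
| Belorussians | 2/92 | 2.17% |
| Russia | 3/215 | 1.40% |
| Romanians | 59 | 0.00% |
| Romanians | 46 | 2.17% |
| Russia | 1/50 | 2.0% |
| Ukraine | 0/18 | 0.00% |
| Croatia | 4/277 | 1.44% |
| Croatia | 15/133 | 11.28% |
| Croatia | 1/105 | 0.95% |
| Croatia | 2/108 | 1.9% |
| Croatia | 1/98 | 1% |
| Herzegovinians | 1/130 | 0.8% |
| Bosnians | 6/247 | 2.4% |
| Serbians | 4/117 | 3.4% |
| Macedonians | 2/146 | 1.4% |
| Macedonian Romani | 7/153 | 4.6% |
| Slovenians | 2/104 | 1.92% |
| Bosnians | 4/144 | 2.78% |
| Poles | 8/436 | 1.83% |
| Caucasus * | 1/58 | 1.80% |
| Russians | 5/201 | 2.49% |
| Bulgaria/Turkey | 2/102 | 1.96% |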

Western Europe

In Western Europe, haplogroup I is most common in Northwestern Europe, where its frequency is between 2 and 5 percent. Its highest frequency is in Brittany, France, where it exceeds 9 percent of the population in Finistère. It is uncommon and sometimes absent in other parts of Western Europe.
| Population | Language | N (carriers/sample size) | Frequency |
|---|---|---|---|
| Austria/Switzerland | | 4/187 | 2.14% |
| Basque | Basque/Labourdin côtier-haut navarrais | 0/56 | 0.00% |
| Basque | Basque/Occidental | 0/55 | 0.00% |
| Basque | Basque/Biscayen | 1/59 | 1.69% |
| Basque | Basque/Haut-navarrais méridional | 2/63 | 3.17% |
| Basque | Basque/Gipuzkoan | 0/57 | 0.00% |
| Basque | Basque/Bas-navarrais | 0/68 | 0.00% |
| Basque | Basque/Haut-navarrais septentrional | 0/51 | 0.00% |
| Basque | Basque/Roncalais-salazarais | 0/55 | 0.00% |
| Basque | Basque/Souletin | 0/62 | 0.00% |
| Basque | Basque/Biscayen | 0/64 | 0.00% |
| Béarn | French | 0/51 | 0.00% |
| Bigorre | French | 0/44 | 0.00% |
| Burgos | Spanish | 0/25 | 0.00% |
| Cantabria | Spanish | 0/18 | 0.00% |
| Chalosse | French | 0/58 | 0.00% |
| Denmark | | 6/105 | 5.71% |
| England/Wales | | 12/429 | 3.03% |
| Finland | | 1/49 | 2.04% |
| Finland/Estonia | | 5/202 | 2.48% |
| France | | 2/22 | 9.10% |
| France | | 0/40 | 0.00% |
| France | | 0/39 | 0.00% |
| France | | 2/72 | 2.80% |
| France | | 2/37 | 5.40% |
| France/Italy | | 2/248 | 0.81% |
| Germany | | 12/527 | 2.28% |
| Gran Canaria | | 6/214 | 2.80% |
| Iceland | | 21/467 | 4.71% |
| Ireland | | 3/128 | 2.34% |
| Italy | | 2/48 | 4.20% |
| La Rioja | Spanish | 1/51 | 1.96% |
| North Aragon | Spanish | 0/26 | 0.00% |
| Orkney | | 5/152 | 3.29% |
| Saami | | 0/176 | 0.00% |
| Scandinavia | | 12/645 | 1.86% |
| Scotland | | 39/891 | 4.38% |
| Spain/Portugal | | 2/352 | 0.57% |
| Sweden | | 0/37 | 0.00% |
| Western Bizkaia | Spanish | 0/18 | 0.00% |
| Western Isles/Isle of Skye | | 15/246 | 6.50% |

Historic and prehistoric samples

Haplogroup I was until recently absent from ancient European samples found in Paleolithic and Mesolithic grave sites. In 2017, a sample with subclade I3, dated to 9124–7851 BC, was found at a site on the Italian island of Sardinia. In the Near East, a sample from the Levant with a not-yet-defined subclade was dated to 8850–8750 BC, and a younger sample from Iran with subclade I1c was dated to 3972–3800 BC. A sample with a not-yet-defined subclade was also found in Neolithic Spain. Haplogroup I displays a strong connection with the Indo-European migrations, especially its I1, I1a1 and I3a subclades, which have been found in the Poltavka and Srubnaya cultures in Russia, among ancient Scythians, and in Corded Ware and Unetice culture burials in Saxony. I3a has also been found in two DNA-tested skeletons of the Unetice culture from Lubingine, Germany, dated to 2200–1800 BC. Haplogroup I has also been noted at significant frequencies in more recent historic grave sites.
In 2013, Nature announced the publication of the first genetic study utilizing next-generation sequencing to ascertain the ancestral lineage of an Ancient Egyptian individual. The research was led by Carsten Pusch of the University of Tübingen in Germany and Rabab Khairat, who released their findings in the Journal of Applied Genetics. DNA was extracted from the heads of five Egyptian mummies that were housed at the institution. All the specimens were dated to between 806 BC and 124 AD, a time frame corresponding with the Late Dynastic and Ptolemaic periods. The researchers observed that one of the mummified individuals likely belonged to the I2 subclade. Haplogroup I has also been found among ancient Egyptian mummies excavated at the Abusir el-Meleq archaeological site in Middle Egypt, which date from the Pre-Ptolemaic/late New Kingdom, Ptolemaic, and Roman periods.
Haplogroup I5 has also been observed among specimens at the mainland cemetery in Kulubnarti, Sudan, which date from the Early Christian period.

Samples with unknown subclades

The frequency of haplogroup I may have undergone a reduction in Europe following the Middle Ages. An overall frequency of 13% was found in ancient samples from Denmark and Scandinavia dating from the Iron Age to the Medieval Age, compared to only 2.5% in modern samples. As haplogroup I was not observed in any ancient Italian, Spanish, British or central European populations, nor among early central European farmers or Neolithic samples, according to the authors "Haplogroup I could, therefore, have been an ancient Southern Scandinavian type "diluted" by later immigration events".

Subclades

Tree

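This phylogenetic tree of haplogroup I subclades, with time estimates, is based on published research.
| Haplogroup | Age estimate (thousand years) | 95% confidence interval (thousand years) |
|---|---|---|
| N1a1b | 28.6 | 23.5–33.9 |
| I | 20.1 | 18.4–21.9 |
| I1 | 16.3 | 14.6–18.0 |
| I1a | 11.6 | 9.9–13.3 |
| I1a1 | 4.9 | 4.2–5.6 |
| I1a1a | 3.8 | 3.3–4.4 |
| I1a1b | 1.4 | 0.5–2.2 |
| I1a1c | 2.5 | 1.3–3.7 |
| I1a1d | 1.8 | 1.0–2.6 |
| I1b | 13.4 | 11.3–15.5 |
| I1c | 10.3 | 8.4–12.2 |
| I1c1 | 7.2 | 5.4–9.0 |
| I1c1a | 4.0 | 2.5–5.4 |
| I2'3 | 12.6 | 10.4–14.7 |
| I2 | 6.8 | 6.0–7.6 |
| I2a | 4.7 | 3.8–5.7 |
| I2a1 | 3.2 | 2.1–4.4 |
| I2b | 1.7 | 0.5–2.9 |
| I2c | 4.7 | 3.6–5.8 |
| I2d | 3.0 | 1.1–4.8 |
| I2e | 3.1 | 1.4–4.8 |
| I3 | 10.6 | 8.8–12.4 |
| I3a | 7.4 | 6.1–8.7 |
| I3a1 | 6.1 | 4.7–7.5 |
| I3b | 2.6 | 1.1–4.2 |
| I3c | 9.4 | 7.6–11.2 |
| I4 | 15.1 | 12.3–18.0 |
| I4a | 6.4 | 5.4–7.4 |
| I4a1 | 5.7 | 4.5–6.7 |
| I4b | 8.4 | 5.8–10.9 |
| I5 | 18.4 | 16.4–20.3 |
| I5a | 16.0 | 14.0–17.9 |
| I5a1 | 9.2 | 7.1–11.3 |
| I5a2 | 12.3 | 10.2–14.4 |
| I5a2a | 1.6 | 1.0–2.1 |
| I5a3 | 4.8 | 2.8–6.8 |
| I5a4 | 5.6 | 3.5–7.8 |
| I5b | 8.8 | 6.3–11.2 |
| I6 | 18.4 | 16.2–20.6 |
| I6a | 5.3 | 3.5–7.0 |
| I6b | 13.1 | 10.4–15.8 |
| I7 | 9.1 | 6.3–11.9 |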
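In a consistent tree, no subclade should have an older point estimate than the clade it descends from. The sketch below checks this for the age estimates listed above. It assumes that a clade's parent can usually be read off its label by dropping the final character (I1a1 descends from I1a), with the exceptions listed explicitly. That rule is an assumption about the naming convention, not something stated in the source, and the function names are hypothetical.

```python
# Point age estimates in thousands of years, copied from the table above.
AGE_KYA = {
    "N1a1b": 28.6, "I": 20.1,
    "I1": 16.3, "I1a": 11.6, "I1a1": 4.9, "I1a1a": 3.8, "I1a1b": 1.4,
    "I1a1c": 2.5, "I1a1d": 1.8, "I1b": 13.4, "I1c": 10.3, "I1c1": 7.2,
    "I1c1a": 4.0,
    "I2'3": 12.6, "I2": 6.8, "I2a": 4.7, "I2a1": 3.2, "I2b": 1.7,
    "I2c": 4.7, "I2d": 3.0, "I2e": 3.1,
    "I3": 10.6, "I3a": 7.4, "I3a1": 6.1, "I3b": 2.6, "I3c": 9.4,
    "I4": 15.1, "I4a": 6.4, "I4a1": 5.7, "I4b": 8.4,
    "I5": 18.4, "I5a": 16.0, "I5a1": 9.2, "I5a2": 12.3, "I5a2a": 1.6,
    "I5a3": 4.8, "I5a4": 5.6, "I5b": 8.8,
    "I6": 18.4, "I6a": 5.3, "I6b": 13.1,
    "I7": 9.1,
}

# Parents that cannot be derived by dropping the last character of the label.
SPECIAL_PARENTS = {"I": "N1a1b", "I2'3": "I", "I2": "I2'3", "I3": "I2'3"}


def parent_of(clade):
    """Return the assumed parent clade, or None for the root of this table."""
    if clade == "N1a1b":
        return None
    if clade in SPECIAL_PARENTS:
        return SPECIAL_PARENTS[clade]
    return clade[:-1]  # assumed naming rule, e.g. "I5a2a" -> "I5a2"


def older_than_parent():
    """List clades whose point estimate exceeds that of their parent."""
    inconsistent = []
    for clade, age in AGE_KYA.items():
        parent = parent_of(clade)
        if parent is not None and age > AGE_KYA[parent]:
            inconsistent.append(clade)
    return inconsistent


print(older_than_parent())  # an empty list means no inconsistencies
```

The check uses point estimates only. Overlapping confidence intervals mean that a small inversion, had one appeared, would not necessarily indicate an error.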

Distribution

I1

It formed during the Last Glacial pre-warming period. It is found mainly in Europe and the Near East, and occasionally in North Africa and the Caucasus.
It is the most frequent clade of the haplogroup.
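| GenBank ID | Population | Source |
|---|---|---|
| JQ702472 | | |
| JQ702567 | Germany | |
| JQ704077 | Germany | |
| JQ705840 | | |
| | Iran | FamilyTreeDNA |
| | Spanish | FamilyTreeDNA |
| | Poland | |
| | Swedish | FamilyTreeDNA |
| | Germany | FamilyTreeDNA |
| | Russian | FamilyTreeDNA |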
I1a
The subclade frequency peaks are mostly located in North-Eastern Europe.
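| GenBank ID | Population | Source |
|---|---|---|
| | | FamilyTreeDNA |
| | Turkey | FamilyTreeDNA |
| | Chuvash | |
| | Iran | |
| | Iran | |
| | Assyrians | Shamoon-Pour 2019 |
| | Italy | FamilyTreeDNA |
| | Morocco | Colombo 2025 |
I1a1
| GenBank ID | Population | Source |
|---|---|---|
| | Portugal | |
| | Tunisia | |
| | Turkey | |
| | Morocco | |
| | Denmark | |
| | Denmark | |
| | Spain | |
| | Czech | FamilyTreeDNA |
| | England | FamilyTreeDNA |
| | Poland | |
| | Scotland | FamilyTreeDNA |
| | Ireland | FamilyTreeDNA |
| | Kazakhs | |
| | Scottish | FamilyTreeDNA |
| | Canary Islanders | |
| | Canary Islanders | |
| | Canary Islanders | |
| | Canary Islanders | |
I1a1a
| GenBank ID | Population | Source |
|---|---|---|
| | Finland | |
| | Finland | |
| | Finland | |
| | Finland | |
| | Finland | |
| | Ukrainian | FamilyTreeDNA |
| | Poland | |
| | Italy | |
| | Sardinians | |
| | Sardinians | |
| | Russia | |
| | Russia | |
| | Russia | |
| | Poland | |
| | Poland | |
| | Hungary | |
| | Poland | |
| | Poland | |
| | Swedish | FamilyTreeDNA |
| | Russian | FamilyTreeDNA |
| | Finnish | FamilyTreeDNA |
| | Brazil | |
| | Brazil | |
| | Slovak | Grzybowski 2023 |
| | Slovak | Grzybowski 2023 |
| | Slovak | Grzybowski 2023 |
| | Czech | Grzybowski 2023 |
| | Czech | Grzybowski 2023 |
| | Poland | |
| | Poland | |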
I1a1a1
I1a1a2
I1a1a3
I1a1a3a
I1a1b
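| GenBank ID | Population | Source |
|---|---|---|
| | Denmark | |
| | Denmark | |
| | Ireland | FamilyTreeDNA |
| | Swedes | FamilyTreeDNA |
| | English | FamilyTreeDNA |
| | Scotland | FamilyTreeDNA |
| | Finland | FamilyTreeDNA |
| | Ireland | FamilyTreeDNA |
| | Ireland | FamilyTreeDNA |
| | Ireland | FamilyTreeDNA |
| | Welsh | FamilyTreeDNA |
| | Finland | FamilyTreeDNA |
| | Ireland | YSEQ |
| | Ireland | FamilyTreeDNA |
| | Shetland | |
I1a1c
| GenBank ID | Population | Source |
|---|---|---|
| | Mishar Tatars | |
| | Ukraine | |
| | German | FamilyTreeDNA |
| | Lithuania | FamilyTreeDNA |
| | Brazil | |
| | Czech | Grzybowski 2023 |
| | Belarusian | |
| | Russian | |
I1a1d
| GenBank ID | Population | Source |
|---|---|---|
| | Wales | FamilyTreeDNA |
| | United Kingdom | FamilyTreeDNA |
| | Orkney | |
| | English | FamilyTreeDNA |
| | Wales | FamilyTreeDNA |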
I1a1e
I1b
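| GenBank ID | Population | Source |
|---|---|---|
| | Caucasian | |
| | India | |
| | Jewish Diaspora | |
| | Armenian | FamilyTreeDNA |
| | | FamilyTreeDNA |
| | Iran | |
| | Italy | |
| | Iran | |
| | Iran | |
| | Italy | |
| | Swedish | FamilyTreeDNA |
| | German | FamilyTreeDNA |
| | Armenian | FamilyTreeDNA |
| | Yemen | |
| | Yemen | |
| | Hungary | FamilyTreeDNA |
| | Sardinians | |
| | Russia | |
| | Sweden | FamilyTreeDNA |
| | Armenian | |
| | Thailand | |
| | Chechen | FamilyTreeDNA |
| | Armenian | |
| | Pakistan | |
| | English | FamilyTreeDNA |
| | Slovak | Grzybowski 2023 |
| | Saudi Arabia | FamilyTreeDNA |
| | Pakistan | Bukhari 2025 |
I1c
| GenBank ID | Population | Source |
|---|---|---|
| | Turkish | FamilyTreeDNA |
| | Spain | FamilyTreeDNA |
| | Mongolia | |
| | Turkey | YSEQ |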
I1c1
I1c1a1
I1c1a2
I1d
I1e
I1f

I2'3

It is the common root clade of subclades I2 and I3. A sample from Tanzania shares with I2'3 a variant at position 152 relative to the root node of haplogroup I, and this "node 152" could be upstream of the I2'3 clade. Both I2 and I3 might have formed during the Holocene; most of their subclades are from Europe, with only a few from the Near East. Examples of this ancestral branch have not been documented.
I2
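| GenBank ID | Population | Source |
|---|---|---|
| | Ireland | FamilyTreeDNA |
| | | FamilyTreeDNA |
| | Volga Tatars | |
| | | FamilyTreeDNA |
| | Chechnya | |
| | Czech | |
| | Turkey | |
| | Denmark | |
| | Denmark | |
| | Iranian Azerbaijanis | |
| | Italy | |
| | Ukraine | |
| | Italy | |
| | Italy | |
| | Chechens | FamilyTreeDNA |
| | Armenians | FamilyTreeDNA |
| | Norway | FamilyTreeDNA |
| | Italy | FamilyTreeDNA |
| | Uyghurs | Zheng 2018 |
| | Uyghurs | Zheng 2018 |
| | Uyghurs | Zheng 2018 |
| | Uyghurs | Zheng 2018 |
| | Norway | FamilyTreeDNA |
| | Sweden | FamilyTreeDNA |
| | Sweden | FamilyTreeDNA |
| | England | FamilyTreeDNA |
| | Russia | |
| | Russia | |
| | Irish | FamilyTreeDNA |
| | Germany | FamilyTreeDNA |
| | Sweden | FamilyTreeDNA |
| | Sweden | FamilyTreeDNA |
| | England | FamilyTreeDNA |
| | Armenians | |
| | England | FamilyTreeDNA |
| | Ireland | FamilyTreeDNA |
| | Ireland | FamilyTreeDNA |
| | Shetland | |
| | Orkney | |
| | Netherlands | FamilyTreeDNA |
| | Ireland | FamilyTreeDNA |
| | Ireland | FamilyTreeDNA |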
I2a
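| GenBank ID | Population | Source |
|---|---|---|
| | | FamilyTreeDNA |
| | | FamilyTreeDNA |
| | Scotland | FamilyTreeDNA |
| | Norwegian | FamilyTreeDNA |
| | | FamilyTreeDNA |
I2a1
| GenBank ID | Population | Source |
|---|---|---|
| | Finland | |
| | Ireland | FamilyTreeDNA |
| | Ireland | FamilyTreeDNA |
I2a1a
I2a2
I2a3
I2b
| GenBank ID | Population | Source |
|---|---|---|
| | Finland | |
| | Finland | |
| | Finland | |
| | Finland | |
I2c
| GenBank ID | Population | Source |
|---|---|---|
| | France | FamilyTreeDNA |
| | Scotland | FamilyTreeDNA |
| | Scotland | FamilyTreeDNA |
| | Shetland | |
| | Shetland | |
| | Orkney | |
| | Orkney | |
| | Scotland | FamilyTreeDNA |
| | Canary Islanders | |
I2d
| GenBank ID | Population | Source |
|---|---|---|
| | Denmark | |
| | Germans | FamilyTreeDNA |
| | Poland | |
| | English | FamilyTreeDNA |
| | Poland | |
| | Poland | |
| | Finland | FamilyTreeDNA |
I2e
| GenBank ID | Population | Source |
|---|---|---|
| | Poland | |
| | Poland | |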
I3
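| GenBank ID | Population | Source |
|---|---|---|
| | Ireland | FamilyTreeDNA |
I3a
| GenBank ID | Population | Source |
|---|---|---|
| | France | FamilyTreeDNA |
| | | FamilyTreeDNA |
| | Greece | |
| | Denmark | |
| | Iran | |
| | Flemish | FamilyTreeDNA |
| | English | FamilyTreeDNA |
| | Belgium | FamilyTreeDNA |
| | Italy | FamilyTreeDNA |
| | Sardinians | |
| | Germany | FamilyTreeDNA |
| | Irish | FamilyTreeDNA |
| | France | FamilyTreeDNA |
| | Scotland | FamilyTreeDNA |
| | Scotland | FamilyTreeDNA |
| | Canary Islanders | |
| | Canary Islanders | |
| | Canary Islanders | |
| | Canary Islanders | |
| | Canary Islanders | |
| | Canary Islanders | |
| | Norwegians | FamilyTreeDNA |
I3a1
| GenBank ID | Population | Source |
|---|---|---|
| | Italy | Bandelt |
| | France | FamilyTreeDNA |
| | Austria | FamilyTreeDNA |
| | Poland | |
| | Brazil | |
I3b
| GenBank ID | Population | Source |
|---|---|---|
| | Ireland | FamilyTreeDNA |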
I3c
I3d

I4

The clade splits into subclades I4a and newly defined I4b, with samples found in Europe, the Near East and the Caucasus.
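| GenBank ID | Population | Source |
|---|---|---|
| | Polish | FamilyTreeDNA |
I4a
| GenBank ID | Population | Source |
|---|---|---|
| | | FamilyTreeDNA |
| | Armenian | FamilyTreeDNA |
| | North Ossetia | |
| | Calabria, Italy | |
| | Denmark | |
| | German | FamilyTreeDNA |
| | French Canadians | FamilyTreeDNA |
| | Ireland | FamilyTreeDNA |
| | Israel | FamilyTreeDNA |
| | Armenian | FamilyTreeDNA |
| | Croatia | FamilyTreeDNA |
| | Armenians | |
| | English | FamilyTreeDNA |
| | Chelkans | Nazhmidenova 2020 |
| | Poland | |
| | Poland | |
| | Orkney | |
| | Orkney | |
| | Orkney | |
| | Orkney | |
| | Orkney | |
| | Brazil | |
| | | FamilyTreeDNA |
| | England | FamilyTreeDNA |
| | German | FamilyTreeDNA |
| | | FamilyTreeDNA |
| | Morocco | Colombo 2025 |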
I4a1
I4a2
I4b

I5

It is the second most frequent clade of the haplogroup. Its subclades are found in Europe (e.g. I5a1) and the Near East (e.g. I5a2a and I5b).
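| GenBank ID | Population | Source |
|---|---|---|
| | German | FamilyTreeDNA |
| | North Ossetia | |
| | German | FamilyTreeDNA |
| | German | FamilyTreeDNA |
| | Polish | FamilyTreeDNA |
| | Serbia | |
I5a
| GenBank ID | Population | Source |
|---|---|---|
| | Pontic Greeks | FamilyTreeDNA |
| | Finnish | FamilyTreeDNA |
| | Armenians | |
| | Finnish | FamilyTreeDNA |
I5a1
| GenBank ID | Population | Source |
|---|---|---|
| | Bulgaria | |
| | Finland | |
| | Finland | FamilyTreeDNA |
| | Serbia | |
| | Assyrians | Shamoon-Pour 2019 |
| | Germany | FamilyTreeDNA |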
I5a1a
I5a1b
I5a1c
I5a1g
I5a2
I5a2a
I5a3
I5a4
I5b
I5c
I5c1

I6

The subclade is very rare; as of July 2013 it had been found in only four samples from the Near East.
| GenBank ID | Population | Source |
|---|---|---|
| | Turkey | |
| | Croatia | FamilyTreeDNA |
| | Moroccan Arab | Colombo 2025 |
I6a
| GenBank ID | Population | Source |
|---|---|---|
| | Italy | FamilyTreeDNA |
| | Sardinians | |

I7

It is the rarest defined subclade; as of July 2013 it had been found in only two samples, from the Near East and the Caucasus.
| GenBank ID | Population | Source |
|---|---|---|
| | Armenian | FamilyTreeDNA |
| | Kuwait | |
