Peopling of China
In the course of the peopling of the world by Homo sapiens, East Asia was reached about 50,000 years ago. The "recent African origin" lineage from 70 kya diverged into identifiable East Eurasian and West Eurasian lineages by about 50 kya. The East Eurasian ancestors of East Asians used a southern route to reach South and Southeast Asia, along which they rapidly diverged into the ancestors of Ancient Ancestral South Indians, Papuans, East Asians and Andamanese peoples. This early East Asian lineage diverged further during the Last Glacial Maximum, spreading northwards from Mainland Southeast Asia and contributing significantly to the peopling of the Americas via Beringia about 25 kya. After the last ice age, China was cut off from neighboring island groups. The phenotypes of early East Asians were either replaced or prevailed among more geographically distant groups.
Genetic history
Overview
A "Basal-East Asian population", referred to as the East and Southeast Asian (ESEA) lineage, is ancestral to the Hoabinhian hunter-gatherers of Southeast Asia and the Tianyuan lineage found in Northern China, as well as to modern East Asians, Southeast Asians, Polynesians, and Siberians. The ESEA lineage descends from an earlier "eastern non-African" or "Ancient East Eurasian" meta-population, which used a single southern route to reach South Asia, Southeast Asia, and Oceania, and along which it rapidly diverged into the ancestors of Ancient South Asians, East/Southeast Asians, and Australasians. This ESEA lineage later diverged into the Hoabinhian, the Tianyuan, and Ancient East Asian lineages, and expanded northward. The Ancient East Asian lineage later split into Ancient Southern East Asians and Ancient Northern East Asians. There is "a strong correlation with latitude, with diversity decreasing from south to north".
Tianyuan-related populations were widespread in Northern East Asia, although there is evidence of a 'southern branch' in Southern East Asia. Near the end of the Last Glacial Maximum, the oldest individual with the EDAR_V370A variant, AR19K, emerged from the Amur River region. This variant was absent in preceding populations but widespread in later Northern East Asian populations. Populations like the Jōmon and Papuans also lacked this variant. Around the same time, an AR19K-related population demographically replaced the Tianyuan-related population, causing Tianyuan-like ancestry in Northern East Asia to 'disappear'. This AR19K-related population was basal to younger samples of Ancient Northern East Asians, who form a sister lineage to Ancient Southern East Asians. Genetic divergence between Ancient Northern and Southern East Asians occurred about 19,000 years ago. About 14,000 years ago, there is genetic continuity between Amur River populations and Neolithic populations from the Devil's Gate Cave, suggesting population interaction.
These Amur River populations were also suggested to be the source of East Asian ancestry found in Ancient Paleo-Siberians. Ancient Paleo-Siberian-related ancestry persists in Native Americans, Uralic and Yeniseian-speaking groups and, to a lesser extent, Turkic, Mongolic and Tungusic-speaking populations. So far, Qihe and AR19K serve as proxies for the earliest Ancient Southern and Northern East Asians respectively. Individuals like Boshan and Shimao serve as proxies for coastal and inland Northern East Asian ancestry respectively.
However, a 2025 study suggests that deeply diverged Basal Asian populations, such as Early Neolithic Xingyi and Hoabinhian, persisted longer in Southern East Asia than in Northern East Asia, until they significantly mixed with Southern East Asian-related populations in the mid-Holocene. This can be explained by warmer and more hospitable environments in Southern East Asia. The ancestors of ancient East Asians were also suggested to be a mixture of Tianyuan-related and Early Neolithic Xingyi-related lineages. Several ancient Northeast Asian individuals from inland East Asia and the Devil's Gate Cave can also be modeled as mixtures of deep lineages that are ancestral to the Jōmon and Tianyuan respectively, despite the latter being more related to the Jōmon.
Archaeogenetic studies in the Central Plains
Neolithic Northern China can be divided into four periods: the Pre-Peiligang period, the Peiligang period, the Yangshao period, and the Longshan period. The first two correlate with the initial development of Neolithic Chinese culture, whilst the latter two correlate with the accelerated development of Chinese civilization. Northern China had many cradles of civilization, including one in Shandong, China, which was characterized by the following cultures: Houli culture, Beixin culture, Dawenkou culture, Shandong Longshan culture and Yueshi culture.
The progenitors of the Han Chinese were Neolithic Yellow River farmers from the Central Plains, who were closely related to ancient Inland East Asians, represented by the Yumin individual. They recently descend from Proto-Sino-Tibetan-speaking groups in the middle Yellow River regions, who diverged into Sinitic and Tibeto-Burman groups about 4–8 kya. Sinitic groups were the ancestors of Han Chinese and exhibit low-level admixture with Siberian-related groups since the Neolithic period. Meanwhile, Tibeto-Burman groups exhibit admixture with Paleolithic hunter-gatherers from the Tibetan Plateau. Yangshao culture-related Middle Neolithic Yellow River groups show high genetic similarities with the Wenshaobei population from Shandong, China. The latter can be modeled as having Yellow River ancestry and Southern East Asian-related ancestry. Yellow River groups are genetically distinguishable from Early Neolithic Shandong populations, who have affinities with Ancient Northeast Asians (ANA) from the Amur River Basin and Mongolian Plateau, as well as Yayoi and Tibetan populations, but are nonetheless distinct. They are believed to precede ANA groups from coastal northern China. Populations from coastal northern China also contributed to the East Asian ancestries of East Asians, Jōmon peoples, East Siberians and Native Americans since the Last Glacial Maximum.
Since the Dawenkou period, Neolithic Yellow River populations expanded from the Central Plains and replaced these Early Neolithic Shandong populations, although this is more pronounced in inland areas.
According to a 2025 study, Middle Neolithic Yellow River groups exhibit distinct substructure. For example, lower Yellow River groups are characterized by Middle Neolithic Yellow River ancestry, hunter-gatherer ancestry from Early Neolithic coastal northern China, southern East Asian-related ancestry and ANA-related ancestry. Meanwhile, upper Yellow River groups cluster with modern highland populations from the Tibetan Plateau and are characterized by Middle Neolithic Yellow River ancestry, Zongri5.1k-related ancestry and ANA-related ancestry. Conversely, middle Yellow River groups do not exhibit significant admixture with non-Yellow River groups, although admixture with ANA-related groups peaks in higher-latitude regions. Zongri5.1k-related ancestry is also found in some local sites from Shaanxi and Shanxi. Ancient Paleo-Siberian-related ancestry is additionally detected in several ancient Northern Chinese populations, including populations from the upper and middle Yellow River and West Liao River regions.
Successors of Middle Neolithic Yellow River groups show genetic continuity. For example, Late Neolithic Yellow River groups from the Central Plains could be modeled as having Middle Neolithic Yellow River-related ancestry and Taiwan Hanben-related ancestry. Their successors show similar amounts of Hanben-related ancestry. Likewise, historical Shandong populations show genetic compositions similar to those of Late Neolithic and Late Bronze Age to Iron Age Yellow River groups and are characterized by a mixture of Northeast Asian-related ancestry, ancient highland-related ancestry, southern East Asian-related ancestry and, to some extent, Ancient North Eurasian-related ancestry, although not all studies support this. Among modern Han subgroups, Han Chinese from Henan, Shandong and Shanxi show the most continuity with post-Late Neolithic Yellow River groups and can even be modeled as their direct descendants. Other Han subgroups, especially in the south, have more southern East Asian-related ancestry, although studies show additional matrilineal admixture with neighboring minorities in northern and southern China respectively. Siberian and West Eurasian-related ancestry is also detected in some modern Northwest and Central Han Chinese, along with highland Tibetan-related ancestry, which is also found in some modern Northeast Chinese. However, Han Chinese from Inner Mongolia and Shaanxi can be differentiated from other Northern Han Chinese by their high Northeast Asian-related ancestry, although they show some Southern East Asian-related and West Eurasian-related ancestry too. Besides the Han Chinese, populations who are closely related to Neolithic Yellow River farmers include Naxi, Yi, Gelao, indigenous populations from the Himalayan region and western Japanese people.
Archaeogenetic studies in southern China
According to Wang et al., these ancient individuals from southern China play a key role in the ethnogenesis of present-day southern Chinese populations and the Austroasiatic and Austronesian diaspora:
- Archaic individuals from 12,000 to 10,000 BP:
- * Qihe-3 is an Upper Paleolithic individual from the mountainous interior of Fujian, located about 100 km north of present-day Zhangzhou city. Like other late Homo sapiens, Qihe-3 exhibited a long head, large cranial capacity, high narrow face, and broad and low/short nasal shape, and exhibited features from northern and southern populations in Neolithic China. They also had large prominent cheekbones, flat upper faces, thin narrow cheeks, larger heads and facial hair. Their features are not necessarily representative of populations living during the Paleolithic–Neolithic transition but rather reflect the great phenotypic variation that existed. Like the Qihe-2 individual, Qihe-3 clusters closely with Austronesians. Both individuals also cluster with Boshan from Neolithic Shandong. Qihe-3 is genetically indistinguishable from Liangdao-2 but can also be modeled as a mixture of coastal Neolithic East Asian ancestry and deeply diverging East Eurasian ancestry. However, this 2-way model is not a better fit than the 1-way model. The Qihe individuals can also be modeled as a mixture of Longlin/Dushan-related and Tianyuan-related ancestries and, in some cases, a mixture of Longlin and Dushan-related ancestries.
- * Qihe-2, a more recent specimen from a different layer of the same site dating to 8,428–8,359 cal BP, was sequenced and, like Qihe-3, found to be closely related to Iron Age Taiwanese and Austronesians.
- * Longlin is an Upper Paleolithic individual with deeply diverging East Asian ancestry. Longlin is closely related to the Maludong, or Red Deer Cave people, and Ikawazu, a Jōmon individual. They are also more closely related to Ancient East Asians from Shandong and Fujian than to basal lineages like Hoabinhian, before the former split into present Northern East Asians and Southern East Asians. Longlin, Ikawazu and Ancient East Asians likely all diverged from each other at the same time, with Longlin being located at the basal position on the lineage leading to M71d, sharing a maternal genetic connection with present-day populations from mainland Southeast Asia. Although diverged from Ancient East Asians like Jōmon, they were not geographically isolated. Longlin can also be modeled as an admixture of Early Neolithic Xingyi, which is closely related to deeply diverged ghost ancestry found in ancient Tibetans, and Qihe-3-related East Asian ancestry. Despite Longlin's uniquely archaic phenotypes, they carried similar levels of archaic human ancestry as Neolithic and present-day East Asians, although it is likely that their phenotypes arose from hybridization with archaic hominins.
- * Liangdao-2 was found to have mostly Qihe-3-related ancestry, along with northern East Asian ancestry associated with Neolithic Shandong and other northern East Asian sites. Liangdao-2 lacks Basal Sunda/Australasian ancestry. Cordillerans, who are the least admixed group among the Austronesians of East Asia, are closely related to Liangdao-2 and diverged from Taiwanese aborigines quite early. They also lack the northern East Asian ancestry that was introduced in Liangdao-2, although it was later introduced in northern Philippines. However, there is evidence that Out-of-Taiwan groups have slightly more northern East Asian ancestry than Into-Taiwan groups, with present Kankanaey groups having ~33% northern East Asian ancestry, similar to what is found in Taiwan Highland/Taiwan Orchid Island groups, who have ~28–37% northern East Asian ancestry. These findings suggest considerable northern East Asian influence in Taiwan prior to the Out-of-Taiwan expansions. A 2025 study showed genetic influence from Neolithic Shandong populations in the 'proto-Austronesian' population from southeastern China. In addition, there is evidence of some Austronesian groups from northern Philippines having low-level European ancestry.
- Archaic individuals from 9,000 to 6,000 BP:
- * Dushan is a male individual that can be modeled as a mixture of Longlin-related ancestry and Qihe-related ancestry. This suggests a mass migration of ancient Fujian populations into Guangxi, where they intermixed with the indigenous inhabitants rather than completely replacing them. Overall, Dushan has higher affinities with Mán Bạc populations from Late Neolithic Vietnam, Late Neolithic Fujianese populations such as Xitoucun and Tanshishan and present Austroasiatics.
- ** Compared to both Qihe individuals, 4,100–2,000 year-old Late Neolithic Fujianese populations like Xitoucun and Tanshishan are more closely related to Dushan-related populations. They can be modeled as a mixture of Dushan-related ancestry, Qihe-3-related ancestry, northern East Asian-related ancestry and deep ancestry represented by Indus Periphery populations. Taiwan Hanben populations are also closely related to Dushan-related populations. This highlights the significant role of Dushan-related ancestry in prehistoric southern China. The Longli Bouyei and Qiandongnan Dong, who are considered to be the ancestral Kra-Dai population, cluster with the aforementioned populations along with Gongguan populations from Taiwan and Kinh Vietnamese. However, the Hlai are considered to be one of the least admixed Kra-Dai populations since they did not heavily mix with ancient Guangxi populations and Han Chinese, and they cluster more with Austronesians with divergent ancestry like Ami, Atayal and Kankanaey.
- * Baojianshan can be modeled as a mixture of 72.3% Dushan-related ancestry and 27.7% Hoabinhian-related ancestry.
- Historical Guangxi populations from 1,500 to 500 BP:
- * Layi can be described as a mixture of Boshan-related ancestry and either Longlin-related or Dushan-related ancestry.
- * Shenxian can be described as a mixture of northern East Asian-related ancestry and southern East Asian-related ancestry.
- * Yiyang can be modeled as a mixture of northern East Asian and southern East Asian ancestry. However, Yiyang can also be modeled as a mixture of northern East Asian-related ancestry and Dushan-related ancestry.
- * LaCen can be described as a mixture of northern East Asian ancestry and Dushan-related ancestry.
- * BaBanQinCen can be described as a mixture of ancestry related to Dushan, northern East Asians and southern East Asians. They significantly contributed to the genetic makeup of present Kra-Dai groups in Guangxi. Zhuang and Dong from Congjiang County in Guizhou, China also cluster with BaBanQinCen, along with GaoHuaHua.
- * GaoHuaHua can be described as a mixture of northern East Asian ancestry and Dushan-related ancestry. They significantly contributed to the genetic makeup of Hmong-Mien groups in Guangxi. Zhuang and Dong from Congjiang County in Guizhou, China also cluster with GaoHuaHua, along with BaBanQinCen.