
Monday, July 24, 2017

The crisis

Correct me if I'm straying from the facts, but the 4300–3800 YBP date mentioned in this new paper at Eurasian Soil Science, on the "catastrophic aridization" of the steppes in the Lower Volga region, is roughly the time when big, tall, round headed folks rich in Yamnaya-related ancestry basically hijack the Beaker phenomenon, and just before the collapse of the Indus Valley Civilization and, according to most sane people, the arrival of Indo-Europeans in South Asia. Coincidence?

Abstract: Diagnostic features of a catastrophic aridization of climate, desertification, and paleoecological crisis in steppes of the Lower Volga region have been identified on the basis of data on the morphological, chemical, and microbiological properties of paleosols under archeological monuments (burial mounds) of the Middle Bronze Age. These processes resulted in a certain convergence of the soil cover with transformation of zonal chestnut (Kastanozems) paleosols and paleosolonetzes (Solonetz Humic) into specific chestnut-like eroded saline calcareous paleosols analogous to the modern brown desert-steppe soils (Calcisols Haplic) that predominated in this region 4300–3800 years ago. In the second millennium BC, humidization of the climate led to the divergence of the soil cover with secondary formation of the complexes of chestnut soils and solonetzes. This paleoecological crisis had a significant effect on the economy of the tribes in the Late Catacomb and Post-Catacomb time stipulating their higher mobility and transition to the nomadic cattle breeding.

Demkina et al., Paleoecological crisis in the steppes of the Lower Volga region in the Middle of the Bronze Age (III–II centuries BC), Eurasian Soil Science, July 2017, Volume 50, Issue 7, pp 791–804

See also...

Swat Valley "early Indo-Aryans" at the lab

The Bell Beaker Behemoth (Olalde et al. 2017 preprint)

Corded Ware origin of a big chunk of Finnish mtDNA (Oversti et al. 2017)

Over at Scientific Reports at this link. Emphasis is mine. Corded Ware people were in all likelihood early Indo-European speakers and belonged, perhaps almost exclusively, to Y-chromosome haplogroup R1a, while present-day Finns obviously speak a Uralic language and mostly belong to Y-chromosome haplogroups N1c and I1. But Finns do show a lot of Corded Ware- or Yamnaya-related genome-wide ancestry, so it shouldn't be surprising that a large part of their maternal ancestry is derived from the Corded Ware population.

Abstract: In Europe, modern mitochondrial diversity is relatively homogeneous and suggests an ubiquitous rapid population growth since the Neolithic revolution. Similar patterns also have been observed in mitochondrial control region data in Finland, which contrasts with the distinctive autosomal and Y-chromosomal diversity among Finns. A different picture emerges from the 843 whole mitochondrial genomes from modern Finns analyzed here. Up to one third of the subhaplogroups can be considered as Finn-characteristic, i.e. rather common in Finland but virtually absent or rare elsewhere in Europe. Bayesian phylogenetic analyses suggest that most of these attributed Finnish lineages date back to around 3,000–5,000 years, coinciding with the arrival of Corded Ware culture and agriculture into Finland. Bayesian estimation of past effective population sizes reveals two differing demographic histories: 1) the ‘local’ Finnish mtDNA haplotypes yielding small and dwindling size estimates for most of the past; and 2) the ‘immigrant’ haplotypes showing growth typical of most European populations. The results based on the local diversity are more in line with that known about Finns from other studies, e.g., Y-chromosome analyses and archaeology findings. The mitochondrial gene pool thus may contain signals of local population history that cannot be readily deduced from the total diversity.

Oversti et al., Identification and analysis of mtDNA genomes attributed to Finns reveal long-stagnant demographic trends obscured in the total diversity, Scientific Reports, Published online: 21 July 2017, doi:10.1038/s41598-017-05673-7

See also...

Baltic Corded Ware: rich in R1a-Z645

Neolithic transition in the Baltic

The genetic history of Northern Europe (or rather the South Baltic)

Thursday, July 20, 2017

The Out-of-India Theory (OIT) challenge: can we hear a viable argument for once?

Recent weeks have seen a rash of activity from OIT proponents defending their "truth", largely as a response to a news feature in The Hindu on new genetic evidence backing the Aryan Invasion or Migration Theory (AIT/AMT). A few examples:

Genetics Might Be Settling The Aryan Migration Debate, But Not How Left-Liberals Believe

Genetics and the Aryan invasion debate

Propagandizing the Aryan Invasion Debate: A Rebuttal to Tony Joseph

Here We Go Again: Why They Are Wrong About The Aryan Migration Debate This Time As Well

The problematics of genetics and the Aryan issue

Too early to settle the Aryan migration debate?

The people who wrote these articles are able to string sentences together in a reasonable way, but apart from that, their efforts are clumsy at best. Not only do they not appear to completely understand what they're attempting to debunk, but they also fail to offer an OIT that realistically incorporates new findings from ancient and modern-day DNA.

AIT/AMT is now firmly backed by ancient DNA from Eastern Europe and high resolution modern-day DNA from South Asia. To quote myself from a week ago:

During the past couple of years ancient DNA has revealed the presence of Y-chromosome haplogroup R1a in Eastern European remains dated to the Mesolithic, Neolithic, Eneolithic and Bronze Age. Moreover, the Bronze Age remains, packed in ancestry derived from Eastern European hunter-gatherers (or EHG) and totally lacking any sort of South Asian admixture, belong to R1a-Z645, which is the ancestral clade of by far the most common types of R1a in Europe and South Asia today: R1a-Z282 and R1a-Z93, respectively. And on top of that, South Asians, especially those speaking Indo-European languages, show significant admixture derived from EHG.

The conclusion from this data is self-evident: during the Bronze Age R1a-Z645 became a very important Y-chromosome lineage in Europe and quickly moved to South Asia, in all likelihood on the back of the Indo-European expansion.

Pre-Indo-European Eastern Europe and South Asia were not the same world; they were worlds apart. Thus, you will never read anything like this, no matter how much ancient DNA from South Asia is sequenced:

During the past couple of years ancient DNA has revealed the presence of Y-chromosome haplogroup R1a in South Asian remains dated to the Mesolithic, Neolithic, Eneolithic and Bronze Age. Moreover, the Bronze Age remains, packed in ancestry derived from South Asian hunter-gatherers, and totally lacking any sort of European admixture, belong to R1a-Z645, which is the ancestral clade of by far the most common types of R1a in Europe and South Asia today: R1a-Z282 and R1a-Z93, respectively. And on top of that, Europeans, especially those speaking Indo-European languages, show significant admixture derived from South Asian hunter-gatherers.

So, OIT proponents, what counter-arguments can you offer? And can you come up with a new vision for OIT that coherently takes into account ancient DNA from Eastern Europe?

However, to ensure that the debate is a fruitful one not derailed regularly by anti-AIT/pro-OIT red herrings, let's take care of the most obvious of these red herrings now. I reserve the right to delete any comments that attempt to go down these tired, irrelevant avenues without a very good excuse for doing so.

You: So and so found Y-haplogroup P* and other basal clades upstream of R1a in Papuans, therefore R1a and Indo-Europeans are from South Asia. Me: Nonsense. R1 and R1a are found in the remains of Eastern European Mesolithic foragers. Were these individuals recently arrived Indo-European-speakers from South Asia? Try harder.

You: It doesn't matter that Eastern European Mesolithic foragers belonged to R1a, because the most common form of R1a in the world is R1a-M417, and if it originated in India then OIT is a reality. Me: But what are the chances realistically that R1a-M417 is from India or South Asia, considering that prehistoric European samples, with absolutely no signals of ancestry from South Asia, belong to both M417+ and M417- lineages? In fact, Europe is the most likely homeland of R1a-M417.

You: India has incredible diversity in R1a, therefore it's the R1a and Indo-European homeland. Me: No it doesn't. India, and indeed South Asia as a whole, is dominated by one fairly young subclade: R1a-Z93. Europe is home to three different subclades that show up at perceptible frequencies: R1a-Z282, found throughout much of the continent; R1a-L664, mostly confined to Northwestern Europe; and R1a-Z93, mostly confined to far Eastern Europe.

You: Many unique Indian ethnic groups are yet to be tested genetically. They may show surprising results, including new subclades of R1a. Me: If you dig hard enough, you'll always find some exceptions to the rule. But how do you know where the ancestral lineages of such exceptions in South Asia were during, say, the Neolithic? What makes you think they were in South Asia? To prove that South Asia is the homeland of its by far most dominant R1a subclade, R1a-Z93, at the very least you need to show that other subclades, both closely and distantly related, are also found at perceptible frequencies in whole regions of South Asia, and therefore that they have some sort of history there. Otherwise we can safely assume that R1a-Z93 and the few exceptions to the R1a-Z93 rule in South Asia are relative latecomers from somewhere else.

You: But we have no ancient DNA from South Asia yet, and it may produce a huge shock. Me: For you yes, but not for me. What are the chances realistically that R1a was present among both European and South Asian foragers? I'd say practically zero. Feel free to raise it to a few per cent to make yourself feel better, but we both know the hard reality.

You: Ancient DNA from South Asia might show that Northern India was home to a population very similar to Yamnaya, and if so, then the Yamnaya-related ancestry in modern-day Indians is native to India. Me: There's no logic behind this. Yamnaya and other closely related Bronze Age groups were very specific mixtures of Mesolithic foragers and Neolithic farmers living in Eastern Europe and surrounds. There's absolutely no reason to assume that such unique mixtures would also form independently in South Asia, or even outside of Europe's generally accepted borders.

You: Bronze Age Europeans who belonged to R1a also carried southern admixture from Iran, or maybe even India. Me: In prehistoric samples, R1a is always highly correlated with Eastern European Hunter-Gatherer (EHG) ancestry, so positing that it also arrived in Europe with a southern population makes no sense. And why would this southern ancestry be from Iran or India? Why not the Caucasus? We know from ancient DNA that the type of southern ancestry that these ancient Europeans carried has been sitting in the Caucasus since the Upper Paleolithic. Moreover, they lack South Caspian- and South Asian-specific markers such as mtDNA haplogroup U7. How were such markers purged from their gene pool if they or their recent ancestors arrived in Europe from Iran or India?

You: Chickens and mice came from South Asia, therefore Indo-Europeans came from South Asia too. Me: Bullshit. Do better or go away.

Does anyone want to claim that I don't know what I'm talking about? Or perhaps that I'm just putting out Eurocentric propaganda? If you don't understand my arguments, and that they're indeed very solid arguments, then there's no hope for you. Go and find a new hobby or profession, because you're not cut out for this.

OK, now that we have the formalities out of the way, who wants to have a go at salvaging OIT in the comments? Don't be shy.

See also...

Ancient herders from the Pontic-Caspian steppe crashed into India: no ifs or buts

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Monday, July 17, 2017

On the Mesolithic colonization of Scandinavia (Günther et al. 2017 preprint)

Over at bioRxiv at this link. The main takeaway point from this preprint is that Scandinavia was a more happening place than most of the rest of Europe during the Mesolithic, because at the time it was the meeting place between two relatively divergent forager groups, West European hunter-gatherers (WHG) and East European hunter-gatherers (EHG), that entered the peninsula from different directions, the southwest and northeast, respectively, and mixed to form Scandinavian hunter-gatherers (SHG). Other key points:

- EHG probably dispersed across Scandinavia in a counter-clockwise direction via an ice-free route along the Atlantic coast in what is now Norway, because SHG samples from northern and western Scandinavia show more EHG ancestry than those from southern and eastern Scandinavia

- at least 17% of the SNPs that are common in SHG are not found in present-day Europeans, suggesting that a large part of European variation has been lost since the Mesolithic

- although it's unlikely that SHG made a significant contribution to the present-day Northern European gene pool, some gene-variants common in SHG that appear to be associated with metabolic, cardiovascular, developmental and psychological traits are carried at high frequencies by present-day Northern Europeans, especially compared to present-day Southern Europeans, probably due to strong selective pressures specific to northern latitudes in Europe

- SHG is inferred to have had fair skin and varied blue to light-brown eye color, which makes sense considering that it was a mixture of apparently fair-skinned/brown-eyed EHG and dark-skinned/blue-eyed WHG, except that the frequencies of blue-eyed variants and one fair-skinned variant in SHG are much higher than expected from its EHG/WHG mixture ratios, again pointing to strong selective pressures specific to northern latitudes in Europe acting upon certain gene-variants

- a 3D computer generated facial reconstruction of an SHG female based on data from a very high (57x) coverage genome sequence looks, at least to me, like a fairly typical present-day Northern European woman (see Figure S9.1 in the supp info here), though I suspect that the result might be biased in some way, simply because it's impossible to know whether variants associated with specific facial traits in present-day Northern Europeans were also associated with the same facial traits in SHG.
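The "higher than expected from its EHG/WHG mixture ratios" reasoning in the pigmentation point above boils down to a weighted average, which can be sketched as follows. This is only a minimal illustration, not the preprint's method: the function name and every number in it are placeholders that I've made up to show the arithmetic, and none of them are taken from Günther et al.

```python
def expected_frequency(p_source_a, p_source_b, fraction_a):
    """Expected allele frequency in a two-way mixture with no selection or drift.

    fraction_a is the ancestry proportion from source A; the remainder
    (1 - fraction_a) comes from source B.
    """
    if not 0.0 <= fraction_a <= 1.0:
        raise ValueError("fraction_a must be between 0 and 1")
    return fraction_a * p_source_a + (1.0 - fraction_a) * p_source_b


# PLACEHOLDER values for illustration only, NOT from the preprint.
p_ehg = 0.2          # hypothetical frequency of a variant in EHG
p_whg = 0.9          # hypothetical frequency of the same variant in WHG
ehg_fraction = 0.5   # hypothetical EHG ancestry proportion in SHG
observed_shg = 0.95  # hypothetical observed frequency in SHG

expected_shg = expected_frequency(p_ehg, p_whg, ehg_fraction)
excess = observed_shg - expected_shg

print(f"expected under neutral mixing: {expected_shg:.2f}")
print(f"observed minus expected:       {excess:+.2f}")
```

A clearly positive excess, like the one in this toy case, is the sort of pattern that gets read as a hint of selection, though in practice it would have to be weighed against drift and sampling error.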


Günther et al., Genomics of Mesolithic Scandinavia reveal colonization routes and high-latitude adaptation, bioRxiv, Posted July 17, 2017, doi:

Sunday, July 16, 2017

North European admixture in the Han Chinese (Charleston et al. 2017 preprint)

Over at bioRxiv at this link. Emphasis is mine. The estimated date of the North European-related admixture signal is probably much too late. These sorts of estimates always look way off. And I doubt that it's largely the result of the Silk Road, which linked China to the Near East and Mediterranean rather than to Northern Europe. More likely it reflects gene flow from the Pontic-Caspian steppe in Eastern Europe during the Bronze and Iron ages, via the Afanasievo, Andronovo, and other closely related steppe peoples (see here).

Abstract: As are most non-European populations around the globe, the Han Chinese are relatively understudied in population and medical genetics studies. From low-coverage whole-genome sequencing of 11,670 Han Chinese women we present a catalog of 25,057,223 variants, including 548,401 novel variants that are seen at least 10 times in our dataset. Individuals from our study come from 19 out of 22 provinces across China, allowing us to study population structure, genetic ancestry, and local adaptation in Han Chinese. We identify previously unrecognized population structure along the East-West axis of China and report unique signals of admixture across geographical space, such as European influences among the Northwestern provinces of China. Finally, we identified a number of highly differentiated loci, indicative of local adaptation in the Han Chinese. In particular, we detected extreme differentiation among the Han Chinese at MTHFR, ADH7, and FADS loci, suggesting that these loci may not be specifically selected in Tibetan and Inuit populations as previously suggested. On the other hand, we find that Neandertal ancestry does not vary significantly across the provinces, consistent with admixture prior to the dispersal of modern Han Chinese. Furthermore, contrary to a previous report, Neandertal ancestry does not explain a significant amount of heritability in depression. Our findings provide the largest genetic data set so far made available for Han Chinese and provide insights into the history and population structure of the world's largest ethnic group.


One finding from our analysis of admixture signals that most likely fit a one-pulse admixture model is our observation of admixture from Northern European populations to the Northwestern provinces of China (Gansu, Shaanxi, Shanxi), but not other parts of China. Previous analysis of the HGDP data, based on patterns of haplotype sharing among 10 Han Chinese from Northern China, estimated a single pulse of ~6% West Eurasian ancestry among the Northern Han Chinese. The estimated date of admixture was around 1200 CE. This signal is also observed among the Tu people, an ethnic minority also from Northwestern China; the authors attributed this signal to contact through the Silk Road (Hellenthal et al. 2014). We estimate a lower bound of admixture proportion due to Northern Europeans at approximately 2%-5%, with an admixture date of about 26 +/-3 generations for Gansu, and 47 +/-3 generations for Shaanxi [Table S8]. Using a generation time of about 26-30 years (Moorjani et al. 2016), these estimates correspond to admixture events occurring at around 700 CE and 1300 CE, respectively, corresponding roughly to the Tang and Yuan dynasty in China. However, these estimated dates should be interpreted with caution, as both the violation of a single pulse admixture model and the additional noise in inter-marker LD estimates due to low coverage data could bias the estimates.

Charleston et al., A comprehensive map of genetic variation in the world's largest ethnic group - Han Chinese, bioRxiv, Posted July 13, 2017, doi:
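The date arithmetic in the quoted passage is easy to check with a quick sketch. It simply multiplies the point estimates (26 and 47 generations) by the quoted 26-30 year generation time and counts back from a reference year. The reference year of 2017 and the function name are my own assumptions, and the +/-3 generation error is ignored. On these assumptions, the 26 generation estimate for Gansu lands near 1300 CE and the 47 generation estimate for Shaanxi near 700 CE.

```python
def admixture_year_range(generations, gen_time_years=(26, 30), reference_year=2017):
    """Return (earliest, latest) calendar years CE for an admixture event.

    generations     -- point estimate of generations since admixture
    gen_time_years  -- (shortest, longest) assumed years per generation
    reference_year  -- year to count back from (an assumption, not from the preprint)
    """
    shortest, longest = gen_time_years
    earliest = reference_year - generations * longest
    latest = reference_year - generations * shortest
    return earliest, latest


# Point estimates quoted from the preprint (generations since admixture).
estimates = {"Gansu": 26, "Shaanxi": 47}

ranges = {province: admixture_year_range(g) for province, g in estimates.items()}
for province, (earliest, latest) in ranges.items():
    print(f"{province}: roughly {earliest}-{latest} CE")
```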

See also...

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Saturday, July 15, 2017

ASHG 2017 crowdfunding drive

The annual American Society of Human Genetics (ASHG) meeting is being held in Orlando, FL, on October 17-21 (see here). The talk abstracts aren't online yet, but it's likely to be a big event for those of us interested in ancient DNA, so we'd like to get Chad Rohlfsen down there to run live reports from some of the talks.

But we'll need about $2000 to make it happen. Donations can be sent via PayPal to c_rohlfsen [at] hotmail [dot] com.

We'll try to organize the press pass for Chad next week. If, for whatever reason, Chad doesn't make it to the meeting, he'll promptly refund the donations. It's the same deal as back in 2015 when Chad went to St. Louis to cover the AAPA 2015 conference (see here). But I have a strong feeling that this is going to be much bigger. Let's make it happen.

Wednesday, July 12, 2017

Indian confirmation bias

In a largely fact-free but obfuscation-rich comment piece at The Hindu, Indian scientists Gyaneshwer Chaubey and Kumarasamy Thangaraj ask whether it's too early to settle the Aryan migration debate. See here.

No, it's not too early. It's game over chaps, and has been for a while.

During the past couple of years ancient DNA has revealed the presence of Y-chromosome haplogroup R1a in Eastern European remains dated to the Mesolithic, Neolithic, Eneolithic and Bronze Age. Moreover, the Bronze Age remains, packed in ancestry derived from Eastern European hunter-gatherers (or EHG) and totally lacking any sort of South Asian admixture, belong to R1a-Z645, which is the ancestral clade of by far the most common types of R1a in Europe and South Asia today: R1a-Z282 and R1a-Z93, respectively. And on top of that, South Asians, especially those speaking Indo-European languages, show significant admixture derived from EHG.

The conclusion from this data is self-evident: during the Bronze Age R1a-Z645 became a very important Y-chromosome lineage in Europe and quickly moved to South Asia, in all likelihood on the back of the Indo-European expansion. Yet, in spite of this, Gyaneshwer and Kumarasamy make the following claim in their article.

Moreover, there is evidence which is consistent with the early presence of several R1a branches in India (our unpublished data).

Potentially powerful stuff, you might say. But hang on, what are Gyaneshwer and Kumarasamy seeing in their data that could possibly reverse the current reality about R1a? Did they find R1a in South Asian remains from the Mesolithic and Neolithic? Or perhaps they've uncovered South Asian Bronze Age remains that belong to R1a-Z645 and lack any signals of ancestry from Eastern Europe?

This is impossible. The ancient DNA from Eastern Europe says so. That's because pre-Indo-European Eastern Europe and South Asia were not the same world; they were worlds apart. Thus, you will never read anything like this, no matter how much ancient DNA from South Asia is sequenced:

During the past couple of years ancient DNA has revealed the presence of Y-chromosome haplogroup R1a in South Asian remains dated to the Mesolithic, Neolithic, Eneolithic and Bronze Age. Moreover, the Bronze Age remains, packed in ancestry derived from South Asian hunter-gatherers, and totally lacking any sort of European admixture, belong to R1a-Z645, which is the ancestral clade of by far the most common types of R1a in Europe and South Asia today: R1a-Z282 and R1a-Z93, respectively. And on top of that, Europeans, especially those speaking Indo-European languages, show significant admixture derived from South Asian hunter-gatherers.

See also...

Ancient herders from the Pontic-Caspian steppe crashed into India: no ifs or buts

The Out-of-India Theory (OIT) challenge: can we hear a viable argument for once?

Tuesday, July 11, 2017

Working topology for Eurasian population structure

Here's my new "basic" qpGraph topology that I'll be using to test phylogenetic and mixture models for Eurasians. I think it reconciles a few key findings from recent scientific literature. Please note that since my main interest is post-Neolithic prehistory of West Eurasia, and in particular the early Indo-European expansions, I don't want to make this model unnecessarily complex by adding "dead end" Upper Paleolithic genomes.

But I welcome ideas on how to improve and make use of this topology, so if, say, adding Ust_Ishim helps, then let's do it. The ancient samples featured in the above graph are listed here and the graph file is available here. Feel free to post your own versions of the graph file in the comments and I'll run them as soon as possible. But please remember to label the samples correctly at all times.

Update 13/07/2017: Thanks to Matt in the comments, here's a neater version of the same model, with a lower (highest) Z score and slightly different mixture coefficients. It includes a couple of zero edges, which are generally undesirable, but these might disappear when more populations are added to the topology. The graph file is available here.
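For anyone wondering what the "(highest) Z score" refers to: as I understand it, it's the Z score of the worst-fitting f-statistic, i.e. the largest gap between what the graph predicts and what the data show, measured in standard errors. Here's a minimal sketch of that idea. It does not reproduce qpGraph's actual output or code; the function name, the sign convention (fitted minus observed) and all of the toy numbers are my assumptions for illustration.

```python
def worst_z_score(f_stats):
    """Return (label, z) for the f-statistic with the largest absolute Z score.

    f_stats is a list of (label, observed, fitted, std_err) tuples.
    Z is computed as (fitted - observed) / std_err, an assumed convention.
    """
    scored = []
    for label, observed, fitted, std_err in f_stats:
        if std_err <= 0:
            raise ValueError(f"standard error must be positive for {label}")
        scored.append((label, (fitted - observed) / std_err))
    return max(scored, key=lambda item: abs(item[1]))


# TOY values only; the population labels A-D and numbers are hypothetical.
toy_stats = [
    ("f4(A, B; C, D)", 0.0020, 0.0015, 0.0005),
    ("f4(A, C; B, D)", -0.0010, 0.0020, 0.0010),
    ("f4(A, D; B, C)", 0.0005, 0.0004, 0.0002),
]

worst_label, worst_z = worst_z_score(toy_stats)
print(f"worst-fitting statistic: {worst_label} (Z = {worst_z:.1f})")
```

A common rule of thumb is to treat a graph as a poor fit if this worst |Z| is above about 3, which is why a lower value for the same topology is welcome.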

Monday, July 10, 2017

Armenian confirmation bias

Current Biology recently published a paper by Margaryan and Derenko et al. titled Eight Millennia of Matrilineal Genetic Continuity in the South Caucasus. I wasn't going to bother calling out the authors on their, unfortunately I have to say, rather dubious claim, but then I saw this ScienceNordic article enthusiastically attempting to drive home their misguided point, so a few words are now in order.

“It’s basically the same female population in the region over the past 8,000 years. It’s very surprising considering the many waves of migration and cultural shifts,” says lead-author Ashot Margaryan from the Centre for GeoGenetics at the Natural History Museum of Denmark, University of Copenhagen.

Genetics have remained constant for 8,000 years in world’s melting pot

I'm at a loss as to why Ashot Margaryan is very surprised. I'm not even mildly surprised. Why? Let's take a closer look at what we're dealing with here:

- the authors sequenced just 52 full mitogenomes to represent 8,000 years of prehistory and early history in the South Caucasus

- they lumped all of these sequences together into an "Ancient" sample set as if they were from a single time slice (I know, pretty crazy)

- they then ran a few complex models on this neither here nor there sample set, and concluded that it resembled the maternal gene pool of present-day Armenians.

Well, duh, present-day Armenians are more or less the end product of the population history of the last eight thousand years in what is now Armenia and surrounds. Is anyone still as surprised about this as Ashot? Surely not.

Obviously, the problem here is that the authors have mistaken their none too surprising outcome to mean that the South Caucasus has not experienced any major upheavals in its maternal gene pool over the past 8,000 years, which, if actually true, would indeed be very surprising, and even shocking.

But the haplogroup assignments of the 52 mitogenomes are reported in the spreadsheet here, and just by eyeballing these results, I can tell that they suggest an influx of foreign ancestry, probably from the Pontic-Caspian steppe, to the South Caucasus after the Early Bronze Age (EBA). Note, for instance, the presence of what appear to be typically steppe haplogroups U4a, U2e1e and U5a1b in the samples dated to the Middle Bronze Age (MBA), Late Bronze Age (LBA) and Early Iron Age (EIA), respectively.


Margaryan and Derenko et al., Eight Millennia of Matrilineal Genetic Continuity in the South Caucasus, Current Biology 27, 1–6 July 10, 2017, DOI: 10.1016/j.cub.2017.05.087

Tuesday, July 4, 2017

Out-of-India chickens coming home to roost

Razib has posted a spacious but none-too-technical review of the ongoing Aryan Invasion Theory (AIT) controversy, along with some personal anecdotes and predictions about how ancient DNA from South Asia might shape the debate in the near future (see here).

It should be a useful guide to the topic for those of you who aren't quite as excited about reading about my latest adventures with qpGraph as many of the regulars in the comments here.

One thing that I'd perhaps add to Razib's post is that the ancient DNA record now boasts Late Neolithic Yamnaya-like Corded Ware Culture individuals from the East Baltic region that belong to Y-haplogroup R1a-Z645. And that's usually as far as their lineages go (see here).

This is important, because the Z645 mutation is directly and recently ancestral to the pair of likely post-Neolithic mutations that define the two R1a subclades most common in Europe and South Asia today: Z282 and Z93, respectively.

So not only are the "European" R1a-Z282 and "South Asian" R1a-Z93 relatively young sister clades, but their ancestral clade has now been found in ancient samples from Northeastern Europe that probably predate their appearance by only a few generations, if that. Of course, the upshot of all of this is that R1a-Z93 could not have originated very far from the East Baltic, which makes South Asia look about as likely as the homeland of this subclade as the goddamn moon. Conversely, it makes AIT look very plausible indeed.

However, granted, this might seem very confusing to anyone who hasn't been studying the R1a topology for years, and perhaps better left out of the more mainstream debates on AIT for the sake of simplicity. By the way, I found this part of Razib's post especially intriguing:

One scientist who holds to the position that most South Asian ancestry dates to the Pleistocene argued to me that we don’t know if ancient Indian samples from the northwest won’t share even more ancestry than the Iranian Neolithic and Pontic steppe samples. In other words, ANI was part of some genetic continuum that extended to the west and north. This is possible, but I do not find it plausible.

I suspect that this scientist's rather fanciful suggestion (which really flies in the face of very solid models based on ancient genomic data from Europe and surrounds) is a hint of the direction that the debate will take right after the publication of ancient genomes from South Asia. Because when that happens, obfuscators like this guy (usually hopeless Out-of-India proponents) will either have to concede defeat and quit the debate, or ramp up their obfuscations to spectacular new highs.

And please don't mistake my confidence on this issue for bluff and bluster. It's not exactly the best kept secret out there that ancient samples from India and Pakistan are now ready, and...oops I probably can't say more than that for now. Pity.

See also...

Ancient herders from the Pontic-Caspian steppe crashed into India: no ifs or buts

Indian confirmation bias

Europeans: genetically homogeneous on a global scale

From SMBE 2017 via benmpeter on Twitter:

Also at SMBE 2017, David Reich is "sad to leave space of f-statistics", presumably because they don't offer enough resolution when analyzing more recent ancient data from such genetically homogeneous regions as Europe. Via jgschraiber on Twitter.

Update 04/07/2017: A PDF of the Benjamin Peter poster is available at figshare here (30MB).

See also...

SMBE 2017 abstracts

Matters of geography

Monday, July 3, 2017

The Indo-Europeanization of South Asia: migration or invasion?

The recent avalanche of ancient DNA data from across Eastern Europe, including modern-day Bulgaria, Estonia, Latvia, Romania, Ukraine and western Russia, has revealed prehistoric hunter-gatherer populations indigenous to the region harboring a remarkable diversity in Y-chromosome lineages belonging to haplogroups R1, R1a and R1b.

Neolithic transition in the Baltic

Baltic Corded Ware: rich in R1a-Z645

The genetic history of Northern Europe

The genomic history of Southeastern Europe

A few more ancient genomes from the Balkans and Iberia

So the once popular idea that these Y-haplogroups were instead native to Central Asia, the Near East and/or South Asia now looks very wrong.

R1a probably first arrived in South Asia during the Bronze Age with highly mobile Yamnaya-related pastoralists. These people were expanding in almost all directions from the Pontic-Caspian steppe at the time, and it's difficult to imagine that they weren't the ones who first spread Indo-European languages to peninsular Europe and the Indian subcontinent.

It's likely that almost all interested parties will soon agree that this was indeed the case. So the focus in the debate on the expansion of the Indo-Europeans, including Indo-Aryans, into South Asia will soon have to shift from whether it actually happened to how it happened. For instance, was it simply a migration or a potentially violent invasion?

I already strongly believe that it was an invasion, or rather a series of invasions. I'll change my mind if, at the end of the day, the evidence says otherwise. But if you favor a migration scenario, then consider these points:

- the population in the northern part of the Indian subcontinent during the Bronze Age, even after the collapse of the Indus Civilization, was likely to have been very large for its time, and yet there was a massive pulse of admixture across South Asia from the steppe and a turnover in Y-chromosomes, especially amongst the ruling classes, suggesting that something very dramatic took place that had a major impact on the social and political fabric of the region

- early Indo-Europeans in the Near East, from the Hittites to the Scythians, are often recorded as warlike and expansionist, with a habit of invading and subjugating other peoples, like the Hattians, Hurrians and Mitanni (who apparently ended up with an Aryan elite)

- if early Indo-Europeans outside of South Asia had a penchant for invasions, then there's no reason to believe that the M.O. of the early Indo-Europeans in South Asia would have been any different, unless some sort of direct empirical evidence says so, but what kind of direct empirical evidence?

Please note, I agree that the suggestion of a potentially violent invasion of South Asia by Indo-Europeans, and, indeed, Aryans, sounds provocative, and will always be politically controversial no matter how much evidence is gathered in its favor. But what if it really happened?

See also...

Ancient herders from the Pontic-Caspian steppe crashed into India: no ifs or buts

Indian confirmation bias

Friday, June 30, 2017

At the half-way mark

It's been a huge first six months of the year, with the publication of at least five major ancient DNA preprints and papers (depending on how you define major in this context). Here are the five most popular posts at this blog in 2017 thus far:

Ancient herders from the Pontic-Caspian steppe crashed into India: no ifs or buts - over 10,000 hits and counting

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but... - over 10,000 hits

The Bell Beaker Behemoth (Olalde et al. 2017 preprint) - almost 7,000 hits

Latest on Bell Beaker and Corded Ware - almost 6,000 hits

The genomic history of Southeastern Europe (Mathieson et al. 2017 preprint) - almost 6,000 hits

All of these posts are, one way or another, concerned with what ancient DNA says about the expansions of the Proto-Indo-Europeans and/or Indo-Aryans. In other words, the combo of ancient DNA and the Indo-Europeans is what really brings in the crowds here. Conversely, as far as I can tell based on the blog stats, few people nowadays care much about population history papers based solely on present-day DNA.

So what are we to expect in the second half of 2017? Probably quite a lot, including all of that awesome genotype data from the Olalde et al. and Mathieson et al. preprints, as well as a few more ancient DNA preprints and papers. I'm pretty sure that we'll soon see a paper on the origins of the Minoans and Mycenaeans, and another one on the population history of South Asia, with samples from Harappan and Swat culture sites. Somewhere amongst all of that there will also probably be genomes from BMAC and Maikop. Below, a pic of South Asia and surrounds courtesy of NASA.

It's hard to predict what will happen in the comments here when the paper on South Asia comes out. But apparently there are five stages of grief (denial, anger, bargaining, depression and acceptance), and I expect our anti-Kurgan and anti-Aryan Invasion/Migration regulars to go through all five of these stages before they finally accept reality as dictated by the ancient DNA from South Asia. It'll be a hoot whatever happens. So please stay tuned, and remember to behave in the comments.

SMBE 2017 abstracts

The abstract book is available here. Lots of interesting stuff this year, although nothing really earth-shattering as far as I can see, and a couple of the ancient DNA talks are based on preprints that have already appeared at bioRxiv. I'd check out these talks:

Genome wide data from the Iron Age provides insights into the population history of Finland

Lamnidis et al.

Abstract: The population history of Finland is subject of an ongoing debate, in particular with respect to the relationship and origins of modern Finnish and Saami people. Here we analyse genome-wide data, extracted from three teeth found in the archaeological site of Levänluhta, in southern Ostrobothnia. The site dates back to the Iron Age between 550-800 AD, according to the artefacts recovered, while radiocarbon dating on scattered femurs from the site span 350-730 AD. When analysed together with previously published ancient European samples and with modern European populations, the ancient Finnish samples lack a genetic component found in early Neolithic Farmers and all modern European populations today. Instead, we find that they are more closely related to modern Siberian and East Asian populations than modern Finnish are, a pattern also observed in genetic data from modern Saami. Our results suggest that the ancestral Saami population 1500 years ago, inhabited a larger region than today, extending as far south as Levänluhta. Such a scenario is also supported by linguistic evidence suggesting most of Finland to have been speaking Saami languages before 1000 AD. We also observe genetic differences between modern Saami and our ancient samples, which are likely to have arisen due to admixture with Finnish people during the last 1500 years.

40,000-year-old individual from Asia provides insight into early population structure in Eurasia

Yang et al.

Abstract: To date, very few ancient genomic studies have been conducted in Asia. Genome-wide studies using ancient individuals from Europe have revealed complex ancestry and genetic structure in ancient populations that could not be observed studying only present-day populations, suggesting similar approaches may also aid in elucidating the demographic history in Asia. Here, we present genome-wide data for a 40,000-year-old individual from Tianyuan Cave near Beijing, China. We show that he is more related to present-day Asians than present-day and ancient Europeans. However, unlike present-day Asians, he shows potential relationships with some present-day South Americans and a 35,000-year-old European individual. Our results suggest that there was extensive population structure in Asia by 40,000 years ago that persisted over an extended period of time.

Bridging the Divide Between Modern and Ancient DNA

David Reich

Abstract: Genome-wide studies of human variation have for the most part focused either on DNA from present-day individuals, or from individuals who lived prior to 4,000 years ago. However, developing a detailed understanding of how the peoples who lived in the early Bronze Age contributed to Iron Age populations who in turn contributed to Medieval populations who in turn contributed to people living today, has been difficult. One challenge is that by the beginning of the Bronze Age (at least in Western Eurasia where the most ancient DNA data have been collected), the ancestry composition of many populations was very similar to that of populations that live in the same regions today. As a result, the powerful methods that have been developed for learning about population history based on allele frequency correlation patterns are sometimes not able to discern the often subtle differences in ancestry composition between past populations. In this talk, I will describe work in which my colleagues and I have tried to begin to bridge this divide, both by studying ancient samples from intermediate time points, and by deploying more sensitive statistical methods.

See also...

Europeans: genetically homogeneous on a global scale

Thursday, June 29, 2017

52 ancient mitogenomes from the South Caucasus (Margaryan and Derenko et al. 2017)

Over at Current Biology at this link. I'm having a look now to see how these new data compare to ancient mtDNA from the Eneolithic/Bronze Age steppe.

Summary: The South Caucasus, situated between the Black and Caspian Seas, geographically links Europe with the Near East and has served as a crossroad for human migrations for many millennia [1, 2, 3, 4, 5, 6, 7]. Despite a vast archaeological record showing distinct cultural turnovers, the demographic events that shaped the human populations of this region is not known [8, 9]. To shed light on the maternal genetic history of the region, we analyzed the complete mitochondrial genomes of 52 ancient skeletons from present-day Armenia and Artsakh spanning 7,800 years and combined this dataset with 206 mitochondrial genomes of modern Armenians. We also included previously published data of seven neighboring populations (n = 482). Coalescence-based analyses suggest that the population size in this region rapidly increased after the Last Glacial Maximum ca. 18 kya. We find that the lowest genetic distance in this dataset is between modern Armenians and the ancient individuals, as also reflected in both network analyses and discriminant analysis of principal components. We used approximate Bayesian computation to test five different demographic scenarios explaining the formation of the modern Armenian gene pool. Despite well documented cultural shifts in the South Caucasus across this time period, our results strongly favor a genetic continuity model in the maternal gene pool. This has implications for interpreting prehistoric migration dynamics and cultural shifts in this part of the world.

Margaryan and Derenko et al., Eight Millennia of Matrilineal Genetic Continuity in the South Caucasus, Current Biology 27, 1–6 July 10, 2017, DOI: 10.1016/j.cub.2017.05.087

Update 30/06/2017: There are only 10 mitogenomes in this paper that are older than or contemporaneous with the Yamnaya culture of Eastern Europe (Neolithic to Early Bronze Age). But it's pretty clear that the sampled ancient groups could not have contributed maternal ancestry to the Yamnaya people, because most of their mtDNA haplogroups and/or subclades look unusual in the context of the mtDNA diversity of Eneolithic/Bronze Age steppe groups (bolded results below). For more on the controversy surrounding the "southern" ancestry in Yamnaya, see here and here.

H2+152 5900-5600 BC
H15a1 5900-5600 BC
I1 5925-5717 BC
U8b1a1 4486-4320 BC
K3 3000-2800 BC
R1a1 3000-2800 BC
J1b1b1 3000-2800 BC
K3 3039-2864 BC
H14b2 3449-3091 BC
T1 3500-3200 BC

Interestingly, what isn't mentioned in this paper is that the post-Early Bronze Age (EBA) maternal gene pool in the South Caucasus appears to have been influenced by migrations from the Pontic-Caspian steppe. Note, for instance, the presence of haplogroups U4a, U2e1e and U5a1b in the samples dated to the Middle Bronze Age (MBA), Late Bronze Age (LBA) and Early Iron Age (EIA), respectively. These markers look more at home on the steppe than in the South Caucasus.

See also...

Armenian confirmation bias

Strong mitogenomic continuity on the Armenian Plateau since the early Neolithic

Wednesday, June 28, 2017

Iron Age nomads vs Bronze Age herders: Sarmatians and Yamnaya in qpGraph

If we are to take these qpGraph models fairly literally, and I don't see why not, since they're very tight fits overall, then the early Sarmatians from what is now Pokrovka, Russia, derived almost 80% of their ancestry from Yamnaya or a very closely related group, while the rest of their ancestry came from a source that was a ~50/50 mixture between Han-like East Asians and a population closely related to Neolithic and Chalcolithic farmers from what is now Iran.
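To make the arithmetic behind those proportions explicit, here's a minimal sketch in Python. It is not qpGraph output: the weights are just the rounded figures quoted above (0.80 for "almost 80%", 0.50 for "~50/50"), and the function and variable names are my own placeholders. It simply multiplies the admixture weights along each path leading into the Sarmatian node.

```python
# Assumed, rounded edge weights taken from the text above; not actual qpGraph output.
YAMNAYA_RELATED = 0.80                      # "almost 80%"
OTHER_SOURCE = 1.0 - YAMNAYA_RELATED        # the remaining ancestry
HAN_LIKE_SHARE = 0.50                       # "~50/50" split inside the other source
IRAN_RELATED_SHARE = 1.0 - HAN_LIKE_SHARE


def sarmatian_ancestry():
    """Multiply the weights along each path from a source to the Sarmatian node."""
    return {
        "Yamnaya-related": YAMNAYA_RELATED,
        "Han-like East Asian": OTHER_SOURCE * HAN_LIKE_SHARE,
        "Iran Neolithic/Chalcolithic-related": OTHER_SOURCE * IRAN_RELATED_SHARE,
    }


for source, fraction in sarmatian_ancestry().items():
    print(f"{source}: {fraction:.0%}")
```

Under these rounded inputs, the Han-like and Iran-related contributions each come out at roughly a tenth of the total. The real coefficients in the graphs will differ somewhat from these illustrative values.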

This topology also tests for the same Iran Neolithic/Chalcolithic-related input in Yamnaya, and I think it's very important to note that the relevant admixture edges (D7 to D9) are 0%, which suggests that Yamnaya did not harbor this type of ancestry. I didn't bother testing for East Asian-related admixture in Yamnaya in the same way, because it never shows such signals in other analyses.

The clearly more complex ancestry of the Sarmatians is probably best explained by the fact that they belonged to a true nomadic warrior culture, and indeed one that managed to spread its influence across vast stretches of Eurasia. So these two Sarmatian individuals, both from Unterlander et al. 2017, may have had recent ancestors from as far afield as Central Asia and Siberia. On the other hand, Yamnaya was a semi-nomadic pastoralist population, and although also highly mobile and prone to long-distance expansions, probably not as mobile as the Sarmatians.

Update 30/06/2017: Interestingly, adding Siberian Upper Paleolithic genome MA1 to the topology in the main model slightly shifts the admixture coefficients for Yamnaya, resulting in an arguably more accurate outcome in which it's modeled as a 50/50 mixture of populations closely related to Eastern European and Caucasus Mesolithic foragers.

See also...

Ancient herders from the Pontic-Caspian steppe crashed into India: no ifs or buts

Monday, June 26, 2017

Matters of geography

The steppe north of the Black Sea in Ukraine has basically always been considered part of Europe, and just over 100 years ago some guy with a map decided that the steppe between the eastern coast of the Black Sea in Russia and the Ural River in western Kazakhstan should also be Europe.

So nowadays, right or wrong, it's generally accepted that the entire steppe region west of the Ural River, known as the Pontic-Caspian steppe, is in Eastern Europe. Here's a map courtesy of Wikipedia showing how the official boundary between Eastern Europe and Asia has shifted since the 18th century.

But this decision wasn't entirely arbitrary, because the current boundary between Eastern Europe and Asia by and large follows several major geographic barriers, including the Caucasus Mountains, the Caspian Sea and the Ural Mountains. It'd be hard to argue that these barriers haven't had a profound impact across the ages on the character of Europe and its people, and this has probably been known for well over a couple hundred years.

For instance, if we're to trust the most common interpretations of the works of ancient geographers like Hecataeus and Herodotus, then their worlds in some important ways resembled the typical Principal Component Analysis (PCA) of West Eurasian genetic variation. And it seems that they had a pretty good idea where both the strong continental boundaries and fuzzy areas were located.

Below, on the geographic map inspired by Herodotus, Europa or Europe is delineated from much of Asia by the Black Sea, the Caucasus Mountains and the Caspian Sea, while on the genetic map, most European and Asian populations form two, more or less parallel, clusters fairly cleanly separated by empty space (this was first noted in Lazaridis et al. 2013). Indeed, this empty space is the work of the Black Sea, the Caucasus Mountains and the Caspian Sea acting as rather effective barriers to gene flow between Eastern Europe and Asia (see Yunusbayev et al. 2012).

However, on the genetic map, the Iranic Scythians of the Asian steppes straddle my somewhat arbitrary red line separating Europa and Asia, and this is echoed on the Herodotus map by Iranic and related peoples like the Massagetae and Issedones, who inhabit the seemingly undefined part of the world between Europa and Asia east of the Caspian Sea (Mare Caspium).

Nothing really groundbreaking, but pretty cool stuff.

On a related note, I've seen the term "mainland Europe" used recently in at least one of the big ancient DNA papers to describe the part of Europe west of the Pontic-Caspian steppe. It seems that the authors wanted to underline the fairly stark genetic difference that existed between most of Europe and the steppe just prior to the expansion of Yamnaya and related steppe herder groups that initiated the formation of the present-day European gene pool.

I can see why they did this, but to my mind they got things backwards. That's because the term mainland implies the opposite of island and/or peninsula, and of course the part of Europe west of the Pontic-Caspian steppe is a relatively narrow strip of land surrounded by water, so it's a peninsula. Let's visualize these two models on a map of Europe courtesy of Wikipedia:

I understand that my model might result in heart palpitations for some readers, especially those from Western Europe, who generally see their part of Europe as core Europe, but I feel that it makes good sense from a purely geographic POV.

See also...

Europeans: genetically homogeneous on a global scale

Monday, June 19, 2017

Ancient herders from the Pontic-Caspian steppe crashed into India: no ifs or buts

It's now more than obvious that South Asia experienced an almighty pulse of admixture from an Early Bronze Age (EBA) population originally from somewhere on the Pontic-Caspian steppe in Eastern Europe. This is fairly easy to demonstrate thanks to ancient DNA from Europe and West Asia. One way of doing it is with the qpGraph algorithm.

Moreover, the widespread presence of Y-chromosome haplogroup R1a in South Asia is, at least in large part, linked to this event, because:

- Mesolithic Eastern European foragers belonging to basal clades of R1a do not show any South Asian or even Near Eastern ancestry, so it's likely that R1a is native to Eastern Europe and surrounds

- If R1a is native to Eastern Europe then it can't also be native to South Asia, which is not only thousands of miles away, but also ecologically a different world

- The most common R1a subclades in the world today, R1a-M417 and one of its main daughter branches R1a-Z93, appear in Late Neolithic and Bronze Age European pastoralist groups (Corded Ware, Srubnaya and closely related peoples) that harbor high levels of Eastern European forager ancestry and no signs of South Asian admixture

- Practically 100% of the R1a in South Asia today belongs to the R1a-Z93 subclade, which, based on full Y-chromosome sequencing data, looks like it began expanding rapidly only during the EBA, eventually making its way to South Asia, and this is in line with the available ancient DNA evidence

- In South Asia, R1a and ancient steppe admixture peak in groups that speak Indo-European, including Indo-Aryan, languages, suggesting that both are genetic signals of the Indo-European expansions into the Indian subcontinent

So we're now at a stage where anyone with at least moderate thinking capacity, whose mind isn't poisoned by extreme bias, has to agree that there was a rather large movement of people from the Eurasian steppes into South Asia during the Bronze Age. No ifs or buts.

Ancient DNA from South Asia is on the way. It might throw up a few surprises and force a new model of how the Indo-Europeans and R1a got to South Asia, but it won't turn things upside down. In other words, don't expect the Out-of-India or "indigenous Aryans" theory to suddenly come into the picture as a viable alternative to the Aryan Invasion Theory (AIT), occasionally presented as the more politically correct Aryan Migration Theory (AMT).

Many Indians still don't get this, or rather they refuse to get it, which is very frustrating, especially if you're a regular in the comments section here. But admittedly it can also be very entertaining.

Last week The Hindu published an interesting piece on the latest developments in South Asian population genetics that were making the AIT, or at least AMT, look like a sure thing:

How genetics is settling the Aryan migration debate

Soon after came this peculiarly titled retort in the Swarajya online magazine, in which unfortunately it's impossible to find a single coherent argument:

Genetics Might Be Settling The Aryan Migration Debate, But Not How Left-Liberals Believe

Generally hilarious stuff, except the parts where the author abuses blogger Razib Khan for moving with the latest genetic data and arguing in favor of the Aryan expansion into India (see here and here).

So what are we to expect when the first big paper with ancient DNA from South Asia comes out, probably in the next few months? For starters, accusations of racism and maybe even hate speech against anyone who claims that the results support the AIT or AMT, or anything even close. And lots of shouting and carrying on. But also a lot more comic relief.

See also...

The Out-of-India Theory (OIT) challenge: can we hear a viable argument for once?

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Friday, June 16, 2017

Cypriot Y-chromosomes (Heraclides et al. 2017)

Over at PLoS ONE at this link. Note the fairly high levels of Y-haplogroups R1a and/or R1b in many of the Greek and Turkish populations in the figure below. Much of this might be of fairly recent European (mostly Slavic) and Central Asian (Turkic nomad and Ottoman) provenance, but I'd say some of it has to date back to the Bronze Age, and potentially to the expansions of the Proto-Anatolians, Proto-Armenians and Proto-Greeks into the Balkans and Anatolia from the Pontic-Caspian steppe. Emphasis is mine:

Abstract: Genetics can provide invaluable information on the ancestry of the current inhabitants of Cyprus. A Y-chromosome analysis was performed to (i) determine paternal ancestry among the Greek Cypriot (GCy) community in the context of the Central and Eastern Mediterranean and the Near East; and (ii) identify genetic similarities and differences between Greek Cypriots (GCy) and Turkish Cypriots (TCy). Our haplotype-based analysis has revealed that GCy and TCy patrilineages derive primarily from a single gene pool and show very close genetic affinity (low genetic differentiation) to Calabrian Italian and Lebanese patrilineages. In terms of more recent (past millennium) ancestry, as indicated by Y-haplotype sharing, GCy and TCy share much more haplotypes between them than with any surrounding population (7–8% of total haplotypes shared), while TCy also share around 3% of haplotypes with mainland Turks, and to a lesser extent with North Africans. In terms of Y-haplogroup frequencies, again GCy and TCy show very similar distributions, with the predominant haplogroups in both being J2a-M410, E-M78, and G2-P287. Overall, GCy also have a similar Y-haplogroup distribution to non-Turkic Anatolian and Southwest Caucasian populations, as well as Cretan Greeks. TCy show a slight shift towards Turkish populations, due to the presence of Eastern Eurasian (some of which of possible Ottoman origin) Y-haplogroups. Overall, the Y-chromosome analysis performed, using both Y-STR haplotype and binary Y-haplogroup data puts Cypriot in the middle of a genetic continuum stretching from the Levant to Southeast Europe and reveals that despite some differences in haplotype sharing and haplogroup structure, Greek Cypriots and Turkish Cypriots share primarily a common pre-Ottoman paternal ancestry.


Y-haplogroup frequencies within GCy and TCy can be found in S6 Table. Y-haplogroup frequencies of Cypriots, Greeks, and Turks, as well as other surrounding populations can be found in Fig 1 (as well as S7 Table). GCy and TCy showed very similar frequencies for the major Y-haplogroups, differentiating both from Greek and Turkish sub-populations (Fig 3). The most frequent major Y-haplogroup subclade in both GCy and TCy was J2a-M410 (23.8% and 20.3% among GCy and TCy, respectively), followed by E-M78 (12.8% Vs 13.9%) and G2-P287 (12.5% Vs13.7%). R1b-M343 was found in higher frequency among GCy (11.9%) than TCy (6.8%), while the same applies for E-M123 (13.1% Vs 6.3%). Finally, haplogroup, although in much lower frequencies than the aforementioned haplogroups, haplogroup I2 was somewhat higher among TCy (6.8%), than among GCy (2.3%), while haplogroup J2b was higher among GCy (5.8%) than TCy (1.8%). Other, less common haplogroups (i.e. I1, R1a, L, and T) showed similar frequencies (in the range of 1–5%) between GCy and TCy.

One additional difference between GCy and TCy was the presence of moderate numbers of East Eurasian (primarily Central Asian) Y-haplogroups and small numbers of North African Y-haplogroups among TCy but not among GCy. The frequency of East Eurasian haplogroups among TCy was C-M130 (0.5%), H-L901 (0.3%), N-M231 (2.4%), O-M175 (0.8%) and Q-M242 (1.3%), reaching a total of 5.6%, but only totalling 0.6% among GCy. North African haplogroups (E-M81, E-V38) were only found among TCy (2.1%) (S6 and S7 Figs).

A major feature differentiating Cypriots from Greeks, is the much lower frequency of haplogroups I (2.9% GCy, 7.3% TCy, ~10–21% mainland Greeks) and R1a (2.9% GCy, 3.2% TCy, ~10–22% mainland Greeks) among the former. All differences in haplogroup frequencies between populations were statistically significant (Fisher’s Exact test, p<0.001).


In terms of Y-haplogroup distribution, Cypriots (GCy and TCy) show substantial differences from Greeks, characterized by much lower frequency of haplogroups I2, R1a, and R1b in the former. These haplogroup differences indicate differential migrations into Cyprus and mainland Greece, at different points in history and prehistory. I2 is considered the major haplogroup among Mesolithic European Hunter-Gatherers[60], who apparently were either absent from Cyprus or were totally diluted (nearly extinguished) by subsequent migrations. Although the exact origins and migratory patterns of R1a and R1b are still under rigorous investigation, it seems that they are linked to Bronze Age migrations from the Western Eurasian Steppe and Eastern Europe into Southern (including Greece) and Western Europe[61]. Apparently, such migrations (especially as regards R1a) into Cyprus were limited.

Additionally, the Greek population has received considerable migrations during the Byzantine era and the Middle Ages from other Balkanic populations, such as Slavs[62,63], Aromanians (Vlachs)[64], and Albanians (Arvanites)[65,66]. The former, is very likely to have increased R1a frequencies among Greeks. In fact, Fig 3 (also S7 Table) indicate that R1a increases gradually with increasing latitude in Greece. There is no historical evidence for such migrations into Cyprus during the same period.

Heraclides A, Bashiardes E, Fernández-Domínguez E, Bertoncini S, Chimonas M, Christofi V, et al. (2017) Y-chromosomal analysis of Greek Cypriots reveals a primarily common pre-Ottoman paternal ancestry with Turkish Cypriots. PLoS ONE 12(6): e0179474.

Thursday, June 8, 2017

qpGraph open thread

I managed to put together a simple qpGraph model for the Kalash using present-day populations. It's largely based on the model for the Paniya by Nakatsuka et al. (see Supplementary Figure 5 here). The graph and pops files for my model can be downloaded here and here, respectively. I'm now working on a more complex model for the Kalash that includes ancient genomes from Eastern Europe and West Asia.

I'm willing to take a few requests for qpGraph models in the comments below. Please note, however, that these requests will have to be accompanied by graph and pops files, and the graph files must be correctly set out; if they don't work, then they don't work, and you won't get your graph. On the other hand, you only need to supply pops files with the correct populations and I'll do the rest.
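If you want to sanity check your files before posting them, something along these lines might help. It's only a rough sketch in Python, and it rests on two assumptions: that the pops file is plain text with one population name per line, and that the graph file attaches populations to the graph with lines of the form `label <population> <vertex>`. The population names and the helper functions below are hypothetical placeholders, not the contents of my files.

```python
def labelled_populations(graph_text):
    """Collect population names from `label <population> <vertex>` lines."""
    pops = set()
    for line in graph_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "label":
            pops.add(fields[1])
    return pops


def missing_from_graph(pops_text, graph_text):
    """Return populations listed in the pops text that have no label line."""
    wanted = [p.strip() for p in pops_text.splitlines() if p.strip()]
    labelled = labelled_populations(graph_text)
    return [p for p in wanted if p not in labelled]


# Toy inputs with placeholder names; not the files linked in this post.
EXAMPLE_GRAPH = """\
root   R
label  Outgroup  Outgroup
label  PopA      PopA
edge   a  R  Outgroup
edge   b  R  PopA
"""
EXAMPLE_POPS = "Outgroup\nPopA\nPopB\n"

print(missing_from_graph(EXAMPLE_POPS, EXAMPLE_GRAPH))
```

With the toy inputs, PopB is reported because it appears in the pops list but has no label line in the graph. This only catches one kind of mistake; it doesn't check that the edges and admixture events form a valid graph.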

See also...

Ancient herders from the Pontic-Caspian steppe crashed into India: no ifs or buts

Wednesday, June 7, 2017

The pigtailed figures

Reconstructed Proto-Indo-European (PIE) vocabulary suggests that the speakers of PIE, who probably lived on the Pontic-Caspian steppe during the Eneolithic, were familiar with wool. Interestingly, ancient DNA suggests that Near Eastern-related ancestry first appeared on the Pontic-Caspian steppe during the Eneolithic, because Neolithic samples from the Pontic steppe in what is now Ukraine lack this type of admixture. Perhaps it first arrived there with women from south of the Caucasus who knew how to spin wool? Below are a couple of interesting quotes from Becker et al. 2016. Emphasis is mine:

For ancient Mesopotamia McCorriston has proposed a fundamental shift from linen-based to woollen textile production. [4] Drawing on evidence from cuneiform texts as well as faunal and botanical remains, she suggests that it was in the 3rd or perhaps late 4th millennium BCE that wool became the fibre of choice for everyday use. Recent archaeological and archaeozoological research, however, suggests a considerably earlier date, before the advent of writing. Written sources from the mid- to late 3rd millennium BCE demonstrate that sheep and goats were maintained in herds of some dozens to a few hundred and herded in large flocks up to several thousand animals. In fact, cuneiform records provide ample evidence for the usage of wool in textile manufacture, whereas linen appears only rarely. The growth of a large-scale woollen textile industry rested on women as the main source of labour.


During the Late Uruk and Jemdat Nasr periods in Mesopotamia, scenes appear on cylinder seals that have been interpreted as showing textile production carried out by so-called pigtailed figures. [93] A specific raw material cannot be deduced from these depictions, but the substantial number of scenes indicates a significant concern with cloth manufacture.

Becker et al., The Textile Revolution. Research into the Origin and Spread of Wool Production between the Near East and Central Europe, eTopoi, Special Volume (6) 2016, (ISSN 2192-2608)

See also...

A plausible model for the formation of the Yamnaya genotype

A homeland, but not the homeland

Monday, June 5, 2017

Ancient human genomes from Southern Africa (Schlebusch et al. 2017 preprint)

Over at bioRxiv at this LINK. Emphasis is mine:

Abstract: Southern Africa is consistently placed as one of the potential regions for the evolution of Homo sapiens. To examine the region's human prehistory prior to the arrival of migrants from East and West Africa or Eurasia in the last 1,700 years, we generated and analyzed genome sequence data from seven ancient individuals from KwaZulu-Natal, South Africa. Three Stone Age hunter-gatherers date to ~2,000 years ago, and we show that they were related to current-day southern San groups such as the Karretjie People. Four Iron Age farmers (300-500 years old) have genetic signatures similar to present day Bantu-speakers. The genome sequence (13x coverage) of a juvenile boy from Ballito Bay, who lived ~2,000 years ago, demonstrates that southern African Stone Age hunter-gatherers were not impacted by recent admixture; however, we estimate that all modern-day Khoekhoe and San groups have been influenced by 9-22% genetic admixture from East African/Eurasian pastoralist groups arriving >1,000 years ago, including the Ju|'hoansi San, previously thought to have very low levels of admixture. Using traditional and new approaches, we estimate the population divergence time between the Ballito Bay boy and other groups to beyond 260,000 years ago. These estimates dramatically increases the deepest divergence amongst modern humans, coincide with the onset of the Middle Stone Age in sub-Saharan Africa, and coincide with anatomical developments of archaic humans into modern humans as represented in the local fossil record. Cumulatively, cross-disciplinary records increasingly point to southern Africa as a potential (not necessarily exclusive) 'hot spot' for the evolution of our species.

Schlebusch et al., Ancient genomes from southern Africa pushes modern human divergence beyond 260,000 years ago, bioRxiv, Posted June 5, 2017, doi:

Friday, June 2, 2017

The healthy Kurgan pastoralist

Just in at bioRxiv, a new preprint on the genomic health of ancient hominins, at this LINK. Obviously, if it's true that the Yamnaya and other closely related Kurgan culture pastoralists of the ancient Eurasian steppe had unusually healthy genomes, then it becomes easier to understand why they made such a massive impact on the ancestry of present-day Europeans and Central and South Asians, because populations that enjoy good health are likely to grow faster than those that don't. From the preprint, emphasis is mine:

Abstract: The genomes of ancient humans, Neandertals, and Denisovans contain many alleles that influence disease risks. Using genotypes at 3180 disease-associated loci, we estimated the disease burden of 147 ancient genomes. After correcting for missing data, genetic risk scores were generated for nine disease categories and the set of all combined diseases. These genetic risk scores were used to examine the effects of different types of subsistence, geography, and sample age on the number of risk alleles in each ancient genome. On a broad scale, hereditary disease risks are similar for ancient hominins and modern-day humans, and the GRS percentiles of ancient individuals span the full range of what is observed in present day individuals. In addition, there is evidence that ancient pastoralists may have had healthier genomes than hunter-gatherers and agriculturalists. We also observed a temporal trend whereby genomes from the recent past are more likely to be healthier than genomes from the deep past. This calls into question the idea that modern lifestyles have caused genetic load to increase over time. Focusing on individual genomes, we find that the overall genomic health of the Altai Neandertal is worse than 97% of present day humans and that Otzi the Tyrolean Iceman had a genetic predisposition to gastrointestinal and cardiovascular diseases. As demonstrated by this work, ancient genomes afford us new opportunities to diagnose past human health, which has previously been limited by the quality and completeness of remains.


Both the allergy/autoimmune and gastrointestinal/liver disease categories (which share many of the same disease-associated loci) show significantly lower genetic risk in pastoralists than agriculturalists and hunter gatherers. Pastoralists also have significantly reduced risk for cancer compared to agriculturalists. Agriculturalists have a higher genetic risk for dental/periodontal diseases than hunter-gatherers and pastoralists. In general, pastoralists possess extremely healthy genomes, especially for cancers and immune-related, periodontal, and gastrointestinal diseases.


It is unclear why pastoralists would have the lowest risk in these specific disease categories. We caution that this pattern may be the result of technical issues, as pastoralists have the smallest sample size (only 19 individuals) and geographic range (between 40-90°E longitude and 45-55°N latitude, Figure 1B). Because populations that have different subsistence types also differ in other ways, the lower GRS of pastoral populations may be due to other factors, including demographic history.
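
For anyone wondering what a missing-data-corrected genetic risk score might look like in practice, here's a minimal sketch. It is not the preprint's actual method: I'm assuming a simple unweighted score (the fraction of risk alleles among the loci that were successfully genotyped) and a plain percentile against a reference panel. The function names and toy numbers are hypothetical. Real scores usually weight each locus by its effect size, and with only 19 pastoralists the percentiles would be noisy either way.

```python
from typing import Optional, Sequence


def genetic_risk_score(genotypes: Sequence[Optional[int]]) -> float:
    """Fraction of risk alleles among called loci.

    Each entry is the number of risk alleles at a locus (0, 1 or 2),
    or None when the ancient genome has no call there. Dividing by the
    number of called loci is an assumed stand-in for the preprint's
    missing-data correction.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes")
    if any(g not in (0, 1, 2) for g in called):
        raise ValueError("genotypes must be 0, 1, 2 or None")
    return sum(called) / (2 * len(called))


def percentile(score: float, reference: Sequence[float]) -> float:
    """Percentage of reference scores strictly below `score`."""
    if not reference:
        raise ValueError("empty reference panel")
    below = sum(1 for r in reference if r < score)
    return 100.0 * below / len(reference)


# Toy example: four loci, one of them missing in the ancient sample.
ancient = [2, None, 0, 1]
modern_panel = [0.1, 0.2, 0.6, 0.9]  # hypothetical present-day scores

score = genetic_risk_score(ancient)      # 3 risk alleles / 6 possible = 0.5
rank = percentile(score, modern_panel)   # 2 of 4 panel scores are lower = 50.0
```

A higher percentile here means more risk alleles than most of the reference panel, which is the sense in which the abstract describes the Altai Neandertal as worse off than 97% of present-day humans.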

Ali J. Berens, Taylor L. Cooper, Joseph Lachance, The Genomic Health Of Ancient Hominins, bioRxiv, Posted June 2, 2017, doi:

Wednesday, May 31, 2017

A homeland, but not the homeland

It seems increasingly likely that ancient DNA has identified a massive expansion, or a series of expansions, from Mesopotamia and/or surrounds in basically all directions dating to the Chalcolithic (ChL) and Bronze Age (BA). This phenomenon is mainly characterized by the simultaneous spread of:
- Iran_ChL-related genome-wide ancestry

- Y-haplogroup J

- South Caspian-specific mitochondrial haplogroups such as R2 and U7

At least two of these characteristics are shared by five groups that have appeared in the Near Eastern and African ancient DNA record as probable post-Neolithic newcomers, at least in part, at their respective sampling sites:

- Anatolia_BA, Western Turkey, 2836-1800 calBCE (Lazaridis et al. 2017)

- Egyptian mummies, Middle Egypt, 776-2 calBCE (Schuenemann et al. 2017)

- Iran_ChL, Western Iran, 4839-3796 calBCE (Lazaridis et al. 2016)

- Levant_BA, Northwestern Jordan, 2489-1966 calBCE (Lazaridis et al. 2016)

- Sidon_BA, Southern Lebanon, 1750-1600 BCE (Haber et al. 2017)

I'm confident that many more such groups will soon be added to the ancient DNA record, probably including Levant_ChL from the upcoming Harney et al. 2017 (a teaser of the paper can be seen here). Below, a map of Mesopotamia courtesy of Wikipedia.

It's an interesting and important question who these likely Mesopotamian migrants and their descendants were in terms of linguistic affinities. It seems that they left a massive genetic imprint on the Near East and much of North Africa, and perhaps also Central Asia and Southeastern Europe, so they probably also left some sort of linguistic legacy.

Obviously, it's highly improbable that most of them were Indo-European speakers. So if most of them weren't Indo-Europeans, then the phenomenon I'm describing here can't be related to the Proto-Indo-European (PIE) expansion. Forget the idea of a West Asian linguistic hot spot spewing out different, distantly related language families, including Indo-European, via the migrations of closely related Iran_ChL-like populations over a span of a few thousand years; it's plain stupid.

So who were they?

See also...

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Tuesday, May 30, 2017

Ancient Egyptians less Sub-Saharan than present-day Egyptians

Over at Nature Communications at this LINK. Emphasis is mine:

Abstract: Egypt, located on the isthmus of Africa, is an ideal region to study historical population dynamics due to its geographic location and documented interactions with ancient civilizations in Africa, Asia and Europe. Particularly, in the first millennium BCE Egypt endured foreign domination leading to growing numbers of foreigners living within its borders possibly contributing genetically to the local population. Here we present 90 mitochondrial genomes as well as genome-wide data sets from three individuals obtained from Egyptian mummies. The samples recovered from Middle Egypt span around 1,300 years of ancient Egyptian history from the New Kingdom to the Roman Period. Our analyses reveal that ancient Egyptians shared more ancestry with Near Easterners than present-day Egyptians, who received additional sub-Saharan admixture in more recent times. This analysis establishes ancient Egyptian mummies as a genetic source to study ancient human history and offers the perspective of deciphering Egypt’s past at a genome-wide level.

Schuenemann et al., Ancient Egyptian mummy genomes suggest an increase of Sub-Saharan African ancestry in post-Roman periods, Nature Communications 8, Article number: 15694 (2017), doi:10.1038/ncomms15694

See also...

A homeland, but not the homeland

Friday, May 26, 2017

Canaanite genomes (Haber et al. 2017 preprint)

Over at bioRxiv at this LINK:

Abstract: The Canaanites inhabited the Levant region during the Bronze Age and established a culture which became influential in the Near East and beyond. However, the Canaanites, unlike most other ancient Near Easterners of this period, left few surviving textual records and thus their origin and relationship to ancient and present-day populations remain unclear. In this study, we sequenced five whole-genomes from ~3,700-year-old individuals from the city of Sidon, a major Canaanite city-state on the Eastern Mediterranean coast. We also sequenced the genomes of 99 individuals from present-day Lebanon to catalogue modern Levantine genetic diversity. We find that a Bronze Age Canaanite-related ancestry was widespread in the region, shared among urban populations inhabiting the coast (Sidon) and inland populations (Jordan) who likely lived in farming societies or were pastoral nomads. This Canaanite-related ancestry derived from mixture between local Neolithic populations and eastern migrants genetically related to Chalcolithic Iranians. We estimate, using linkage-disequilibrium decay patterns, that admixture occurred 6,600-3,550 years ago, coinciding with massive population movements in the mid-Holocene triggered by aridification ~4,200 years ago. We show that present-day Lebanese derive most of their ancestry from a Canaanite-related population, which therefore implies substantial genetic continuity in the Levant since at least the Bronze Age. In addition, we find Eurasian ancestry in the Lebanese not present in Bronze Age or earlier Levantines. We estimate this Eurasian ancestry arrived in the Levant around 3,750-2,170 years ago during a period of successive conquests by distant populations such as the Persians and Macedonians.


However, the present-day Lebanese, in addition to their Levant_N and Iranian ancestry, have a component (11-22%) related to EHG and Steppe populations not found in Bronze Age populations (Figure 3A). We confirm the presence of this ancestry in the Lebanese by testing f4(Sidon_BA, Lebanese; Ancient Eurasian, Chimpanzee) and find that Eurasian hunter-gatherers and Steppe populations share more alleles with the Lebanese than with Sidon_BA (Figure 3B). We next tested a model of the present-day Lebanese as a mixture of Sidon_BA and any other ancient Eurasian population using qpAdm. We found that the Lebanese can be best modelled as Sidon_BA 93±1.6% and a Steppe Bronze Age population 7±1.6% (Figure 3C; Table S6).
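
For readers new to f4 statistics, here's a rough sketch of what a test like f4(Sidon_BA, Lebanese; Ancient Eurasian, Chimpanzee) computes. The statistic is the average, across SNPs, of the product of two allele frequency differences. The function name and the toy frequencies below are made up for illustration. This also isn't how the authors ran it: real analyses use many thousands of SNPs and dedicated software, and estimate standard errors with a block jackknife.

```python
from typing import Sequence


def f4(a: Sequence[float], b: Sequence[float],
       c: Sequence[float], d: Sequence[float]) -> float:
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d).

    Each argument holds one population's allele frequencies, with all
    four sequences listing the same SNPs in the same order.
    """
    if not (len(a) == len(b) == len(c) == len(d)):
        raise ValueError("all populations need the same number of SNPs")
    if len(a) == 0:
        raise ValueError("no SNPs supplied")
    total = sum((pa - pb) * (pc - pd) for pa, pb, pc, pd in zip(a, b, c, d))
    return total / len(a)


# Toy allele frequencies at four SNPs (hypothetical values).
sidon_ba = [0.2, 0.8, 0.5, 0.1]
lebanese = [0.3, 0.7, 0.5, 0.2]
steppe   = [0.9, 0.1, 0.5, 0.8]
chimp    = [0.0, 1.0, 0.0, 0.0]

stat = f4(sidon_ba, lebanese, steppe, chimp)
print(f"f4(Sidon_BA, Lebanese; Steppe, Chimp) = {stat:.3f}")
```

In this toy setup the statistic comes out negative, because the Lebanese frequencies are shifted towards the steppe ones relative to Sidon_BA. That's the direction of the signal described in the quote: the Lebanese share more alleles with the steppe population than Sidon_BA does.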

Haber et al., Continuity and admixture in the last five millennia of Levantine history from ancient Canaanite and present-day Lebanese genome sequences, bioRxiv, Posted May 26, 2017, doi:

See also...

Yamnaya-related ancestry proportions in Europe and west Asia

A homeland, but not the homeland

Thursday, May 25, 2017

A few more ancient genomes from the Balkans and Iberia

Open access at Current Biology:

Our results show major Western hunter-gatherer (WHG) ancestry in a Romanian Eneolithic sample [GB1_Eneo] with a minor, but sizeable, contribution from Anatolian farmers, suggesting multiple admixture events between hunter-gatherers and farmers.

González-Fortes et al., Paleogenomic Evidence for Multi-generational Mixing between Neolithic Farmers and Mesolithic Hunter-Gatherers in the Lower Danube Basin, Current Biology, Published Online: May 25, 2017, DOI:

See also...

The genomic history of Southeastern Europe (Mathieson et al. 2017 preprint)

Anywhere but the steppe

Last week Scientific Reports put out a paper by Sarno et al. on the population history of Sicily and South Italy. I didn't blog about it at the time because I felt that it was generally a weak effort and not worth advertising. But people keep bringing it up in the comments section, so here goes.

If you download the PDF and do a search for "Africa", you'll see that the only time it comes up is in the bibliography. "Maghreb" doesn't come up at all.

Can anyone explain this? I can't. If you're doing a paper on the population history of Sicily and South Italy and you don't take a close look at the fairly recent North African admixture there, then at best you're naive and confused.

Also, the authors try to enter the Proto-Indo-European (PIE) homeland debate. They basically argue that Indo-European (IE) languages could not have arrived in Southeastern Europe from the Pontic-Caspian steppe because modern-day Southeastern Europeans overall don't pack much Bronze Age steppe admixture. They also claim that based on their admixture dating efforts (which may or may not be accurate) the steppe ancestry by and large arrived in the east Mediterranean during the early Middle Ages with Slavic migrations. Thus, they suggest that a better PIE homeland alternative to the Pontic-Caspian steppe might be West Asia.

These are very weak arguments for a number of important reasons. For instance, language change can happen without massive migrations from afar. Case in point: the Etruscans were a sizable non-IE-speaking population in Southeastern Europe until historic times, and they discarded Etruscan in favor of IE Latin after being subsumed into the Roman Empire. Indeed, Southeastern Europe has been a bit of a hotspot for this type of thing; Razib has a little more on that and the admixture dating here.

Also worth positing is the likely scenario in which much of the Bronze Age steppe ancestry in Southeastern Europe has been diluted by more recent admixture from the Near East and North Africa. It's hard to say for sure to what extent without direct evidence from ancient DNA, but this is something that should have been considered in the paper.

I won't be blogging much from now on about population history papers based on modern-day samples, because such papers aren't usually worth blogging about.


Sarno et al., Ancient and recent admixture layers in Sicily and Southern Italy trace multiple migration routes along the Mediterranean, Scientific Reports 7, Article number: 1984, (2017), doi:10.1038/s41598-017-01802-4

See also...

Late PIE ground zero now obvious; location of PIE homeland still uncertain, but...

Sunday, May 21, 2017

Steppe invaders in the Bronze Age Balkans

In a recent blog post announcing the end of the search for the Late Proto-Indo-European (PIE) homeland I wrote this:

But of course I2a has also been recorded in prehistoric samples from the Pontic-Caspian steppe. So, you might ask, why did the populations migrating out of the steppe belong to R1a and R1b, and why did some of them seemingly carry only R1a while others only R1b? This can be explained by local founder effects on the steppe due to patrilocality. Moreover, it's possible that some groups moving out of the steppe did carry high frequencies of I2a, but they're yet to enter the ancient DNA record.

Actually, in hindsight, such a population has probably already shown up in the ancient DNA record, via two Early Bronze Age (EBA) individuals from the Balkans in the Mathieson et al. 2017 preprint:

Balkans_BronzeAge I2165: Y-hg I2a2a1b1b mt-hg T2f 3020-2895 calBCE

Yamnaya_Bulgaria Bul4: Y-hg I2a2a1b1b mt-hg ? 3012-2900 calBCE

Both samples are from burial sites in present-day South-Central Bulgaria. Apart from sharing I2a2a1b1b, they each pack a fair bit of Yamnaya-related ancestry and are dated to a very similar time period. Unlike Bul4, I2165 does not make the cut archaeologically as a Yamnaya sample, but he does come from a Tumulus (Kurgan-like) burial, so perhaps he's from a group influenced by Yamnaya?

By the way, the I2a2a1b1b lineage is also shared by Yamnaya_Kalmykia RISE552, and as far as I can tell, the oldest individual sampled to date belonging to this line is Ukraine_Neolithic I1738, dated to 5473-5326 calBCE. So I2a2a1b1b appears to be a Pontic-Caspian steppe marker.

The same paper also includes the following individual from present-day Bulgaria dated to the start of the Late Bronze Age (LBA), which is roughly when the Mycenaeans appeared nearby in what is now Greece:

Bulgaria_MLBA I2163: Y-hg R1a1a1b2 mt-hg U5a2 1750-1625 calBCE

This guy is the most Yamnaya-like of all of the Balkan samples in Mathieson et al. 2017, and, as far as I can see based on his overall genome-wide results, probably indistinguishable from the contemporaneous Srubnaya people of the Pontic-Caspian steppe. He also belongs to Y-haplogroup R1a-Z93, which is a marker typical of Srubnaya and other closely related steppe groups such as Andronovo, Potapovka and Sintashta. So there's very little doubt that he's either a migrant or a recent descendant of migrants to the Balkans from the Pontic-Caspian steppe.

The presence of multiple individuals like this in the still rather spotty Balkan Bronze Age ancient DNA record suggests that this part of Europe experienced sustained and possibly at times large scale incursions of various peoples from the Pontic-Caspian steppe throughout the Bronze Age.

Here's one of the Principal Component Analysis (PCA) plots from Mathieson et al. 2017, edited by me to highlight the above mentioned three samples, as well as the anything but weak impact of gene flow from the Pontic-Caspian steppe on the Balkans during the Bronze Age. Just in case some of you are confused, I added an arrow pointing to the cluster that most of the Balkan Bronze Age samples are pulling towards.

Of course, many of us are now eagerly awaiting a paper on the genetic origins of the Minoans and Mycenaeans. The latter are among the few attested Indo-European-speaking peoples from prehistory, so their genetic structure may prove pivotal in the Indo-European homeland debate.

I know for a fact that a couple of ancient DNA labs have been working on such a paper for a while now, but I haven't heard anything about the results. However, just looking at the PCA above, I'd be shocked if the Mycenaean samples did not show a strong signal of gene flow from the Pontic-Caspian steppe. If so, the implications of this will be obvious.


Mathieson et al., The Genomic History Of Southeastern Europe, bioRxiv, Posted May 9, 2017, doi: