
Originally Posted by
Shaikorth
That's true for other things like ADMIXTURE clustering as well, though: the components very much depend on the number and type of samples in the run. A Sintashta-Caucasian-Dai mix based on the population averages of Eurogenes K13 or even the low-K ancient runs might not look like a Pashtun average, but that doesn't take into account the overlap that exists within ADMIXTURE clusters. The qpAdm fit is based on f4 ratios and I'd trust it over any ADMIXTURE or PCA results, but if either correlates with it, that's good. Indeed, an Irish-Chinese mix would not have the same best qpAdm fit as Uygurs or Hazaras.
The reason I think it would be hard to improve on those qpAdm fits is that I trust the Haak et al. method, and qpAdm replicates it well. The S-C Asian fits are very good, much better than what was achieved for them with the ancient genomes in the Haak et al. dataset, and they compare favourably to European fits:
Belarusian qpAdm fit (good for a European population)
Chisq 0.641 tail prob 0.887085 Yamnaya 0.474 EN 0.358 WHG 0.143 E-Asian 0.025
Best Haak et al. Belarusian fit: Yamnaya 0.447 EN 0.388 WHG 0.135 Nganasan 0.03
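As a rough sanity check on the numbers above, here is a minimal Python sketch that recomputes the tail probability from the chi-square value and compares the two sets of Belarusian proportions. It assumes 3 degrees of freedom, which is not stated in the post but appears to reproduce the quoted tail probability, and it treats the "E-Asian" and "Nganasan" components as comparable under a single "East Asian" label. The function and variable names are made up for illustration and are not part of qpAdm.

```python
import math

def chi2_tail_prob_df3(chisq: float) -> float:
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom.

    Closed form for df = 3 (df = 3 is an assumption, not stated in the post).
    """
    return (math.erfc(math.sqrt(chisq / 2.0))
            + math.sqrt(2.0 * chisq / math.pi) * math.exp(-chisq / 2.0))

# Proportions copied from the two Belarusian fits quoted above.
qpadm_fit = {"Yamnaya": 0.474, "EN": 0.358, "WHG": 0.143, "East Asian": 0.025}
haak_fit = {"Yamnaya": 0.447, "EN": 0.388, "WHG": 0.135, "East Asian": 0.030}

tail = chi2_tail_prob_df3(0.641)

# Per-component difference (qpAdm minus Haak et al.), rounded to 3 decimals.
diffs = {name: round(qpadm_fit[name] - haak_fit[name], 3) for name in qpadm_fit}

print(f"tail prob (df=3): {tail:.4f}")
for name, delta in diffs.items():
    print(f"{name:>10}: {delta:+.3f}")
```

A high tail probability means the model is not rejected by the data, which is why this fit counts as good, and the per-component differences show how closely the qpAdm run tracks the Haak et al. estimate.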
It is of course true that the fit just reflects ancient ancestry and does not exclude the possibility that some of the Pashtun Sintashta-like ancestry might have arrived before or after any Sintashta admixture, and the same goes for Europe. But for the base components I really see no reason to doubt the results.