    For qpAdm to work properly, we may need to use only ancient DNA samples as input. Dai, for example, may not be good enough. People have worked hard to calculate ancestry compositions in South Asia, but I think we need to remember one famous abbreviation in computer science: GIGO (garbage in, garbage out).


    Here I have combined information obtained from qpAdm and Davidski's latest K8 PCA to model Pashtuns using two assumed input populations from qpAdm, Sintashta and BA Armenians (simulated by Caucasus until Davidski is able to plot BA Armenians on the PCA), together with the output population, Pashtun.

    Pashtun is shown as the centroid of the triangle, which is defined as the arithmetic mean (center of mass) of the three input populations represented by the vertices of the triangle.

    With the known quantities (the two input populations and the output population), the position of the third, mystery population, Pop X, is easily calculated using the medians and trigonometry. Pop X is then narrowed down further by projecting additional populations onto the PCA.

    The model shown above works out as:

    35% Sintashta + 45% Caucasus + 20% Pop X to be determined by David based on its position on the PCA
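
    A minimal sketch of the geometry, assuming the target population sits at the weighted mean of its sources in PC1/PC2 space (with equal weights of 1/3 each this reduces to the centroid relation). The function name and all coordinates are hypothetical placeholders, not values read off Davidski's PCA; only the 35/45/20 weights come from the post.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def solve_pop_x(target: Point, sources: Sequence[Point],
                weights: Sequence[float], w_x: float) -> Point:
    """Solve target = sum(w_i * source_i) + w_x * X for the unknown point X."""
    if len(sources) != len(weights):
        raise ValueError("need exactly one weight per known source")
    if w_x <= 0 or abs(sum(weights) + w_x - 1.0) > 1e-9:
        raise ValueError("w_x must be positive and all weights must sum to 1")
    # Contribution of the known sources on each PCA axis.
    known = [sum(w * s[d] for w, s in zip(weights, sources)) for d in range(2)]
    return ((target[0] - known[0]) / w_x, (target[1] - known[1]) / w_x)


# Hypothetical PC1/PC2 coordinates, for illustration only.
sintashta = (0.060, 0.020)
caucasus = (0.010, -0.030)
pashtun = (0.005, -0.010)

pop_x = solve_pop_x(pashtun, [sintashta, caucasus], [0.35, 0.45], 0.20)
print(pop_x)
```

    Because Pop X carries the smallest weight (20%), small shifts in the plotted position of the target or the sources move the solved position of Pop X by five times as much, which is one reason to treat its location as approximate.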


    Edit: This analysis assumes that PC1/PC2 is able to capture the vast majority of the variation in the 8 admixture components.
    Originally Posted by DMXX:
    Going through each of these points individually:

    See later for a detailed response regarding why the Bronze Age Armenians constitute around 70-80% of Iranian genetic ancestry.

    Once more, not taking into consideration the possibility(/probability?) that ASI could be a hybrid component and not simply "South Asian-specific ENA".

    Do you have any actual evidence that would, beyond any reasonable doubt, push observers to conclude ASI is merely "South Asian-specific ENA" and is not also a hybrid construct the same way ANI is?

    At least one other regular peruser of autosomal genetic profiles in the genetic genealogy community acknowledges this could well be the case.

    Do you have any empirically based insights that dismiss this idea?

    Uniparental markers by themselves are not informative given their liability towards founder effects. This is very relevant in the case of Pashtuns and not a recognised issue with Iranians, as the former still abide by a tribal structure.

    Thus, asserting anything resembling a one-to-one correlation (with this frequent assertive language, highlighted) between modern Pashtuns and an (undeniably) ancestral population via uniparental markers and statistical fits runs against conventional wisdom in population genetics.

    Actually, it's the first I've seen such an audacious comparison in years.

    Given nobody so far has invoked "spatio-temporal nodes" or contrasting models, this is irrelevant. I see no overt "befuddlement", or attempt to marry these findings with prior models in this specific discussion. The three-way model applicable to Europeans is clearly not applicable to Asian populations (although we can safely categorise the steppe groups as very European-like). This is a false equivalence and a distraction from the point below.

    Instead, numerous observers (both on this forum and elsewhere) have noticed the rather marked differences in ADMIXTURE component scores between the Eurasian steppe groups and contemporary populations in South-Central Asia. Shaikorth's explanation regarding hidden component scores will surely account for some of the observed disparity, but whether it accounts for all of it is a quandary which literal interpreters of these qpAdm statistics will have to satisfactorily explain.

    The onus is not on those with a non-literal interpretation (i.e. myself; my own view, in case people missed it, is a significant contribution to Afghans, less so for Iranians and Kurds).

    Repetition of prior point(s) (argumentum ad nauseam).

    Nobody is contesting the chisq values point towards an excellent statistical fit.

    An excellent statistical fit, however, does not (and never will or should) result in wholesale reassessment of ancestral origins in the manner that has been undertaken recently.

    qpAdm does look like a superior tool to these. However, qpAdm alone is not a perfect program. I have previously stated some of the overt issues with this tool (e.g. output completely reliant on input populations).

    In case some appeal to authority is required to reduce the dependency on re-imagining the ethnogenesis of various Eurasian populations through this tool, here are some comments from the qpAdm authors regarding its applicability (from their readme file):

    Directly fitting the context of this discussion into these two highlighted points:
    1) The genetic background of later populations residing in South-Central Asia is absolutely essential if we are to arrive at any proportions that approach reality. Hypothetical speculations aside, it's abundantly clear that Pashtuns cannot be a three-way mix between Sintashta, Georgians and Dai. The strength of a statistical fit simply does not override the archaeological, linguistic or (pre)historical narrative.

    2) Conversely, there is no satisfactory reason (only special pleading) to assume that Pashtuns have remained genetically static since the hypothetical admixture juncture indicated by qpAdm. Once more, the strength of a statistical fit simply does not override either the archaeological, linguistic or (pre)historical narrative.

    Please read above.

    Neolithic remains alone will not yield a definitive answer to the accuracy of these results. Data from both Mesolithic and Iron Age South-Central Asia is also required.

    Several comments above, you raised the red herring argument of (unspecified) people being "befuddled" with "old models".

    Here, you appear to favourably cite a prior correspondence with everest59, despite his own current objection to the literal interpretation of this "Pashtuns are predominantly Sintashta" notion. Reconsider using this as anything resembling evidence.

    Given everest59's recent comments disagree with your apparent position, I will respectfully ignore any acknowledgement of authority that was supposed to be generated by your invocation of his ALDER experimentation.

    I am grateful to finally see at least one comment here directly addressing the posts of mine you quoted, rather than verbatim repetition or irrelevant distractions from the points previously raised.

    The Bronze Age Armenians actually appeared to resemble modern Iranians quite well in several of the ADMIXTURE runs, and modern Iranians tend to score the lowest GD with them. This does not contradict my earlier point that they should not be considered ancestral to Iranians (there is no substantial archaeological imperative to assume this outside of the Kura-Araxes culture, which was only apparent in the northwestern portion of the country). Thus, their apparent similarity to Iranians is an artefact of their status as West Asians with an appreciable pull towards a north/northeastern trajectory. This is precisely how Iranians and Kurds may be characterised relative to modern Armenians.

    Indeed, given the direction of this extra north-shifting ancestry is most likely from the steppelands in both Bronze Age Armenians and Iranians/Kurds, my prior point regarding "masking" of the steppe derived component in Iranians through qpAdm remains fully relevant.

    They are, for all intents and purposes, a Bronze Age equivalent of what eventually transpired in the Iranian plateau (as supported by archaeology, linguistics, uniparental markers etc.). No surprise they represented the bulk of Iran's ancestral make-up in these runs, despite not being ancestral to Iranians.

    Agreed. Which is precisely why modelling Pashtuns as a combination of Sintashta (ancient), Georgian (modern) and Dai (modern) is a statistical "wonder" without a definitive context.

    Population genetics is not simply a mathematical problem; a historical, linguistic and archaeological context on the backdrop of prior data must be established and/or reconciled.

    I note a complete absence of any mention of archaeological data in your rebuttals. This is unusual, given that Gimbutas' models for the Eurasian steppe hypothesis (which the Haak and Allentoft papers have nicely confirmed in my opinion) were based on these, as is the very foundation of research on Indo-Iranian prehistory.

    Archaeologists have near-consistently recognised that a clear trail from Sintashta or Andronovo through to the Afghan heartland does not exist.

    If we were to model Pashtuns as mostly (>60%) Sintashta and assume the urheimat of the historical Pashtuns is the area encompassing southern Afghanistan, one would naturally expect evidence of their pastoralist expansions reaching that far southwards. Merely marvelling at the statistical strength of a model does not answer the clear objections to said model from the disciplines adjunct to population genetics. In that respect, your perspective is fully analogous with Bouckaert et al.'s attempts at modelling the Indo-European languages with statistical software.

    Two questions based on this proposition:
    a) Assuming the above is your currently preferred stance, could you provide archaeological evidence of a mostly Andronovo-derived culture bypassing the BMAC en route towards southern Afghanistan?
    b) If the above is not your currently preferred stance, could you provide archaeological evidence accounting for why near-complete culturo-archaeological synthesis took place between the Andronovo tribes and BMAC, yet the Pashtun people are allegedly mostly Sintashta-derived?

    Please note this is not a strawman argument given a) Andronovo and Sintashta are commonly recognised as proto-Iranian and proto-Indo-Iranian respectively, and b) Allentoft has provided us with their genetic data.

    With all due respect, technical information regarding ADMIXTURE is, once more, irrelevant here.

    The issue, once more, is the assumption that the qpAdm statistics should be interpreted literally.

    I am familiar with qpAdm as I regularly read Eurogenes. The technical explanation will be helpful for others, no doubt.

    Your reasoning is appreciated.

    However, it is an inconsistent debating style to give a brief description of how things should be for your currently favoured scenario, whereas Iranians are re-modelled using qpAdm.

    In order to support your perspective, and in combination with the above requests, would you kindly obtain a run for the HGDP Pashtuns (all, not ten) in qpAdm using Sintashta, the Burusho and Georgians?

    For at least the third time, unnecessary repetition of an earlier point (argumentum ad nauseam). I already addressed why the fits appear paltry for Iranians.

    Finally, regarding IBD segment scoring: a user called Srkz had produced maps revealing that IBD segment sharing between various Andronovo and Sintashta samples and West and South-Central Asians was largely equivalent, with the exception of Pamiri Tajiks, who exhibited a greater number. These have since been taken down. I can verify Arbogan's claim.

    I would appreciate more direct rebuttals in future to minimise the degree of (no doubt unintentional) obfuscation shown above.
    With all due respect, I don't know how you define "direct rebuttals" (and I surely don't see any of those on your part, just rather vague and "mushy" responses that don't address my main argument). Also, unfortunately, the only obfuscation that I'm seeing here is solely from your direction, which is quite disheartening. But I'm not sure if it can really be called "unintentional", in this case at least.

    Be that as it may, I've already addressed much of what you've written (and much of what you've written is just a repeat of what you wrote a few days ago, concerning these models, "argumentum ad nauseam"). But you can find some notes below. Please, do forgive the atrocious theoretical repetition that I am about to unleash (I could teach Deleuze a thing or two):

    1) One has no need to use the "ASI" concept if it doesn't mean "South Asian-specific ENA", since that is what Reich et al. were trying to communicate via "Ancestral South Indian". If you have your own idiosyncratic understanding (one which only you are privy to), you need to come up with a new personal term of art.

    2) You have not addressed why these fits appear paltry for Iranians, from any technical angle.

    3) You don't seem to understand that even if Pashtuns aren't 70%-60% Sintashta admixed in a "literal" sense, these fits basically show that 70%-60% of their genetic ancestry comes from populations that are very similar to Sintashta. That can no longer be debated, unless you want to be as flat-out disingenuous as possible.

    I can't see where that 70%-60% ancestry (that is basically identical to Sintashta) could come from, besides the steppe. Do you have any ideas that you would like to share with us?

    4) I cited Everest's previous work to show that this finding isn't unprecedented, and that it isn't limited to qpAdm.

    5) I find it rather interesting that you frown upon the drawing of a one-to-one correlation between uniparental data and the genome-wide results. Mainly, because you did just that in your previous response concerning the genetic evidence, in relation to the Iranian plateau.

    6) Finally, as one would expect, Burusho + Sintashta + Dai does not work. But your new request is quite doable.

    HGDP Pashtuns (all samples):

    59.2% Burusho + 22.8% Sintashta + 18% Georgian


    tail probability=0.875585

    Please, pay close attention to the chisq and tail probability. Compare them to what we see with this model.

    HGDP Pashtuns (all samples):

    63.4% Sintashta + 23.1% Georgian + 13.4% Dai


    tail probability=0.975924

    If you haven't noticed, both the chisq and tail probability are "perfect" on the second model (the one which seems to annoy you). By contrast, there is room for improvement on the model you've requested. And even then, the model you requested still has a good amount of Sintashta input.
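
    For readers unfamiliar with the statistic: a tail probability of this kind is the upper-tail area of a chi-square distribution at the reported chisq for the model's degrees of freedom. Below is a minimal standard-library sketch of that calculation. The function name is a placeholder, and the chisq and degrees-of-freedom values in the usage lines are purely illustrative assumptions, since neither is given in this post.

```python
import math


def chi2_tail(chisq: float, df: int) -> float:
    """Upper-tail probability P(X >= chisq) for a chi-square variable with df degrees of freedom."""
    if df < 1 or chisq < 0:
        raise ValueError("df must be >= 1 and chisq must be >= 0")
    z = chisq / 2.0
    # Closed-form starting values: df = 1 (odd case) or df = 2 (even case).
    if df % 2:
        a, q = 0.5, math.erfc(math.sqrt(z))
    else:
        a, q = 1.0, math.exp(-z)
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1), applied until a = df / 2.
    for _ in range((df - 1) // 2):
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


# Illustrative inputs only (not taken from the runs above).
for chisq in (2.0, 6.0, 12.0):
    print(chisq, round(chi2_tail(chisq, df=8), 4))
```

    A small chisq relative to the degrees of freedom gives a tail probability near 1, and a large chisq pushes it towards 0, so the two numbers should always be read together.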

    Just a side note, but Pashto is a NE Iranian language and is quite close to the Pamiri languages (if this is something you would like to contest, we can start citing the books and papers. Construing Pashto as "SE Iranian" is rare, and a close relationship to Pamiri languages is universally recognized). Scythian input is expected for Pashtuns, just based on that linguistic fact, and it wouldn't be surprising if Sintashta covers for some substantial Scythian ancestry.

    Again (and again, forgive the "argumentum ad nauseam"), even if Pashtuns aren't "literally" 60%-70% Sintashta, it is undeniably certain that 60%-70% of their ancestry is from populations exceedingly similar (in fact, basically identical) to Sintashta, and the only place for such populations is the steppe, looking at the data that we currently have.

    I'm pretty confident that South Asian aDNA will back me up, and when that occurs, I'll certainly forgive you for this whole conversation.

    Edit: Just for the record, only a few months ago (as most can recall), I repeatedly asserted that IE expansions had little genetic impact on South Asia. One has to be open-minded, and go where the science takes us.
    The only difference here is I have used an input ancestral population with similar admixture as modern Armenians. This change from above results in repositioning of Pop X.

    Under this scenario, Pashtuns can be modeled as 43% Sintashta + 38% BA population similar in admix to modern Armenians + 19% Pop X, with Pop X's position being shifted from above.


    Sorry if I'm being ignorant here, but this Sintashta admixture in Pashtuns doesn't make any sense to me. Sintashta are 45% WHG, whereas Pashtuns are only 1-2%. So if Pashtuns were 40% Sintashta, that would make them around 20% WHG...
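
    The arithmetic behind this objection can be sketched as follows, assuming (as the objection does) that a component is passed on in proportion to the mixing fraction and that no other source contributes any WHG. The function name is a placeholder; the 45% and 40% figures are the ones quoted in the post.

```python
def expected_component_share(mix_fraction: float, component_in_source: float) -> float:
    """Share of a component expected in the mixed population from one source alone."""
    return mix_fraction * component_in_source


# 40% Sintashta ancestry, with Sintashta itself scoring 45% WHG.
share = expected_component_share(0.40, 0.45)
print(f"{share:.0%}")  # 18%
```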

    The Sintashta-like ancestry in Pashtuns is covered under components other than WHG. ADMIXTURE clusters are not absolutes, but dependent on the number and type of samples used in the initial run. They don't need to correlate with qpAdm results or Haak et al. fit results for them to work, and should not be expected to, considering all the different ADMIXTURE runs around.

    An example of what I mean: Han can be represented as unmixed or significantly mixed in ADMIXTURE if an East Asian component isn't allowed to form. In the recent Eurogenes ancient genome runs they are a mix of East Asian and ASE, in Verenich's newest calculator they are a mix of East Siberian and Oceanian and in the analysis below they are a mix of Native American, SSA and West Eurasian.

    A few points on Pakhtoon, Sintashta, Dai etc.

    1. Modern Pakhtoons as an input to Sintashta has to fail, as Pakhtoons have ASI, which is nearly South Asia-specific. Non-HGDP Pakhtoons as an input may look better as they have less ASI, but then they have higher SW Asian, which would create the same problem as ASI. Folk should be careful with ASI: it is a very old (40,000+ year) separation from ASI ancestors, which included the Dai, Ongee, ANE etc., and ASI is not extant in an unadmixed form (the Dai is representing ASI here much as Reich uses the Santhal).

    2. The R1a1 Y chromosome overlap of the Pakhtoons and Sintashta folk are in line (along with their xM xU2i mtDNA).

    3. Moorjani et al. noted that the Pathan show a single pulse admixture dated to 2,117ybp (73+-9 gens) between ANI and ASI at 71% of the former. Again this is very consistent with the Sintashta Dai type admixture.

    4. I think WHG may be a red herring. While Sintashta may indeed turn out to have a North Central Euro WHG-type component, I have not yet seen good evidence as to how Sintashta's WHG differs from shared ancestry among all western hunter-gatherers.
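
    As a quick check on the date in point 3, the two quoted figures imply a generation time of 29 years (2,117 / 73 = 29). The sketch below converts the generation estimate and its error into years on that assumption; the 29-year figure is inferred from the numbers in the post rather than quoted from the paper, and the function name is a placeholder.

```python
YEARS_PER_GENERATION = 29  # assumption implied by 2,117 ybp / 73 generations


def generations_to_years(generations: float,
                         years_per_generation: float = YEARS_PER_GENERATION) -> float:
    """Convert an admixture date in generations to years."""
    return generations * years_per_generation


estimate = generations_to_years(73)
error = generations_to_years(9)
print(f"{estimate:.0f} +- {error:.0f} years before present")
print(f"range: {estimate - error:.0f} to {estimate + error:.0f} ybp")
```

    On these assumptions the single pulse falls roughly between 1,856 and 2,378 years before present.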

    I would add a couple more points in general.

    I have this theory of native component recovery after admixture. Some native markers are selected for and they bring along with them the recovery of their associated component. Recovery of WHG after admixture would therefore be expected in areas where WHG was native and similarly EEF would recover where EEF was native. The converse is also true - Euro WHG type markers would not be selected for in South Asia, and the component would reduce over time in South Asia.

    We should also not discount the possibility that the Sintashta type was not very different from the population living in northern South Asia in that time frame. Y-R2, other types of Y-R1, mtDNA U etc. were all likely present in northern South Asia - the Jambudvipa of the old annals.

    I would like to thank Sein for all his informative posts, which I have enjoyed reading. I also admire his courtesy, professionalism, helpfulness, and consideration of others' viewpoints, in very Pashtun-like fashion. I think he is a great asset here.

    Although I concede that qpAdm is based on more modern & robust algorithms than the slightly older algorithms used in admixture calculators, the output in qpAdm is still dependent on the inputs, and admixture calculation is still a necessary part of the analysis and comparison of ancient genomes to modern ones.

    Sein wrora, the issue many members (me included) are having with the >60% Sintashta-based Pashtun model is that it is highly dependent on Dai being an input population. As soon as you replace Dai with an Indian population geographically closer to Pashtuns as the source of the ASI/S Indian admixture present in Pashtuns, the >60% Sintashta input becomes invalid. There is really no evidence to suggest that Pashtuns obtained their ASI/S Indian admixture from Dai or any Dai-like population. In fact, it is much more likely that they did not, in light of the geographic separation.

    In fact, I am not even sure we can rule out Afanasievo as a source of steppe ancestry in Pashtuns either.

    Here is a suggestion: fix the first input population in qpAdm as a steppe population, first Sintashta and then Afanasievo. Fix the second input population as BA Armenian, or something else from W Asia, and use various Indian populations as the third input (I realize there are many), then see what type of outputs you get.
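
    The grid of runs suggested above can be enumerated as in the minimal sketch below. It only lists the candidate three-source models and does not invoke qpAdm itself; the Indian population labels are hypothetical placeholders, since no specific populations are named here.

```python
from itertools import product

steppe_sources = ["Sintashta", "Afanasievo"]
west_asian_sources = ["BA_Armenian"]
# Placeholder labels: substitute whichever Indian populations are available.
indian_sources = ["Indian_pop_A", "Indian_pop_B", "Indian_pop_C"]

# One model per combination of (steppe, West Asian, Indian) sources.
models = [list(combo)
          for combo in product(steppe_sources, west_asian_sources, indian_sources)]

for sources in models:
    print("Pashtun ~ " + " + ".join(sources))
```

    Comparing the fits across the whole grid, rather than reporting only the best one, shows how much the steppe proportion depends on the choice of the third source.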

    Also, if David posts a K8 PCA with Indian populations, that may narrow the search a little, because I believe that qpAdm results should be consistent with PCA based 3 population modeling.

    Finally, with Pashtun areas being at the crossroads of population migratory paths in Asia, a 4-population model cannot be ruled out either.

    I thought the Moorjani paper was deemed defunct in light of the information which came out about ANE afterwards.

    People can speculate all they want but one needs the genomes from Bronze Age SC Asia/ NW South Asia to come up with something concrete.

