Uyghur Y-DNA analysis (STR predicted) (Shan et al. 2014)



DMXX
02-08-2017, 08:05 PM
I specifically wanted to investigate Uyghur R1b, but went overboard with the numerical analysis (maths can be fun sometimes). Haven't shared the haplotypes publicly as I'm uncertain whether they're publicly viewable.

Paper unfortunately didn't test for SNPs. Ran all the Uyghur haplotypes (STR=17, n=197) through Urasin's YPredictor. Subclade prediction probability threshold set to >=80%. Resulting set was n=156.



C3-M217 3.85% 6/156
D-M174 1.92% 3/156
E1b-M35 1.28% 2/156
F-M89 0.64% 1/156
G-M201 1.28% 2/156
G2-P287 1.92% 3/156
G2a-P15 4.49% 7/156
H-M69 2.56% 4/156
I2-M223 0.64% 1/156
J1-M267 0.64% 1/156
J1c-L136 0.64% 1/156
J2a-... 13.46% 21/156
J2b-M12 1.28% 2/156
L-M11 4.49% 7/156
L1-M27 0.64% 1/156
NO-M214 1.28% 2/156
N-M231 0.64% 1/156
O-M175 0.64% 1/156
O1-MSY2.2 0.64% 1/156
O2-P31 0.64% 1/156
O3-M122 6.41% 10/156
P-M45 2.56% 4/156
Q-M242 3.21% 5/156
Q1a-MEH2 0.64% 1/156
R1a-M198 30.77% 48/156
R1b-M343 0.64% 1/156
R1b-P25(xP297) 2.56% 4/156
R1b-M269 3.21% 5/156
R1b-M73 3.21% 5/156
R2a-M124 5.13% 8/156
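
For anyone wanting to reproduce the tabulation above, a minimal Python sketch of the filtering step as described: keep predictions at or above the 80% threshold, then express each subclade as a share of the retained set. The function name and the (subclade, probability) input format are assumptions for illustration, not YPredictor's actual output format, and the four example predictions are made up.

```python
from collections import Counter

def tabulate_predictions(predictions, threshold=0.80):
    """predictions: list of (subclade, probability) pairs, one per haplotype.

    Hypothetical input format; returns ({subclade: (count, percent)}, retained_n).
    """
    kept = [sub for sub, prob in predictions if prob >= threshold]
    total = len(kept)
    counts = Counter(kept)
    table = {sub: (n, round(100 * n / total, 2)) for sub, n in counts.items()}
    return table, total

# Made-up predictions, only to show the mechanics (not data from the paper).
example = [("R1a-M198", 0.97), ("R1b-M73", 0.85), ("J2a", 0.62), ("R1a-M198", 0.80)]
table, total = tabulate_predictions(example)
print(total, table)

# The percentages in the list above are simply count / 156:
print(round(100 * 48 / 156, 2))  # 30.77 (R1a-M198)
```

Note that the J2a entry is dropped because 0.62 falls below the threshold, which is how 197 haplotypes shrink to a smaller retained set.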


Comments for each haplogroup:

C-M217 is clearly due to a founder effect, eyeballing the STRs. Looks like a star cluster, very young. If I were to guess, tied to the original Uyghurs who entered the Tarim in the 9th century AD.
Ditto comments above with D-M174.
E, G fairly diverse.
H haplotypes differ by one mutation each way, not as tight as C or D. Could be looking at ancient MRCA.
I2-M223 appears here. Please see discussion in Y-DNA I section, this isn't unexpected.
J2's an eclectic mix of diversity with handfuls of shared haplotypes. Complex history with this one.
L and O mirror J in terms of paradoxical large internal diversity. Though some markers deviate massively from the norm. Not sure how typical this is for their subclades.
The "P"'s, if real, come from what looks like an ancient founder. There's either 0-step or 1-step mutations across most of the STR sites.
Q-M242 looks more diverse than P, but I'm seeing two clusters forming within them here.
R1a1a makes up the biggest chunk of Uyghur lines. Most haplotype values differ by 0-1 steps from one another. I do not see any clear substructure within them. This data's a prime instance of where Y-SNP testing is quite useful in distinguishing groups.
The R1b subclades superficially look about as homogeneous as R1a. I'll be performing variance calculations per predicted subclade at the end to give everyone an empirical measurement (rather than my subjective eyeballing).
R2a is surprisingly diverse; only explanation is multiple migration events into the Tarim from the west and south.


Per-marker variance for each of the R subclades below (17 STRs per row; the final column is the mean across all 17):




R1a1a-M198 0.131884058 0.873913043 0.662801932 0.517874396 0.328985507 0.398067633 0.693719807 1.011111111 0.146859903 0.066183575 0.155072464 0.388888889 0.427536232 0.042512077 0.199516908 0.41352657 0.32705314 0.399147485
R1b-P25 0.25 0.25 0.25 1.583333333 1 0.333333333 2.916666667 0.333333333 0 0 0.916666667 0.25 0 0 0 0.25 1.583333333 0.583333333
R1b-M269 0.25 0 0.25 0.333333333 0 0 0.25 0.916666667 0.25 0.25 0 0.666666667 0 0.916666667 1.583333333 0.25 0.333333333 0.367647059
R1b-M73 0 0.2 0.2 0 0.2 0.2 0 0.7 0 0 0 0 0 0 0 0.3 0 0.105882353
R2a-M124 1.142857143 2.410714286 0.125 0.410714286 0.285714286 0.319285714 0.696428571 0.857142857 0.571428571 0.285714286 0.125 0.267857143 0.214285714 0.839285714 0.125 0.410714286 6.125 0.894831933



The degree of variance, from most to least diverse, is R2a-M124 > R1b-P25 > R1a1a-M198 > R1b-M269 > R1b-M73.
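
A minimal Python sketch of the calculation behind that ranking: take the variance of repeat counts at each marker, then average across markers. It assumes the sample (n-1) variance, which is consistent with the values in the R1b-M73 row for five haplotypes but is not confirmed anywhere in the thread. The function name and the three-marker toy haplotypes are hypothetical; the M73 row is copied from the table above.

```python
from statistics import mean, variance

def mean_str_variance(haplotypes):
    """haplotypes: list of equal-length lists of STR repeat counts.

    Returns (per-marker sample variances, their mean across markers).
    """
    per_marker = [variance(column) for column in zip(*haplotypes)]
    return per_marker, mean(per_marker)

# Hypothetical five-haplotype, three-marker toy set (not data from the paper).
toy = [
    [14, 19, 11],
    [14, 19, 11],
    [14, 19, 12],
    [14, 19, 12],
    [15, 19, 13],
]
per_marker, average = mean_str_variance(toy)
print(per_marker, average)

# Per-marker values from the R1b-M73 row; their mean reproduces the last column.
m73_row = [0, 0.2, 0.2, 0, 0.2, 0.2, 0, 0.7, 0, 0, 0, 0, 0, 0, 0, 0.3, 0]
print(round(mean(m73_row), 9))  # 0.105882353
```

With only four to eight haplotypes in most of these subclades, a single outlying marker (such as the 6.125 in the R2a-M124 row) can dominate the mean, so the ranking is best read as a rough guide.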

I caution against over-interpreting the above pattern. Far more data to even modestly frame the temporospatial introduction of each subclade into the Tarim basin is required, alongside aDNA (Xiaohe's purported R1a-Z93- status doesn't assist much here).

However, with that in mind, it looks quite clear that R1b-M73's presence in the Tarim is currently very drifted (EMBA introduction that experienced a bottleneck in the Tarim? Or Medieval Turkish introduction with an ultimately EMBA origin?), whereas R2a-M124 seems to have continuously leaked into the region with time. R1a1a looks to have done the same, albeit within a narrower timeframe, and with a larger volume (continuous waves of LNBA steppe movements beginning with Andronovo and ending in Medieval times? That is my proposal). The other R1b subclades are much harder to frame into any sort of scenario.

Comments welcome. Special thanks to jdean for helping me with the data processing!

Joe B
02-08-2017, 08:23 PM
You're looking at the most difficult area of the R1b tree to sort out the subclades by STR. Wondering if any of your haplotypes show any similarities to the STRs in the R1b Basal Subclades project. The first page of the Y-DNA Colorized Chart may have some modals to work with. https://www.familytreedna.com/groups/r-1b-basal-subclades/dna-results

DMXX
02-08-2017, 08:33 PM
You're looking at the most difficult area of the R1b tree to sort out the subclades by STR.

Correct. Making matters worse is the absence of SNP backbone testing in practically all the extended Y-STR papers involving the Uyghurs (all I've seen so far are from China).



Wondering if any of your haplotypes show any similarities to the STRs in the R1b Basal Subclades project. The first page of the Y-DNA Colorized Chart may have some modals to work with. https://www.familytreedna.com/groups/r-1b-basal-subclades/dna-results

Very useful, thank you. I'll check those out soon if I have time and will report back.

parasar
02-08-2017, 08:48 PM
... R1a1a makes up the biggest chunk of Uyghur lines. Most haplotype values differ by 0-1 steps from one another. I do not see any clear substructure within them. This data's a prime instance of where Y-SNP testing is quite useful in distinguishing groups ...

Comments welcome. Special thanks to jdean for helping me with the data processing!


We know that some of the Uyghur R1a is of the L657 type.
L657,Y9 and L657,Y9-,Y8-

http://eng.molgen.org/download/file.php?id=548&sid=823490d1ef4addff6eb754144ee76699&mode=view

DMXX
02-11-2017, 03:59 PM
Dug a bit further into the predicted R1b-M73.

Compared the haplotypes with some SNP confirmed R1b-M73 from Altaian Kazakhs (Dulik et al.). They're all a part of the same cluster (a 17/17 match between two of them).
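
A minimal sketch of what a comparison like that "17/17 match" involves: count the markers at which two haplotypes carry identical repeat counts, plus the total stepwise difference. The function name and the repeat values are hypothetical (three markers only, for brevity) and are not taken from Shan et al. or Dulik et al.

```python
def compare_haplotypes(a, b):
    """a, b: dicts mapping marker name -> repeat count, with identical marker sets.

    Returns (matching markers, markers compared, total stepwise difference).
    """
    if a.keys() != b.keys():
        raise ValueError("haplotypes must be typed at the same markers")
    matches = sum(a[m] == b[m] for m in a)
    steps = sum(abs(a[m] - b[m]) for m in a)
    return matches, len(a), steps

# Made-up repeat counts, for illustration only.
uyghur = {"DYS19": 14, "DYS390": 19, "DYS391": 10}
kazakh = {"DYS19": 14, "DYS390": 19, "DYS391": 11}

matches, compared, steps = compare_haplotypes(uyghur, kazakh)
print(f"{matches}/{compared} match, {steps} step(s) apart")
```

This simple sum treats every difference as stepwise, which is only a rough approximation for multi-copy markers such as DYS385.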

Looks like the predictor was correct (the >80% probability threshold did its job here) and that very tentatively supports the second proposal:



Medieval Turkish introduction with an ultimately EMBA origin

Afshar
03-05-2017, 10:54 AM
Do you know how this compares with the distribution in Yugurs (Yellow Uyghurs)?

DMXX
03-05-2017, 02:50 PM
Do you know how this compares with the distribution in Yugurs (Yellow Uyghurs)?

I don't have their data, unfortunately. If you know of a study that sampled them, I'd appreciate receiving the title.

The Afanasievo R1b confirmation brought to us by Agamemnon and Kristiina at Eurogenes puts things in some context here: we now have an EMBA source for Uyghur R1b. The big question is whether R1b-M73 specifically was present in Afanasievo. Of the three samples we have, two are R1b-M269, one is R1b-P297. The last sample could be M73+.

The new data helps put the variance calculations in perspective. Increased variance broadly correlates with an older MRCA, so the very low variance of Uyghur R1b-M73 points to a comparatively recent one. If the correlation holds in this instance and is correct, then it would suggest that modern Uyghurs derive their Tocharian ancestry, quite ironically, from the Medieval Uyghur side of their heritage, rather than the "indigenous" Tarim. At least from the paternal side*. One wonders if the opposite pattern holds up on the maternal.

One rather curious outcome of these recent revelations is that Oghuz Turkic speakers (Turks, Azeris, Turkmen) can now entertain the notion that some of the R1b-M269 in Turkey, Iran, Azerbaijan and Turkmenistan is Turkic in origin. The recent Scythian paper found their eastern Scythian samples were effectively Yamnaya/Afanasievo+Han combinations, and Central Asian Turks harboured most of their direct ancestry.

* Uyghurs being ~30% R1a1a yet only ~6% R1b, Bronze Age Xiaohe being fully R1a-Z93-, and R1a's numerical dominance over R1b across all of Central Asia aside from specific niches in E-C Asia suggest that the Tarim, before the Medieval Uyghur conquest, was R1a1a dominated to begin with. The archaeological data indicates Andronovo effectively replaced Afanasievo anyway.

Silesian
03-05-2017, 03:25 PM
Points of interest from various papers and YFull with regard to M73 and M269:
https://www.yfull.com/tree/R-CTS8966*/ R1b found in Han Chinese

How Genghis Khan and the Borjigin clan connect with the R1b burials from Tavan Tolgoi, and whether they are connected with Uyghur R1b
[the Y-STR profiles of the 3 individuals (Hui, Kalmyk, and Uzbek).]
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0161622

Difference in R1b clades between Turkmen-Lurs and Zoroastrians (M269)


Turkmen came from the Altai Mountains in the 7th century AC, through the Siberian steppes. They now live in Golestan and are different from the other ethnic groups in appearance, language and culture.

Zoroastrians are the oldest religious community in Iran; in fact the first followers have been the proto-Indo-Iranians. With the Islamic invasions they were persecuted and now exist as a minority in Iran.
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0041252

Afshar
03-06-2017, 10:11 AM
I don't have their data, unfortunately. If you know of a study that sampled them, I'd appreciate receiving the title.

https://www.ncbi.nlm.nih.gov/pubmed/18428013 It's behind a paywall, unfortunately.
But I remember seeing higher percentages of Hg O in Yugurs. The Yugurs left the Tarim in about 840 after the collapse of the Uyghur Khanate, so if there was some medieval Turkish input, we should be able to see some differences in Y-Hgs between Uyghurs and Yugurs (although it's only a single paper).

Megalophias
03-06-2017, 04:27 PM
There is a sample of 32 Yugurs from Gansu in Shou et al (2010), "Y-chromosome distributions among populations in Northwest China identify significant contributions from Central Asian pastoralists and lesser influence of western Eurasians". There are 8 STR haplotypes for haplogroups R1 and J only.

The Yugur R1 was reported as R1-M173(xR1b-M343, R1a1-M17). But the haplotypes have DYS390=19, which I gather is usually R1b-M73.

Afshar
03-07-2017, 08:04 AM
"Eurasian Crossroads: A History of Xinjiang" states that, according to some scholars, the Uyghur tribes that settled in Turfan/Beshbaliq/Kucha were actually Toqquz Oghuz from the west who fled from the Kirghiz assaults. So there is a good possibility some of these hgs are actual Turkish medieval input.