How to see STR matches beyond what FTDNA allows



MitchellSince1893
09-26-2015, 04:40 PM
This may be old news to many of you, but I just found this: http://www.semargl.me/

Well, to be exact, I knew about the website but hadn't been on it in a while, and this feature was new to me.

1. Click the search tab.
2. Enter your FTDNA kit number (for your y-dna STR test)
3. Hit the "search" button
4. Click "Search for Genetic matches" link
5. Select the number of STRs you want to compare (111, 67, 37, 25, or 17)
6. Enter the maximum GD (genetic distance) to filter your results. For example, click 37 and enter 6. At 37 markers FTDNA doesn't show matches above a GD of 4, so this lets you see matches you can't see on FTDNA.
7. The results window will appear with your matches based on your GD filter setting. You will see the number of markers compared (e.g. 37, 67) and the GD using 2 different methods.
8. Click on the hyperlink for one of your matches to see additional info. You will see the hyperlink for the project they are in, which can be used on the Y-Utility website discussed below.

I'm not sure how comprehensive their database is, but I've found matches on here that I wasn't aware of...and I've been pretty thorough at manually going through various FTDNA projects using http://www.mymcgee.com/tools/yutility111.html to cut and paste to find matches.


If you aren't familiar with the mymcgee Y-Utility tool, all you do is find your STR values in a project you are in and paste them into the top row of the "Paste haplotype or setup data here:" box. Then go to another Y-DNA classic project page you want to compare your STRs to and paste its rows below yours. For example:

1. I'm in the U152 project (among others), so I find my kit# on the Y-DNA Classic Chart https://www.familytreedna.com/public/R1b-U152?iframe=yresults. It's a large project so I have to go to page 2 to find my kit#.
2. I place my cursor right before my kit#, hold down the left mouse button and pull the mouse down to the next line to highlight my line of STRs.
3. I copy by hitting <CTRL> C
4. Switch to Y-Utility: Y-DNA Comparison Utility, FTDNA 111 page at http://www.mymcgee.com/tools/yutility111.html
5. Place your cursor in the top row of the "Paste haplotype or setup data here:" box
6. Hit <CTRL> V to paste my kit# and STR values
7. Go to the project you are interested in and open its Classic Y-DNA results page (you can use the web address for a match you found at the semargl site to get to these matches).
8. You can either highlight the whole page or just a section. If you just want a section of a project, highlight it, copy it, and paste it below your entry on the mcgee tool. You can compare the whole page to your kit, but if I go over ~400 entries it takes a while to process. When I get to around 500 entries my computer will sometimes hang and not complete the process.
9. I will use the Anglo Saxon Project as an example: https://www.familytreedna.com/public/AngloSaxonydnaproject/default.aspx?section=yresults
10. Change the markers to 37 (not required, but for me anything less than this isn't very helpful, and it will speed up the process)
11. Press <CTRL> A to select the whole page
12. Press <CTRL> C to copy
13. Switch to the Y-Utility: Y-DNA Comparison Utility, FTDNA 111 page http://www.mymcgee.com/tools/yutility111.html
14. Click in the box below your STR line that you previously pasted
15. Press <CTRL> V to paste
16. Above the window (still on the mymcgee page) uncheck the "FTDNA order haplotype comparison" to speed up the process (not required)
17. Hit the "EXECUTE" button above the box. A new window will appear. With a large amount of data it will remain blank for a while until processing completes.
18. If you get a "Page is unresponsive" window, hit "WAIT". If you've pasted a lot of data in the box you may have to click "Wait" a few times. In this example it took about 3 mins and I hit "Wait" about 3 times.
19. Your new window will be populated. In the first grid, go down the column headed by your kit# to see your GD compared to the other kits.
20. In the grid below that you can see the estimated years to your shared MRCA. Again, go down the column below your kit# header to see the age estimates.
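
If a project is too big to paste in one go (step 8 above), one workaround is to split the copied rows into smaller batches before pasting. Here is a minimal Python sketch of that idea. The function name, the made-up kit rows, and the default of 400 rows per batch are my assumptions; 400 is only the point where the browser started to struggle in this thread, not a documented limit of Y-Utility.

```python
def batches_for_yutility(my_row, project_rows, batch_size=400):
    """Yield paste-ready text blocks: your own row first, then project rows.

    Each block holds at most batch_size lines in total, so your kit stays
    in the top row of every batch you paste into the tool.
    """
    # Drop blank lines and any copy of your own row found in the project.
    rows = [r for r in project_rows if r.strip() and r != my_row]
    step = batch_size - 1  # leave one line per batch for your own row
    for start in range(0, len(rows), step):
        yield "\n".join([my_row] + rows[start:start + step])


# Made-up data: one kit of your own and 1000 project rows.
my_row = "K0001\t13\t24\t14"
project_rows = [f"K{n:04d}\t13\t24\t14" for n in range(2, 1002)]

batches = list(batches_for_yutility(my_row, project_rows))
print(len(batches), [len(b.split("\n")) for b in batches])
```

Each batch can then be pasted and executed separately, at the cost of only seeing your own kit against each slice rather than the full all-against-all grid.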

There are 2 GD methods used on Y-Utility: Infinite and Hybrid. Infinite counts a "1" for every STR that is different, no matter how large the difference. Hybrid adds up the total difference between the 2 STR values (except for some STRs with multiple values). Hybrid GD will therefore always be equal to or higher than the Infinite GD. Infinite is the default setting.
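
To make the difference between the two counts concrete, here is a rough Python sketch. It is not Y-Utility's actual code: the function names and the made-up haplotypes are mine, and treating any multi-value marker (written like "11-14") as a flat 1 when it differs is an assumption about what "except for some STRs with multiple values" means.

```python
def infinite_gd(a, b):
    """Infinite-style GD: every differing marker counts as 1."""
    return sum(1 for x, y in zip(a, b) if x != y)


def hybrid_gd(a, b):
    """Hybrid-style GD: sum of step differences per marker.

    Assumption: multi-value markers such as "11-14" count as 1 when
    they differ, instead of being compared step by step.
    """
    total = 0
    for x, y in zip(a, b):
        if x == y:
            continue
        if "-" in x or "-" in y:
            total += 1  # assumed handling of multi-value markers
        else:
            total += abs(int(x) - int(y))
    return total


# Two made-up haplotypes over five markers.
kit_a = ["13", "24", "14", "11-14", "12"]
kit_b = ["13", "22", "15", "11-15", "12"]

print(infinite_gd(kit_a, kit_b))  # 3 markers differ
print(hybrid_gd(kit_a, kit_b))    # 2 + 1 + 1 = 4
```

The two-step difference on the second marker is what pushes the Hybrid count above the Infinite count here.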

dp
09-26-2015, 06:10 PM
Hey, that's a trip. :)

Unfortunately my closest Y-STR match has not had SNP testing...

I browsed around my 25 STR matches, and at 25-2 level found at least two L21xM222 kits, that are not DF49 tested.
I came across a project I've never heard of: R-L21SouthIrish (https://www.familytreedna.com/groups/r-l21-south-irish/about/background)
I have some matches in southern Ireland so maybe I'll check it out.

Again, Thanks
dp
:biggrin1:

PS: They are concentrating on CTS4466, so not for me.

haleaton
09-28-2015, 04:15 AM
Thanks, this was really helpful to me. I was curious if you found a way to use Excel as an intermediate way to store STRs and modify them? It seems to drop STRs when cutting and pasting to and from Excel, but Notepad works just fine. Probably some special character or reformatting being done.

This would also help data mining those Surname projects that used World Families from way back when.

I would also mention that http://www.semargl.me/ sometimes has been down but so far has always come back.

Edit: I answered my own first question, which is to format cells as text and paste as definition. Excel was converting text into dates . . .
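
For anyone who would rather skip the spreadsheet, the same "keep everything as text" idea can be sketched in Python, where tab-separated values are read as plain strings and multi-value STRs like 11-14 are never reinterpreted as dates. This is only a sketch: the kit numbers and values are made up, and it assumes the copied rows are tab-separated, which may not hold for every project page.

```python
import csv
import io

# Made-up rows as they might look after copying from a results table.
pasted = "K0001\t13\t11-14\t12\nK0002\t13\t11-15\t12\n"

# csv.reader returns every field as a string, so "11-14" stays "11-14".
rows = list(csv.reader(io.StringIO(pasted), delimiter="\t"))

# Example edit: drop one kit before pasting the rest into the STR tool.
rows = [r for r in rows if r[0] != "K0002"]

# Write the rows back out as tab-separated text, ready to paste.
out = io.StringIO()
csv.writer(out, delimiter="\t", lineterminator="\n").writerows(rows)
cleaned = out.getvalue()
print(cleaned)
```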

haleaton
10-01-2015, 01:51 AM
I had some free time so I spent it looking at this and I really have finally got a better gut-feeling understanding of STR statistics ...

A few Tech tips from my limited abilities and experience:

1) Using Excel as an intermediary with Cut and Paste to the STR web tool made things very fast, using sorting to remove other Y Haplogroups.
2) Ctrl-A then copy and paste from the FTDNA STR pages set to 5000 was the best way when things are chuggin ...
3) Chrome seems to chug on Copy and IE on Paste. Paste even killed Excel once. Save, Save, Save.
4) You can go through all the nearer STR matches from http://www.semargl.me/ in a few hours and it is mind blowing.
4a) In Surname Projects you often have multiple samples. Each Project had very different groupings or efforts to do so.
4b) In a well run Surname Project with a lot of members there are multiple, closely related samples, which provides a distribution of STR values to compare against rather than a single sample.
5) Very few samples from Surname Projects and Geographic Projects are in any of the Y Haplogroup Projects. There is data out there.
6) The WorldFamilyTrees.net pages that do not provide a backup which includes the FTDNA STR Format are surely lacking. Otherwise, I had to hand edit.
7) The Surname Projects that do not provide STR public data are either Fools to Citizen Science or Privacy Visionaries . . .
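
The "remove other Y Haplogroups" step from tip 1 can also be done outside Excel. A minimal Python sketch, assuming each copied row is tab-separated with the haplogroup label in the second column; the column position, the function name, and the sample rows are all hypothetical, so check them against the page you actually copied.

```python
def keep_haplogroup(lines, prefix, hap_col=1):
    """Keep only rows whose haplogroup column starts with the given prefix."""
    kept = []
    for line in lines:
        fields = line.split("\t")
        if len(fields) > hap_col and fields[hap_col].startswith(prefix):
            kept.append(line)
    return kept


# Made-up rows: kit, haplogroup label, then STR values.
rows = [
    "K0001\tR-U152\t13\t24",
    "K0002\tI-M253\t13\t22",
    "K0003\tR-M269\t13\t24",
]

r_only = keep_haplogroup(rows, "R-")
print(r_only)
```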

What really blew my mind was comparing two groups made up of Surname Projects, looking at STR modalities, GD, and TMRCA (balancing the 37, 67, and 111 data sets), with multiple samples per group due to the nature of Surname Projects.

STRs have a statistical distribution with sometimes extreme outliers that show recent matches that, if you look at a large set of data from Surname Projects, can be understood as false.
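
One way to see that distribution for yourself is to work out the most common value at each marker within a group and then count how far each kit sits from it. The Python sketch below does that with the simple count-1-per-differing-marker distance. It is an illustration only: the group data is made up, the function names are mine, and this is not necessarily how Y-Utility derives its own modal.

```python
from collections import Counter


def modal_haplotype(haplotypes):
    """Most common value at each marker position across the group."""
    return [Counter(col).most_common(1)[0][0] for col in zip(*haplotypes)]


def distance_from(modal, haplotype):
    """Count markers that differ from the modal (1 per differing marker)."""
    return sum(1 for m, v in zip(modal, haplotype) if m != v)


# Made-up surname group: kit number -> STR values for four markers.
group = {
    "K0001": ["13", "24", "14", "11"],
    "K0002": ["13", "24", "14", "11"],
    "K0003": ["13", "23", "14", "11"],
    "K0004": ["14", "25", "15", "11"],
}

modal = modal_haplotype(group.values())
distances = {kit: distance_from(modal, strs) for kit, strs in group.items()}
print(modal)
print(distances)
```

In this toy group K0004 sits three markers away from the modal while everyone else is within one, which is the kind of outlier that can make a single-kit comparison misleading.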

There may be an echo Tsunami into the Surname and Geographic Projects, if Y-Haplo participants find it of interest and go there.