Are you Han? My paternal ancestry - 12 Marker Results
As mentioned previously, I swabbed my cheeks and mailed my Y-DNA test specimen to FamilyTreeDNA a few months ago. The first part of the results, consisting of twelve markers, is now in. These are standard Y-STR results. Before showing the actual numbers, let me briefly explain.
What I ordered was a Y-DNA test, which is only applicable to men, since women do not have a Y chromosome. The Y chromosome passes from father to son, so it can be used to trace male ancestry. Chromosomes consist of a large number of nucleotides, which come in four basic types, designated by the letters A, C, G, and T. At certain locations on the Y chromosome (these locations are usually labeled with names such as "DYS393"), a short letter pattern such as "GTT" is repeated back to back. These are called Short Tandem Repeats (STRs). For example, at location DYS426, a person may have 7 to 18 repeats of the DNA sequence "GTT", depending on the individual. In the FamilyTreeDNA database, 12 repeats at DYS426 is the most common, which would look like this: ...TGTGTTGTTGTTGTTGTTGTTGTTGTTGTTGTTGTTGTTGAC... Each of these locations is called a marker, and the number of repeats found there is its value. By comparing the repeat counts at different markers, we can estimate how genetically close two individuals are.
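To make the idea of a repeat count concrete, here is a minimal Python sketch that counts the longest run of back-to-back copies of a motif in a DNA string. The function name `longest_repeat_run` and the example fragment (made-up flanking bases around 12 copies of "GTT") are my own illustration, not how a testing lab actually reads a marker.

```python
import re


def longest_repeat_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of motif in sequence."""
    if not motif:
        raise ValueError("motif must not be empty")
    # Match one or more consecutive copies of the motif.
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence)]
    return max(runs, default=0)


# Hypothetical fragment: invented flanking bases around 12 copies of GTT.
fragment = "TGT" + "GTT" * 12 + "GAC"
repeat_count = longest_repeat_run(fragment, "GTT")
print(repeat_count)  # prints 12
```

The value 12 is what would be reported for this marker in a results table.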
Because STRs mutate relatively easily, two individuals with similar Y-STR haplotypes do not necessarily share similar ancestry. To really confirm one's haplogroup, a Single Nucleotide Polymorphism (SNP) test is needed, which examines single nucleotides at specific locations on the Y chromosome. Since SNP mutations are very rare, people sharing the same SNPs almost certainly had a common ancestor many generations ago. In summary, an STR test is good for finding recent relatives, while an SNP test can confirm one's ancient ancestry (i.e. haplogroup).
The table below shows the results of 12 markers for my Y-DNA test.
I have yet to figure out what those stars mean.
OK, the table looks nice, but what does it mean? Well, given this data, we can do at least two things. The first is to find one's remote genetic cousins. The second is to predict one's haplogroup.
I searched FamilyTreeDNA's database to find people with test results that match mine. BTW, this search functionality is also available at ysearch.org. Guess what? I found one person with an exact match! And I know his name! I have not contacted that person yet, because I want to wait for more results to be sure. That person did a 25-marker test, so I will wait for my 25-marker results.
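For readers curious how a "match" can be scored, here is a rough Python sketch. It assumes the simplest rule, adding up the differences in repeat counts marker by marker, so an exact match scores 0. The marker values below are invented for illustration (they are not my results or anyone else's), the function name `genetic_distance` is my own, and real matching services may apply different rules to some markers.

```python
def genetic_distance(a: dict, b: dict) -> int:
    """Sum of absolute repeat-count differences over markers both people tested."""
    shared = a.keys() & b.keys()
    return sum(abs(a[marker] - b[marker]) for marker in shared)


# Hypothetical repeat counts, made up purely for this example.
mine = {"DYS393": 12, "DYS390": 24, "DYS426": 11}
exact_match = {"DYS393": 12, "DYS390": 24, "DYS426": 11}
near_match = {"DYS393": 12, "DYS390": 25, "DYS426": 13}

print(genetic_distance(mine, exact_match))  # prints 0
print(genetic_distance(mine, near_match))   # prints 3
```

With only 12 markers an exact match is weaker evidence than it sounds, which is why comparing more markers is worth the wait.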
However, I could not resist the temptation to look into his background a bit. Curiously, his oldest known ancestor is from the Philippines, dating back to the late 19th century. On the other hand, according to our family records, my father's line has lived in Sichuan province in southwestern China since the Ming Dynasty, way before the 19th century!
So what's the connection here? There is plenty of room for speculation. One possibility is that this Filipino family originally migrated from China to the Philippines a few hundred years ago. Given this family's rather scholarly background (complete with a prominent medical scientist who has his own Wikipedia page), this hypothesis doesn't sound too far-fetched, as Chinese people are known to keep their scholarly habits even overseas. Another piece of evidence is the similarity of surnames. This family's surname is pronounced vaguely like mine, so it could be a local adaptation of the original Chinese surname.
Another possibility, of course, is that my family actually migrated to Sichuan from the Pacific coast (maybe even the islands) sometime during the Ming Dynasty. This would not sound too far-fetched either, since my family records do say that we migrated to Sichuan during the Ming Dynasty. However, they are vague on where the migration started.
In any case, this seems to be getting very interesting. I am looking forward to seeing more test results.
People have developed statistical software to predict haplogroups from STR test results. For example, Whit Athey's predictor is popular for people of European or Middle Eastern origin, but it doesn't work for Asians, perhaps due to a lack of data. On the other hand, FamilyTreeDNA does prediction for all customers. Before showing the prediction for my haplogroup, let's recap my own prediction from the previous post: my hypothesis was that I had a 20 percent chance of being in haplogroup K*, a 20 percent chance of being in O2, and a 30 percent chance each of being in O1 and O3.
Well, how did I do?
Pretty good, in fact. Basically, before seeing the results, I predicted that I had a 20+30+30=80 percent chance of being in haplogroup O. It turns out that FamilyTreeDNA's prediction is just that: I am in haplogroup O!
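The arithmetic is easy to check in a few lines of Python. The sketch below also shows what my original guesses become if we simply take FamilyTreeDNA's prediction as correct (an assumption, since it is only a prediction) and rescale the remaining O subgroups so they add up to 1, before any other evidence is taken into account. The variable names are my own.

```python
# My guesses from the previous post.
prior = {"K*": 0.20, "O2": 0.20, "O1": 0.30, "O3": 0.30}

# Total probability I had assigned to haplogroup O (O1 + O2 + O3).
p_O = sum(p for group, p in prior.items() if group.startswith("O"))

# Assume the prediction "haplogroup O" is right: drop K* and renormalize.
given_O = {group: p / p_O for group, p in prior.items() if group.startswith("O")}

print(round(p_O, 3))
for group, p in sorted(given_O.items()):
    print(group, round(p, 3))
```

Under that assumption, O1 and O3 each rise to 0.375 and O2 to 0.25.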
Further results should get me closer to the exact haplogroup within O. But to have some fun, let's predict again which subgroup I am in. Given that the one exact match I have in the database is someone originally from the Philippines, the chance of my being in O1 should increase significantly, whereas the chance of being in O2 should drop to zero. Therefore, my hypothesis is the following:
P(O1) = 0.7
P(O3) = 0.3
Let's wait and see...
UPDATE (3/7/2012): I found another haplogroup predictor, Vadim Urasin's YPredictor, which can handle East Asian results. Its prediction for my 12-marker results is the following:
N Haplogroup Probability
1 O3-M122 77%
What a surprise! So I am a typical Han Chinese after all.
On the other hand, the accuracy of this predictor may not be very high for East Asians due to a lack of data. Anyway, I am looking forward to results from more markers to confirm...