European Country of Origin Maps

From mtDNA Haplogroup K Project Members and Matches

At 400 Members

(Revised April 15, 2007)

 

I recently created maps of the frequencies of several mtDNA Haplogroup K subclades from counts of the K Project members and their exact matches in the FTDNA database. More may be added in the future, so check back. However, some subclades and clusters, such as K1a4a1 and “K1a11,” are still too small for a good map. The methodology for all maps is explained in the discussion for the first map, so don’t just skip to your subclade. The size of the circles may not be comparable between the different maps. I discussed these subclades in my most recent MitoSearch K1000 Survey.

 

“K1a10” and “Pre-K1a10”: The first map is for the 16048A cluster, to which I’ve given the temporary name K1a10. I looked at the listed countries of origin of the current FTDNA-tested members of the K Project, plus those of their exact matches in the full FTDNA database. Of the 387 members when I checked, 26 had 16048A, in 10 different high-resolution haplotypes or five different low-resolution HVR1 haplotypes. (That number hasn’t changed at 400 members, but the number of matches and the counts for countries may have changed slightly.) Only three of the 26 didn't have HVR2 results. However, since 16048A is a positive indicator of the cluster, I used all 26 and their matches for this map. By the way, the total of those with 16048A at FTDNA has now reached at least 70, which is up a lot from the 54 the last time I counted. There could be others in haplotypes not represented in the Project. Of those 70, only 27 have HVR2 results. Since 23 of those 27 are Project members, only four with high-res results are not in the K Project. The share who are in the Project is no doubt higher than in most subclades of K. That high percentage is due to some recruiting by those in the cluster – and, I’ll confess, by me, to get representatives of the more unusual haplotypes.

 

I counted the total number of mtDNA entries, both hi- and low-res, for each country while looking at the matches. Then I divided the number of those in 16048A listing each country by the low-res total for that country. The highest frequency - no surprise - was Ireland with 0.86%. If that number looks small, remember that it is the 16048A's out of the entire number of mtDNA results from every haplogroup. The raw count was also the highest at 24. The next highest frequency was Norway at 0.39%, but that's only from two people. The lowest frequency was Germany at 0.06%. Only eight countries were listed with 16048A. The map is at K1a10 map.
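The frequency arithmetic described above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used to build the maps: the function names are hypothetical, and the country totals are not given in the text, so they are back-calculated here from the rounded percentages and should be read as rough estimates.

```python
def frequency_percent(cluster_count, total_entries):
    """Share of a country's mtDNA entries that fall in the cluster, in percent."""
    if total_entries <= 0:
        raise ValueError("total_entries must be positive")
    return 100.0 * cluster_count / total_entries


def implied_total(cluster_count, reported_percent):
    """Back-calculate an approximate country total from a rounded percentage."""
    return round(cluster_count / (reported_percent / 100.0))


# Counts and percentages are the ones quoted in the text. The totals are NOT
# reported there; they are estimates recovered from the rounded figures.
ireland_total = implied_total(24, 0.86)
norway_total = implied_total(2, 0.39)

print(f"Ireland: about {ireland_total} entries, {frequency_percent(24, ireland_total):.2f}%")
print(f"Norway: about {norway_total} entries, {frequency_percent(2, norway_total):.2f}%")
```

The Norway line shows why a frequency built on two people deserves caution: one more or one fewer entry would change the percentage by half.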

 

I think the map supports our theory that the 16048A mutation occurred in Ireland and spread out from there. I have previously said that I thought the mutation occurred in a person in K1a with 195C and probably two pairs of the position 524 insertions, the cluster I have called "Pre-K1a10." (It's certainly interesting to work in a field where one continually has to make up new names for subjects.) I also pointed out that there were no examples of Pre-K1a10 in Scandinavia, while there were several in Ireland, supporting the theory that 16048A traveled from Ireland to Scandinavia. But now I've found one Pre-K1a10 in a member's match from Sweden. The numbers of examples are so small that the frequency is slightly higher in Sweden than in Ireland. But Ireland, Scotland and England (including those listed as Great Britain, United Kingdom, and British Isles) each had five examples. The highest frequency, representing only two examples, was from the Netherlands at 2.47%. Since Pre-K1a10 can only be detected by looking at HVR2 mutations, I had to use high-res results. (Even at that, one predicted Pre-K1a10 in the Project came back from a full-sequence test as a K1a4a1.) Therefore the raw numbers are smaller than for the 16048A cluster. I'll watch this group as the numbers grow. This map is at Pre-K1a10 map.

 

K1a + 524 Insertions: [Added April 15, 2007] The dataset for this map includes all Project members, and their exact matches, in K1a with one or more pairs of position 524 insertions, except for those already mapped in the K1a10 and Pre-K1a10 clusters above, all of which also have the 195C mutation. Others are in subclade K1a4a1 as determined by full-sequence tests; one of those has 195C. On the map, Denmark has the highest percentage; but that is only one person. Perhaps the map is showing a Mediterranean to British Isles via North Sea migration route. This K1a+524 map should be compared to the ones for Pre-K1a10 and Pre-K1a9.

 

I have also created a Fluxus phylogenetic diagram for this group, only including the Project members, not their matches. This diagram, from which I have excluded the position 309 insertions, exhibits a star pattern after the first pair of 524s, then a branching pattern after the second pair.

 

Ashkenazi and “Pre-Ashkenazi” Subclades: Dr. Doron Behar’s 2006 paper identified three K subclades which are primarily Ashkenazi Jewish: K1a1b1a, K1a9 and K2a2a. In general, the FTDNA customers in those subclades are usually descendants of fairly recent immigrants from Eastern Europe; so the percentages of the subclades in Eastern European countries are not necessarily those of the current or historical populations. But the relative percentages are likely representative of the distribution of those subclades between countries. I have also created maps for the “parent” subclade or cluster for two of these subclades. K2a is the most recent demonstrable parent of K2a2a, since the intermediate step, K2a2, is defined only by one coding-region mutation. “Pre-K1a9” is my name for the relatively small cluster of haplotypes which are in K1a, have 195C, but do not have any pairs of 524 insertions. I think an ancient one of these had the 16524G mutation and thus became the founder of K1a9. There is no map for the “parent” of K1a1b1a, since the next three steps back to K1a are defined by coding-region mutations only. Each of these maps is based on high-resolution results only.

 

The K1a1b1a map [updated on April 15, 2007] is based on 29 K Project members and 56 exact matches. Thirteen countries are included, although the Irish entry may be in another subclade. The highest percentage is for Belarus. The triangular pattern shown may reflect the migration from the Rhine Valley eastward, as suggested by Behar. There are also members who trace back to Syria and Uzbekistan in Asia. For the latter country, the member is the only one at FTDNA with high-resolution results. [The updated version, which adds some inadvertently omitted data, moves Lithuania to the second-highest percentage and also doubles the size of the small Germany circle. Hungary and Ukraine are increased slightly. I think the revised version increases the perception of the triangular pattern. I deleted the one Irish entry from the percentage list, since I doubt it belongs in this subclade.]

 

The Pre-K1a9 map is based on only six results, so it may not be as trustworthy as some of the others. It might suggest that this subclade followed its daughter, K1a9, as far as Poland; but it mostly went toward the British Isles. Another possibility is that there was a separate mutation, adding pairs of 524 insertions, thus creating the Pre-K1a10 cluster discussed above.

 

A triangular pattern may also be seen on the K1a9 map, with Austria again the apex and groups moving along to Romania and Belarus. K1a9 appears youthful, with only two slightly different haplotypes in the Project. The map is based on 23 European-origin examples.

 

The K2a map shows one of the largest, and oldest, subclades in K. There are 30 Project members in K2a. All are high-resolution by definition, since the subclade is defined by HVR2 mutations. The map represents a total of 60 examples. The age of the subclade no doubt accounts for its wide distribution. In this case, some in K2a may have followed its daughter K2a2a as far as Poland and Hungary, but many of them headed northwest. Since this subclade is defined by two recurrent mutations, there may have been multiple founders. Also, I’ve previously suggested that examples may eventually be found with the coding-region mutations proving them to be “Pre-K1c,” since it only takes the addition of 498- to create a K1c. If that happens, I will expect some sort of prize!

 

The K2a2a map shows the smallest, least widespread, and probably the youngest Ashkenazi subclade. It is based on six members and a total of 16 entries. All the entries have the same haplotype. (There is a slightly different haplotype in MitoSearch.) There is a definite concentration in Belarus.

 

Comparing the three Ashkenazi maps, it appears that each subclade started in Austria and fanned out to the Northeast and Southeast. If the real start was in the Rhine Valley, Austria may have been their first stop.

 

K1b Subclades: K1b2 may be easily identified by the combination of the 146C and 195 mutations – and the lack of 497T. Of the three main subgroups under K1b1, only K1b1a is mapped. That subclade always has 16319A and 152C; most also have 16463G. K1b1c and one branch of K1b1b have defining HVR mutations, but none of those has shown up in the K Project. Both K1b1a and K1b2 have sequences with and without 524 insertions, which is unusual for K.
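The identification rules in the paragraph above can be written out as a small sketch. It assumes the haplotype is already known to be in Haplogroup K and is supplied as a list of HVR mutation strings; the function name and that input format are hypothetical, and because the text gives the 195 mutation without a base, any substitution at position 195 is accepted here.

```python
import re


def classify_k1b(mutations):
    """Apply the HVR rules quoted in the text to one K haplotype.

    Returns "K1b2", "K1b1a", or "unassigned". Illustrative only; a
    full-sequence test may still be needed for a firm assignment.
    """
    muts = set(mutations)
    # The text names position 195 without a base, so accept any base there.
    has_195 = any(re.fullmatch(r"195[ACGT]", m) for m in muts)

    # K1b2: 146C plus a 195 mutation, and no 497T.
    if "146C" in muts and has_195 and "497T" not in muts:
        return "K1b2"
    # K1b1a: always has 16319A and 152C (16463G is common but not required).
    if "16319A" in muts and "152C" in muts:
        return "K1b1a"
    return "unassigned"


print(classify_k1b(["146C", "195C"]))
print(classify_k1b(["16319A", "152C", "16463G"]))
```

Position 524 insertions are deliberately ignored in this sketch, since both subclades occur with and without them.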

 

The K1b1a map shows that the subclade is not found in great numbers anywhere, but it is widely scattered. If Portugal looks significant, that’s only one person. (If the one person tracing back to Lebanon were mapped, the circle would cover most of the lower left-hand corner.) This map is based on the low-resolution results of 12 Project members and a total of 33 entries, since the defining mutations are in HVR1.

 

The K1b2 map appears to show a more northerly distribution for this subclade as compared to its sibling above. Because it is defined by HVR2 mutations, the map is based on 15 high-res members. The total is only 21 entries, since most of the members don’t have any exact matches. The large Belgium circle, while the highest at 4%, is only one person.

 

K1c and K1c2: This set of maps is for subclade K1c2 and its parent K1c. For the two maps I used only high-res results. Any haplotype not represented in the K Project will not be included on the maps, so some countries mentioned in MitoSearch are absent. Those would include Russia, Slovakia and the Czech Republic.

 

The two maps are at K1c map and K1c2 map.

 

The first map is labeled K1c+, which simply means any haplotype in K1c except those in K1c2. K1c1 requires a full-sequence test to be accurately determined. By far the largest node is for Portugal, with 8.33% of the mtDNA entries in FTDNA's database in K1c+. However, the percentage represents only two entries. I don't think this map demonstrates an origin for K1c, but the wide distribution obviously reflects its greater age compared with K1c2 in the second map. I've expressed my opinion before that the British Isles countries would have a significantly larger percentage except that many of our British ancestors came here so long ago that the maternal paper trail no longer exists. I've also pointed out the long-term political connections between England and Portugal, so perhaps those two from Portugal could be traced to England further back in time. That's just a guess.

 

The second map, for K1c2, has fewer countries - Portugal is gone - demonstrating the relative youth of K1c2. K1c2 actually has 48 total entries compared to 38 for K1c+; many of those listed USA, Canada or Unknown and so are not represented on the maps. Denmark at 2% is highest on this map, but that's for only one person. An interesting fact is that for the modal haplotype (the six basic K mutations plus 16320T, 146C, 152C and 498-), 16 of 18 or 89% are from the British Isles. To me, that suggests a British founding of K1c2. In contrast, only six of 16 or 38% of the K1c+ modal entries are from the British Isles, leaving its origin in doubt.
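For readers who want to check the two modal-haplotype percentages quoted above, here is a short sketch using only the counts given in the text. The helper name is hypothetical.

```python
def share_percent(part, whole):
    """Percentage of `whole` represented by `part`."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Modal-haplotype entries from the British Isles, as counted in the text.
k1c2_modal = share_percent(16, 18)      # K1c2 modal haplotype
k1c_plus_modal = share_percent(6, 16)   # K1c+ modal haplotype

print(f"K1c2 modal from British Isles: {k1c2_modal:.1f}%")
print(f"K1c+ modal from British Isles: {k1c_plus_modal:.1f}%")
```

Shown to one decimal place, the shares are 88.9% and 37.5%, which round to the 89% and 38% in the text.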

 

© 2007 William R. Hurst

Administrator, mtDNA Haplogroup K Project