MTDNA HAPLOGROUP K SURVEY AT 403 MITOSEARCH ENTRIES – GEOGRAPHICAL CONSIDERATIONS
On March 9, 2006, I published a survey of the 403 mtDNA haplogroup K entries on FamilyTreeDNA's MitoSearch, which included 181 non-duplicated high-resolution HVR1 plus HVR2 entries. Since then I have looked at the same data from geographical and other perspectives, resulting in this report and several additional Microsoft Excel charts. The first new chart is sorted first by geographic region, then by country of origin. Remember that in MitoSearch, the country of origin may be just a guess. If I tend to use the past tense below, it’s because the K entries have already increased to 442; the new ones will just have to wait for a “K500” survey. I will make frequent references to the 2006 paper by Dr. Doron Behar et al., which has a K chart apparently being used by FTDNA to determine K subclades.
By geographic region, the 181 entries are divided as follows:
Germanic countries: 11 – 6.1%
Unknown: 42 – 23.2%
HVR1 Mutation 16320T and HVR2 Deletion at Position 498
Of the 181 entries, 26 (14%) are marked in yellow on the K403 chart. A new chart has only those entries. Ten of those entries had just 498-, which should indicate Dr. Behar’s K1c subclade. Fourteen entries added 16320T, which should put them into K1c2. Note that all 24 also had 146C and 152C, which are required for those subclades. There were only two entries with just 16320T. One, VX6P5, listed 497- in error and has now been corrected to 498-. The remaining entry had 16320T, but was missing 498- and 152C. It also had two of the 524 insertions discussed below, which otherwise don’t appear in this chart. Therefore, in my opinion, the 16320T mutation is a “personal” or non-defining mutation for this entry, so it probably does not belong to the K1c or K1c2 subclades.
Some of the K1c entries may belong in K1c1, but that subclade is defined by coding-region mutations outside HVR1/2 and thus is not determinable by using MitoSearch. These subclades, and British K’s in general, are not well represented in the full-sequence samples used by Behar.
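The rule of thumb described above for the yellow entries can be written down as a small Python sketch. The function name, the return labels and the "unresolved" case are my own illustrative assumptions; the logic only restates the HVR1/HVR2 observations in this report and is no substitute for coding-region testing.

```python
def classify_k1c(mutations):
    """Tentatively label a haplotype from its HVR1/HVR2 differences only.

    `mutations` is a list of strings as entered on MitoSearch,
    e.g. ["16224C", "16311C", "146C", "152C", "498-"].
    """
    m = set(mutations)
    has_498_deletion = "498-" in m
    has_16320t = "16320T" in m
    # 146C and 152C were present in all 24 K1c/K1c2 entries in the survey.
    has_required = {"146C", "152C"} <= m

    if has_498_deletion and has_required:
        return "K1c2" if has_16320t else "K1c"
    if has_498_deletion:
        # Assumption: the report does not describe this case, so it is left open.
        return "unresolved (498- without 146C and 152C)"
    if has_16320t:
        return "16320T probably personal; not K1c or K1c2"
    return "no K1c markers"


print(classify_k1c(["16224C", "16311C", "146C", "152C", "498-"]))
print(classify_k1c(["16224C", "16311C", "16320T", "146C", "152C", "498-"]))
print(classify_k1c(["16224C", "16311C", "16320T", "146C", "524.1C", "524.2A"]))
```

Note that K1c1 cannot be returned at all, since it depends on coding-region mutations that MitoSearch does not record.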
Looking at the countries of origin, these entries at first
glance look “Scots-Irish.” Well, most of the Irish entries were actually from
the
In Europe, the K1c subclade appeared mainly in
HVR1 Mutations 16223T, 16234T and 16524G and HVR2 Mutation 512C
Dr. Behar found Ashkenazi Jewish samples in three subclades of K: K1a1b1a defined by 16234T, with a lower unnamed group defined by 16223T; K1a9 defined by 16524G; and K2a2a defined mostly by coding region mutations, but also by 512C. Each of the 32 entries (17.7%) with one or more of those mutations is marked in green on this new chart. Behar has the mutation 114T after 16234T but before 16223T; but on MitoSearch this mutation was present in all the entries, even with just 16234T. He also has 195C before 16524G in K1a9. Those two mutations, 114T and 195C, also appeared quite often without the “Ashkenazi” markers on MitoSearch. (This is a good place to mention that there are K’s with these mutations on MitoSearch and in the mtDNA Haplogroup K Project who are not Jewish and have no known Ashkenazi Jewish ancestors. I try to stay out of any arguments about whether the mutations predated the religion or not.)
On MitoSearch there were 18 examples (9.9%) with 16234T; 7 (3.9%) of those also had 16223T. Only 1 entry (0.6%) had 16223T without 16234T. (I did note that there was another one like that, but without HVR2 tested.) The absence of 16234T in that entry may have been caused by a back mutation. There were 5 examples (2.8%) of 16524G and 8 examples (4.4%) of 512C. These latter two mutations never occurred together or with the first two.
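As a rough illustration of how the green entries could be picked out, here is a short Python sketch that tags a haplotype with the Behar subclades named above. The dictionary, function name and label wording are hypothetical. A hit only shows that a marker is present; it says nothing certain about ancestry, and K2a2a in particular is defined mostly by coding-region mutations.

```python
# HVR markers and the subclades Behar associates with them, as cited above.
BEHAR_MARKERS = {
    "16234T": "K1a1b1a",
    "16524G": "K1a9",
    "512C": "K2a2a (defined mostly by coding-region mutations)",
}


def tag_behar_markers(mutations):
    """Return the list of subclade hints suggested by HVR1/HVR2 markers."""
    m = set(mutations)
    hints = [label for marker, label in BEHAR_MARKERS.items() if marker in m]
    if "16223T" in m:
        if "16234T" in m:
            hints.append("unnamed group below K1a1b1a (16223T)")
        else:
            # Seen once in the survey; possibly a back mutation at 16234.
            hints.append("16223T without 16234T (unresolved)")
    return hints


print(tag_behar_markers(["16224C", "16234T", "16311C", "114T", "497T"]))
print(tag_behar_markers(["16223T", "16224C", "16234T", "16311C", "114T"]))
print(tag_behar_markers(["16224C", "16311C", "16524G", "195C"]))
```

An empty list simply means that none of the four markers is present.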
I have included as Eastern European countries
HVR2 Mutations 133G, 174T, 323G, 375.1C, 557T and Back Mutations at 73G, 114T, 263G, 315.1C, 497T
Since two of the Ashkenazi subclades are in the major subclade K1a, they all should have had 497T. But in fact, 3 did not. Instead they had 133G, 323G, 375.1C, and 557T. The two with 16234T also had 174T. The 3 were also missing the common mutations 73G, 114T, 263G and 315.1C. In fact, it appears that these “odd” entries have had 5 back mutations. In addition, there were 4 more “odd” entries which had the 133G, etc., mutations and the back mutations; but were not Eastern European and did not appear to be Ashkenazi. There was even 1 with the back mutations, but without 133G. All these with similar mutations, Ashkenazi or not, I call the “odd” haplotype cluster. A Google search for these mutations produced nothing, except for one of my previous surveys. Interesting note: 16224C, 16311C, and 16519C were never subject to back mutations in any of these entries. Entries with the 133G, etc., mutations are marked in aqua on the K403green chart and a separate K403odd chart.
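The “odd” pattern can be screened for mechanically. The Python sketch below is only an illustration, with names and labels of my own choosing. It assumes HVR2 was actually tested for the entry, since an HVR1-only entry would lack all five of the usual mutations by default, and it treats any entry with the five back mutations but not all four of the 133G group as “back mutations only.”

```python
# Mutations gained by the "odd" entries (174T is left out because only
# the two entries with 16234T had it).
ODD_GAINED = {"133G", "323G", "375.1C", "557T"}
# Common mutations that the "odd" entries were missing.
BACK_MUTATED = {"73G", "114T", "263G", "315.1C", "497T"}


def odd_cluster_status(mutations):
    """Classify an HVR1+HVR2 haplotype against the "odd" cluster pattern."""
    m = set(mutations)
    lacks_all_common = not (BACK_MUTATED & m)   # none of the five present
    has_all_gained = ODD_GAINED <= m            # all four present

    if lacks_all_common and has_all_gained:
        return "odd cluster (full pattern)"
    if lacks_all_common:
        return "odd cluster (back mutations only)"
    return "not in odd cluster"


print(odd_cluster_status(
    ["16224C", "16234T", "16311C", "16519C", "133G", "174T", "323G", "375.1C", "557T"]
))
print(odd_cluster_status(["16224C", "16311C", "16519C", "73G", "263G", "315.1C", "497T"]))
```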
HVR2 Position 524 Insertions
Dr. Behar does not use the insertions at HVR2 position 524 to create the K subclades. However, in looking at the entries on MitoSearch, I believe there is a definite geographical aspect to these insertions. First, they always appear in pairs, alternating between A and C nucleotides and beginning with either letter. The four patterns, with their counts and percentages on MitoSearch are:
524.1A, 524.2C: 5 – 12.5%
524.1A, 524.2C, 524.3A, 524.4C: 3 – 7.5%
524.1C, 524.2A: 18 – 45.0%
524.1C, 524.2A, 524.3C, 524.4A: 14 – 35.0%
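The four patterns above can be read off a mutation list and the percentages re-derived with a few lines of Python. This is a minimal sketch: the function name and the short pattern codes ("AC", "ACAC", "CA", "CACA") are my own shorthand, and the counts are the ones given in the list.

```python
def insertion_524_pattern(mutations):
    """Return the inserted bases at position 524 in order, e.g. "CACA".

    Returns None when the haplotype has no 524 insertions.
    """
    insertions = [x for x in mutations if x.startswith("524.")]
    if not insertions:
        return None
    # "524.3A" -> sort on the insertion index 3, keep the base "A".
    insertions.sort(key=lambda x: int(x[4:-1]))
    return "".join(x[-1] for x in insertions)


# Counts from the survey, keyed by the shorthand pattern code.
COUNTS = {"AC": 5, "ACAC": 3, "CA": 18, "CACA": 14}
total = sum(COUNTS.values())
shares = {p: round(100 * n / total, 1) for p, n in COUNTS.items()}

print(insertion_524_pattern(["16224C", "497T", "524.1C", "524.2A"]))
print(total, shares)
```

The percentages are shares of the entries that have insertions, not of all 181 entries.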
I extracted all the entries with these insertions into a new chart. The first thing I noticed was that, of the colors I had used in my original K403 chart, only blue appeared on this one, with one exception. The blue denotes entries with the HVR2 mutation 497T, which Behar uses to define the K1a subclade. Of the 40 entries (22% of the total) with the insertions, 31 had 497T. There were no entries in green, which would have suggested an Ashkenazi origin. The one entry with 16320T and marked in yellow was discussed above. The geographical spread was wide, with no great difference whether 497T was present, except that there were no Eastern European entries, which would have been mostly Ashkenazi. These insertions never occurred in conjunction with mutations 16223T, 16234T, 16270T, or 16356C.
There also did not seem to be a discernible difference in geographic
origin between the series of two and four insertions, or between series
beginning with A’s and C’s. A question has been asked about whether there is a real difference in
the two sets of insertions beginning with A or C. One study by
HVR2 497T “Only”
Obviously, I don’t mean that these entries had only the 497T
mutation. They had 497T, but none of the other mutations mentioned above.
They should all be in subclade K1a, but it’s more difficult to place them in a
lower subclade without testing coding-region mutations. There were 73 of these,
or 40.3% of the total. Of them, 32 (17.7% of the total and 43.8% of the “blues”) were
from the British Isles,
Haplotypes without Defining Mutations
These haplotypes have a similar breakdown to the “blue” ones
above. There were 43 of them, 23.8% of the total. Of these, 17 (9.4% of the
total and 39.5% of this group) were from the British Isles,
Who are U? – Haplotypes with 16270T or 16356C Mutations
When Bryan Sykes named the “Seven Daughters of Eve,”
“Katrine,” the founder of haplogroup K, was on an equal level with the other
six European mtDNA founding mothers. However, most charts now show K as part of
a superhaplogroup
Summary
I recently heard someone say that mtDNA mutations didn’t
mean anything by themselves; they were only useful when compared to those of
other persons. The more I look at the K haplotypes, the more convinced I am that
the statement is not true. In K at least, a quick look at a person’s list of
HVR1/2 mutations can in many cases give a good indication of where his or her
direct maternal ancestors came from. There are parallel and back mutations, to
be sure, so any one mutation will never tell the whole story. In the
not-too-distant past, there were two major subclades of K defined by HVR1
mutations 16093C and 16320T. (That presented a problem for me, since I’m the
only one on MitoSearch with both of those.) That changed with the addition of
HVR2 to the mix and has now changed further with the addition of coding-region
mutations. Unfortunately, the latter are not easily available and are still not
completely representative of the world’s K population. MitoSearch has a better
representation from the British Isles, but has its own limitations – no
coding-region mutations; “by hand” entries leading to typographical errors;
duplications; and an over-representation of the
William R. Hurst