42 Full-Sequence mtDNA Results and Subclades

From Haplogroup K Project

May 24, 2007

 

This CHART, as discussed below, represents the current 42 members, or 10%, of the mtDNA Haplogroup K Project at FamilyTreeDNA who have received results from full-sequence mtDNA tests. Two more tests are in progress. I published similar charts when there were eight and 42 test results available. The chart was produced with Fluxus-Engineering Network software, using both the Reduced Median and Median Joining algorithms. The input file was created using Tom Glad’s mtDNAtool.

 

On the chart, KROOT represents the ancestral K haplotype, as shown in this MitoSearch entry; so all mutations on the chart occurred since the founding of K. With two exceptions, I have used all of the HVR mutations on each individual’s K Project entry. First, instead of showing one, two or three pairs of the position 524 insertions, I have used “524i” to simplify the chart. Second, to save room, I omitted the relatively insignificant 309.1C and 309.2C insertions. However, the coding-region (CR) mutations used were not taken from the individuals’ full-sequence test results, since I don’t have access to all of those and some may contain medically relevant information. Instead, I have used only the coding-region mutations which define the various assigned subclades in Dr. Doron Behar’s 2006 paper, which contains the current “official” K tree. For this chart I follow the most recent FTDNA practice of not using asterisks, so I marked K1a* and K2a* as simply K1a and K2a. This time I have tried to mark all back mutations. However, I have not put a subclade label on every node; if there is no label on a node, it will be found at the previous branching point.
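The two HVR simplifications described above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used to build the Network input file. The function name simplify_hvr is hypothetical, and the sketch assumes insertions are written in the dotted style seen in the 309.1C and 309.2C labels (so a 524 insertion pair would appear as 524.1C, 524.2A, and so on). The sample list is made up for illustration.

```python
# Minimal sketch (assumptions: dotted insertion notation such as "524.1C";
# the function name and the sample haplotype are hypothetical).

OMITTED = {"309.1C", "309.2C"}  # insertions left off the chart to save room


def simplify_hvr(mutations):
    """Return the mutation list as it would be labeled on the chart.

    - Any number of 524 insertions collapses to a single "524i".
    - 309.1C and 309.2C are dropped.
    - Everything else is kept in its original order.
    """
    result = []
    for mutation in mutations:
        if mutation in OMITTED:
            continue
        if mutation.startswith("524."):
            # One, two or three insertion pairs all become one "524i" label.
            if "524i" not in result:
                result.append("524i")
            continue
        result.append(mutation)
    return result


# Hypothetical example: two pairs of 524 insertions plus a 309 insertion.
sample = ["16093C", "195C", "309.1C", "497T",
          "524.1C", "524.2A", "524.3C", "524.4A"]
simplified = simplify_hvr(sample)
print(simplified)
```

Collapsing the insertions this way keeps haplotypes that differ only in the number of 524 insertion pairs on the same node, which is the simplification the chart relies on.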

 

From KROOT, the line to the right is one of the two major divisions of K, K2, which is defined by several CR and HVR mutations. There are three K2a entries. The addition of several CR mutations and 512C leads to one example of the Ashkenazi subclade K2a2a. No examples have appeared from K2b or K2c.

 

Mutations 1189 and 10398 define the other major K division, K1. New to this chart is sequence #46547, designated simply as K1, since it does not have the defining mutations for any of the three major branches of K1. Apart from that one, the three major subclades of K1 diverge from branching point mv8. The first shown is K1c, defined by 498- and two other HVR mutations. There are no examples of a plain K1c; if one existed it would be at mv9. K1c2 is defined by 16320T and three CR mutations. Three examples are shown. K1c1 is defined only by CR mutations; there is only one example so far.

 

K1b is defined by the CR mutation 5913. My substitute term “524i” for the length-heteroplasmic 524 insertions makes its first appearance here. Although the three examples here have these insertions, in reality they appear in only about half of K1b haplotypes. K1b2, defined by one HVR and two CR mutations, has two different haplotypes here. K1b1 is defined by three CR mutations, but here there is only one example, from its lower subclade K1b1a. That subclade is defined by three more CR mutations and two HVR ones, mainly 16319A. Most K1b1a haplotypes, including the current example, also have 16463G. No examples of K1b1b or K1b1c have been tested in the Project.

 

K1a, defined solely by HVR2 mutation 497T, is about 60% of K. Here it is somewhat overrepresented at 71%. Branching off to the right from mv4 is the largest group K1a1, defined by 11914. There is one example of this subclade as well as one K1a1a, defined by one more CR mutation. The larger branch is that of K1a1b, defined by 15924. One example of the basic subclade has four additional HVR mutations. K1a1b1 is defined by adding 11470. Two separate examples based on Behar’s tree are shown, one of which has picked up the interesting mutation 114T. Beyond mv7 are six examples of the largest Ashkenazi subclade K1a1b1a, which is defined by two CR mutations and 16234T. Behar shows 114T below 16234T; it just happens that all six of these examples have it. 16223T usually occurs in a higher percentage of this subclade than the one-of-six represented here.

 

Plain K1a is more or less a grab-bag of K1a haplotypes which don’t have the mutations to be in lower subclades K1a1 through K1a9. There are eight of those here. Five of them have 195C. The 195C group, which includes K1a9, makes up about 18% of the Ks on MitoSearch as opposed to 8.5% on Behar’s tree. I think the difference is the larger representation of British Isles, and especially Irish, origins in MitoSearch than in the populations used by Behar. Three of the five, at node 27061, have 16048A, which defines a large, mostly Irish cluster to which I have given the temporary name K1a10. K1a9, the third Ashkenazi subclade, is defined by 16524G. It never has 524 insertions. Neither K1a9 nor K1a10 ever has insertions at position 309, which is unusual, at least for K. A better chart from my most recent MitoSearch survey just for the K1a+195C group may be found here. Two new plain K1a’s are at the node marked #84379 with HVR mutations 16129A, 16T, 150T and 199C. There are quite a few of these in the FTDNA database. These are probably the same as the unlabeled group on Behar’s chart between K1a1 and K1a9. I have suggested the designation K1a11 for them. (There is another, more radical solution to the problems caused by the two clusters with 16048A and 16T. The K1a9 label would be moved up to the 195C branching point, with the current Ashkenazi subclade K1a9 relabeled K1a9a and the 16048A cluster designated as K1a9b. Then the 16T cluster could be K1a10 instead of K1a11.)

 

Up from mv4 is the large subclade K1a4a1, defined by four CR mutations and represented by seven different haplotypes, five with the 524 insertions and two without. This is the largest subclade which can’t easily be predicted from HVR results, since it has no defining HVR mutations. Two examples have 16261T; the jury is still out on its significance. Many sequences in K1a with 524 insertions, but not 195C, may be candidates for K1a4a1, as are those with 16245T.
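To make the idea of predicting a subclade from HVR results concrete, here is a rough Python sketch built only from the HVR markers named in this article. It is a set of hints, not a classifier: most subclades are properly defined by CR mutations, so only a full-sequence test can confirm an assignment. The function name hvr_hint and the order of the rules are my own assumptions, and K1a10 is the temporary name used in this article, not an official label.

```python
# Rough sketch only: HVR markers taken from the text above; the function
# name and rule ordering are assumptions, not an official prediction method.

def hvr_hint(mutations):
    """Suggest a likely K subclade from a list of HVR mutations."""
    m = set(mutations)
    if "497T" in m:  # 497T defines K1a
        if "16234T" in m:
            return "K1a1b1a"
        if "195C" in m:
            if "16524G" in m:
                return "K1a9"
            if "16048A" in m:
                return "K1a10 (temporary name)"
            return "K1a (195C group)"
        # No defining HVR mutation exists for K1a4a1, so this is only a flag.
        if "524i" in m or "16245T" in m:
            return "K1a, K1a4a1 candidate"
        return "K1a"
    if "498-" in m:  # 498 deletion is one of the K1c markers
        return "K1c2" if "16320T" in m else "K1c"
    if "16319A" in m:
        return "K1b1a"
    return "undetermined from HVR alone"


print(hvr_hint(["497T", "195C", "16524G"]))
print(hvr_hint(["497T", "524i"]))
```

The K1a4a1 branch deliberately returns "candidate" rather than a subclade name, since a K1a haplotype with 524 insertions and no 195C is only suggestive.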

 

A new subclade has shown up recently, to the right and down from mv4. K1a3a is defined by four CR mutations and represented by #78430. No examples have been tested so far from subclade K1a2 or from K1a5 to K1a8. All of those are defined only by CR mutations.

 

In general, the 42 tests are a fair representation of K as shown on Behar’s tree. The exceptions were discussed above. There are missing subclades, but consider that Behar’s tree was based on 121 full-sequence samples.

 

Anyone who has tested as a K is invited to join the K Project by clicking on the blue Join button on their FTDNA personal page. Additional information is available at our website.

 

©2007 William R. Hurst

Administrator, mtDNA Haplogroup K Project