FTDNA Genetic Distance


FTDNA Genetic Distance & Family Group Assignments

Revised 2 August 2005
GKBopp

Bottom Line:
FTDNA uses more than one method to calculate genetic distance. It uses a conservative method when sending match notifications and listing matches on a participant's personal page. Its FTDNATiP™ reports and its ground rules for Interpreting Genetic Distance use a different method, and that is the method your group administrator uses to help determine the likelihood that a participant belongs in a family group.

In a group project the decision to place participants in a family group is made by the project administrator - it is not something that can be done by Family Tree DNA (FTDNA). The following discusses genetic distance calculations and explains why some participants have no matches on their personal page at FTDNA but have been placed in a family group by the project administrator. This discussion uses results from 25 marker tests but the basic concept applies to additional markers. The results of three participants in the English Surname DNA Project (referred to by the first names of their earliest ancestors) are used in the explanation. This discussion does not address the meaning of matches and time to most recent ancestor (see Results - Meaning).

In the example below, the English surname project administrators have placed George in the family group shared by Aaron and Sampson. However, George did not receive any match notifications, no matches appear on his results page, and he does not appear on the personal pages of Aaron and Sampson.

CHART 1 - Results
The official English Web Site is at http://www.englishdna.com/

Surname is ENGLISH. The first name of each participant's earliest known ancestor is used below.

Participant Code ==>        E 1             E 2               E 5
Earliest known ancestor     Aaron 1828 AL   Sampson 1790 NC   George 1801 NC
Haplo                       I               I                 I

Locus   DYS #     Aaron   Sampson   George
  1     393        13       13        13
  2     390        22       22        22
  3     19         14       14        14
  4     391        10       10        10
  5     385a       14       14        14
  6     385b       14       14        14
  7     426        11       11        11
  8     388        14       14        14
  9     439        11       11        11
 10     389-1      12       12        12
 11     392        11       11        11
 12     389-2      26       28        29
 13     458        14       14        14
 14     459a        8        8         8
 15     459b        9        9         9
 16     455         8        8         8
 17     454        11       11        11
 18     447        23       23        23
 19     437        16       16        16
 20     448        21       21        21
 21     449        29       29        32
 22     464a       12       12        12
 23     464b       15       15        15
 24     464c       16       16        16
 25     464d       16       16        16

The three above participants have been assigned to English Family Group 1 but do not share a paper trail at this time. There is not enough information yet to estimate the ancestral haplotype. The markers in white blocks indicate values that differ from the most frequent value. According to FTDNA, the red markers show a faster mutation rate than average. However, managers of large surname studies report variations that differ from those in a mixed random group.

None of the above has a 25 marker match at FTDNA with a surname other than English; however, E2 has several non-surname matches with the first 12 markers.

 

GENETIC DISTANCE

It is expected that in a surname study, a family group will have members whose markers don't match exactly due to mutations. However, what those mutations really mean is not yet known. This field is in its infancy - we are all pioneers - and much is subject to change as more is learned. This is especially true when discussing mutations and using the terms "match/es" and/or "genetic distance." There is more than one way to determine genetic distance and this causes a lot of confusion for participants.

INFINITE ALLELE Method - any difference counts as one

One can look at Chart 1 and see that Aaron and Sampson match on all but one marker - they have a 24/25 match. One can also see that George matches each of them on all but two markers - a 23/25 match. In the infinite allele method these matches are expressed as a genetic difference of 1 between Aaron and Sampson (25-24=1) and a genetic difference of 2 between George and each of the others (25-23=2).

Although the DYS # 449 allele values for Aaron and Sampson are 29 and George's is 32, in the infinite allele method that difference is counted as one.

When this method is used, the terms matches and mis-matches tend to be used rather than the term genetic distance.
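The infinite allele count can be sketched in a few lines of Python. This is only an illustration of the counting rule described above, using the Chart 1 values; the names AARON, SAMPSON, GEORGE and mismatches are made up for this sketch and are not anything FTDNA publishes.

```python
# Chart 1 allele values, locus 1 through 25, copied from the chart.
AARON   = [13, 22, 14, 10, 14, 14, 11, 14, 11, 12, 11, 26, 14,
           8, 9, 8, 11, 23, 16, 21, 29, 12, 15, 16, 16]
SAMPSON = [13, 22, 14, 10, 14, 14, 11, 14, 11, 12, 11, 28, 14,
           8, 9, 8, 11, 23, 16, 21, 29, 12, 15, 16, 16]
GEORGE  = [13, 22, 14, 10, 14, 14, 11, 14, 11, 12, 11, 29, 14,
           8, 9, 8, 11, 23, 16, 21, 32, 12, 15, 16, 16]


def mismatches(a, b):
    """Infinite allele count: every marker that differs adds exactly one,
    no matter how large the difference in allele values is."""
    return sum(1 for x, y in zip(a, b) if x != y)


print(mismatches(AARON, SAMPSON))   # 1  (a 24/25 match)
print(mismatches(GEORGE, AARON))    # 2  (a 23/25 match)
print(mismatches(GEORGE, SAMPSON))  # 2  (a 23/25 match)
```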

STEP-WISE Method - exact differences are counted

In Chart 1, the DYS # 449 allele values (the numbers) for Aaron and Sampson are 29 and George's is 32. In the step-wise method, that difference is counted as three (32-29=3). (See Note 1.)
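As a small sketch of the step-wise rule, the Python below takes the difference in allele values on a single marker. Only the two markers on which these three men differ are listed (George is 32 at DYS # 449; Aaron and Sampson are 29). The names CHART_1 and step_difference are invented for the illustration.

```python
# Chart 1 values for the two markers on which the participants differ.
# The other 23 markers are identical for all three, so they would add 0.
CHART_1 = {
    "Aaron":   {"389-2": 26, "449": 29},
    "Sampson": {"389-2": 28, "449": 29},
    "George":  {"389-2": 29, "449": 32},
}


def step_difference(marker, first, second):
    """Step-wise difference on one marker: the gap between the two allele values."""
    return abs(CHART_1[first][marker] - CHART_1[second][marker])


print(step_difference("449", "George", "Aaron"))     # 3, that is 32 - 29
print(step_difference("449", "Aaron", "Sampson"))    # 0, both are 29
print(step_difference("389-2", "Aaron", "Sampson"))  # 2, that is 28 - 26
```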

FTDNA'S Genetic Distance Calculations (see Note 2)

FTDNA Notifications and Matches on Personal Pages - The "step-wise" method is used on most of the markers; however, the "infinite allele" method is used on a few others. (See http://www.ftdna.com/trs_gendist.html and the examples at Genetic Distance - FTDNA Calculation.)

This "hybrid" calculation is used by FTDNA to determine who will appear as a match on a participant's personal page and when email notifications are sent. It is also referred to in FTDNATiP™ reports and appears on the group administrator management reports. In the case of the participants in Chart 1, this method results in:

A genetic distance between Aaron and Sampson of 2.
A genetic distance between George and Sampson of 4.
A genetic distance between George and Aaron of 6.
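The three figures above can be reproduced with a rough sketch of a "hybrid" calculation. This is not FTDNA's actual formula. In particular, the set of markers treated by the infinite allele rule (ASSUMED_INFINITE_ALLELE below) is only a placeholder chosen for illustration; the FTDNA page linked above is the place to look for the real list. For these three men the choice does not change the result, because they differ only on DYS # 389-2 and DYS # 449, which are counted step-wise here.

```python
MARKERS = ["393", "390", "19", "391", "385a", "385b", "426", "388", "439",
           "389-1", "392", "389-2", "458", "459a", "459b", "455", "454",
           "447", "437", "448", "449", "464a", "464b", "464c", "464d"]
AARON_VALUES = [13, 22, 14, 10, 14, 14, 11, 14, 11, 12, 11, 26, 14,
                8, 9, 8, 11, 23, 16, 21, 29, 12, 15, 16, 16]

# Chart 1: Sampson and George differ from Aaron only on the markers shown.
AARON = dict(zip(MARKERS, AARON_VALUES))
SAMPSON = {**AARON, "389-2": 28}
GEORGE = {**AARON, "389-2": 29, "449": 32}

# ASSUMPTION: placeholder set of markers counted by the infinite allele rule.
ASSUMED_INFINITE_ALLELE = frozenset({"464a", "464b", "464c", "464d"})


def hybrid_distance(a, b, infinite_allele=ASSUMED_INFINITE_ALLELE):
    """Step-wise on most markers; any difference counts as one on the rest."""
    total = 0
    for marker in MARKERS:
        gap = abs(a[marker] - b[marker])
        total += min(gap, 1) if marker in infinite_allele else gap
    return total


print(hybrid_distance(AARON, SAMPSON))   # 2
print(hybrid_distance(GEORGE, SAMPSON))  # 4
print(hybrid_distance(GEORGE, AARON))    # 6
```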

Because FTDNA uses this more conservative calculation to determine when notices are sent and matches are posted on personal pages, George does not appear as a match on the personal pages of Aaron and Sampson (nor do they appear on his page) and no notices were sent. (See Note 3.)

FTDNATiP™ - While these reports refer to genetic distance based on the above calculation, they use the infinite allele method when referring to the number of matches/mismatches. Hence, on the TiP reports Aaron and Sampson have "1 mismatch" with each other and "2 mismatches" with George. In the case of George's FTDNATiP™ comparison to Aaron and Sampson, in addition to reporting the "2 mismatches" the report also mentions the genetic distance of 6 from Aaron and 4 from Sampson.

Unfortunately, although this method reports only "2 mismatches," George cannot see this information: the more conservative calculation keeps Aaron and Sampson off George's personal page, so he has no access to the TiP report. (See Note 4.)

FTDNA Ground Rules for Interpreting Genetic Distance - FTDNA's ground rules for interpreting genetic distance use the infinite allele method. According to those ground rules, George is "probably related" to Aaron and Sampson. In other words, based on the ground rule chart, George should consider his 23/25 "match" (or his "2 mismatches") as a "distance" of 2. (See Interpreting Genetic Distance at http://www.familytreedna.com/gdrules_25.html )

Chart 2
SUMMARY OF INFORMATION FOR THESE PARTICIPANTS
Aaron (E 1) - Sampson (E 2) - George (E 5)

Method ==>          Infinite Allele   FTDNATiP™      Ground Rules   "Hybrid" Genetic   Personal Page Match
Participants        (Chart 1)                                       Distance           & Notice Sent

George & Aaron      23/25             2 mismatches   Distance 2     6                  No
George & Sampson    23/25             2 mismatches   Distance 2     4                  No
Sampson & Aaron     24/25             1 mismatch     Distance 1     2                  Yes

GROUP ADMINISTRATORS DETERMINE FAMILY GROUPS - FTDNA DOES NOT

FTDNA does not assign participants to family groups. Only project administrators do this. In this case, the group administrator has placed George into Aaron's and Sampson's family group because the genetic distance based on the infinite allele method is only two, and this is the method used in FTDNA's ground rules for 25 markers when participants share the same surname. According to those ground rules, George is "probably related" to Aaron and Sampson. In other words, when using the ground rule chart, George should consider his 23/25 "match" as a genetic distance of 2.

The administrator also knows that although none of these men share a paper trail at this time, George's earliest known ancestor was born ca. 1801 in NC and that one of his 23/25 matches also has an ancestor born in NC ca. 1790. The same locale and close time frame also make it likely that the two are somehow related.

=================
Notes

1. This field is still too new to know whether a difference of 3 on a marker (in this case DYS # 449) tells us that the mutation happened in one transmission event (birth) or in two or three. If it happened in two or three, the current information is that the shared common ancestor lived much, much further back in time than if it happened in one event (birth). Researchers are trying to establish whether some markers mutate faster than others. For example, FTDNA's information indicates that DYS 449 does mutate faster than some of the other markers. Group administrators in other studies report that the mutation rate may vary within actual families.

2. Contributing to the confusion is that when FTDNA uses the term genetic distance, they are usually referring to the results of their more complex "hybrid" calculation. However, that is not always the case; sometimes they use the term genetic distance or "distance" when using the infinite allele method (for example, in their ground rules for Interpreting Genetic Distance at http://www.familytreedna.com/gdrules_25.html).

3. As of July 2005, matches listed (and notices emailed) by FTDNA are based on the genetic distance determined by FTDNA's formula (not the infinite allele method). Mailings are sent based on the following:

12 markers - genetic distance of 0 and genetic distance of 1 if surname is the same.
25 markers - genetic distance of 0 - 2.
37 markers - genetic distance of 0 - 4.

Reminder: Matches, notifications, etc. appear only for participants who have signed release forms and who are not restricted only to others in their project (a code participants must change on their personal preferences page in order to see matches with the entire FTDNA database - of all others who have signed a release form and have removed the restrict code at their personal page).
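The July 2005 mailing thresholds listed in this note can be written as a small rule, sketched below in Python. The function name notice_sent and its arguments are invented for the illustration, and the sketch ignores the release-form and restriction settings described in the reminder.

```python
def notice_sent(markers_tested, genetic_distance, same_surname):
    """Return True if a match notice would go out under the July 2005
    thresholds listed in this note (genetic distance per FTDNA's formula)."""
    if markers_tested == 12:
        # Distance 0 always; distance 1 only when the surname is the same.
        return genetic_distance == 0 or (genetic_distance == 1 and same_surname)
    limits = {25: 2, 37: 4}  # maximum genetic distance for a notice
    if markers_tested not in limits:
        raise ValueError("this note lists thresholds only for 12, 25 and 37 markers")
    return genetic_distance <= limits[markers_tested]


print(notice_sent(25, 2, True))  # True  - Sampson & Aaron
print(notice_sent(25, 4, True))  # False - George & Sampson
print(notice_sent(25, 6, True))  # False - George & Aaron
```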

4. FTDNA does provide a report to group administrators that compares each participant's genetic distance to everyone in the project and includes links to TiP data. However, because George's genetic distances (based on the "hybrid" method) to Aaron and Sampson were more than 2, he has no access to the TiP report for these participants. George, and the others, would never know of their likely relationship without the assistance of the project administrator. FTDNA is aware that this is a problem and hopes to find a way to resolve it in the future.

See also:
http://www.ftdna.com/trs_gendist.html (off site)
http://nitro.biosci.arizona.edu/ftDNA/Distance.html (off site)
Genetic Distance - FTDNA Calculation
Meaning Of Matches And Near-Matches Of Test Results
Interpreting Genetic Distance - http://www.familytreedna.com/gdrules_25.html (off site)
http://blairgenealogy.com/dna/FTDNATiP.html (off site)
