Getting the Most Out of Autosomal DNA Analysis to Find Relatives

John Allred
Dublin, Ohio

Autosomal DNA analysis has become very popular during the last few years, thanks in large measure to marketing by the three firms that perform almost all tests of this type: Family Tree DNA (“Family Finder”), AncestryDNA, and 23andMe. These companies give you all kinds of reasons to have your autosomal DNA tested but, for genealogists, there is only one reason: find family! You can increase your success rate by a very simple act: share your family’s ancestral paper trail online, connected to your autosomal DNA results, and hope that others share their information with you.

Members of our Allred family have had remarkable success using Y-chromosome DNA to confirm kinships and solve mysteries. But Y-chromosome data are relatively easy to interpret. It helps that the Y-chromosome, like the family surname, is passed from father to son, so the Allred name is itself a major clue. Interpretation of Y-chromosome results is simple: if you have enough markers analyzed (at least 25), you can compare your numbers with those of other Allred men. If they match or almost match, you have a relatively recent common ancestor; if they are not a close match, you don’t. Autosomal DNA is not like that. There are numerous false positives (in which you think you are related but you are not) and false negatives (in which you think you are not related but you are). While connecting Y-chromosome results to a paper trail is helpful in establishing a relationship, with autosomal DNA it is absolutely essential. My purpose here is to help you connect your autosomal DNA to your pedigree so that others to whom you may be related can find you. I also want to show how you can use the available information to find real matches and rule out “false positives” and “false negatives”.

First, let me explain the difference between Y-chromosome DNA and autosomal DNA. Humans have 23 pairs of chromosomes located in the nucleus of each cell. One pair is called the “sex chromosomes”, labelled XY in males and XX in females. The other 22 pairs are called “autosomes” and the DNA in them is called “autosomal DNA”. In these 22 pairs of autosomal chromosomes, half of the DNA came from the father and the other half came from the mother. In theory, 25% of the DNA came from each of the four grandparents, 12.5% from each great-grandparent, etc. In practice, however, as you go back through the generations, the actual amount inherited from each ancestor can be quite variable because of a process known as “recombination”, which shuffles the DNA each time it is passed on. That is one reason that autosomal DNA is almost impossible to interpret without additional genealogical information.
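The "in theory" percentages above follow from halving once per generation. A minimal Python sketch of that arithmetic is shown here; the function name and the list of labels are illustrative only, and the figures are averages, not what any one person actually inherits.

```python
def expected_share(generations_back: int) -> float:
    """Average fraction of autosomal DNA from one ancestor.

    generations_back = 1 for a parent, 2 for a grandparent, and so on.
    This is the theoretical average only; recombination makes the
    real amount vary from person to person.
    """
    return 0.5 ** generations_back


labels = ["parent", "grandparent", "great-grandparent"]
for generations_back, label in enumerate(labels, start=1):
    print(f"{label}: {expected_share(generations_back):.1%}")
```

Running it prints 50.0%, 25.0% and 12.5%, the same figures quoted in the paragraph.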

It is clear that autosomal DNA analysis works better for confirming a relationship than for finding one. ISOGG (International Society of Genetic Genealogy) puts it this way: “Autosomal DNA tests can be used to confirm relationships with a high level of accuracy for parent/child relationships and all relationships up to the second cousin level. For all relationships other than parent/child relationships additional contextual and genealogical information is required to confirm the nature of the relationship.” (see http://isogg.org/wiki/Autosomal_DNA). Suppose a body was found and police suspected that the dead person was a very close relative (child, sibling or first cousin) of Jane Doe. An autosomal DNA test would quickly confirm or refute the relationship with very close to 100% accuracy. But if the police had no idea of the identity of the dead person, using DNA to find out who he was would be an almost impossible task in the absence of any other information about the victim’s family.

Most of us, most of the time, are looking to find relatives, not just to confirm what we already know or think we know. For those of us using autosomal DNA to find relatives, there are ways to maximize our chances of locating someone related to us. Notice that the statement above from ISOGG emphasizes the importance of additional genealogical information for autosomal DNA analysis to be effective in finding relatives. The key is this: put your genealogical pedigree, connected to your autosomal DNA, online where others who have had their autosomal DNA tested can find you. If we all do that, it will be a lot easier to find each other.

One way you can link your genealogy data to your autosomal DNA results, regardless of which company you used for your autosomal DNA analysis, is through a free web site called GEDMatch. [How to do that is described in the TEXT BOX.] In fact, GEDMatch is the only way for those who used AncestryDNA or 23andMe for their autosomal analysis to connect their results with their family’s genealogy tree. Those who used Family Tree DNA should load their family history (GEDCOM file) into GEDMatch, but they should also use the Family Tree DNA website to connect their DNA results with their family tree as described below.

After you have loaded your DNA data and your GEDCOM into the GEDMatch system, go to the GEDMatch site (https://www.gedmatch.com) and log in. On the home page that pops up, go to the section “Analyze your data” and click on “One-to-many-matches”. Click on your name or enter your GEDMatch kit number, select “Autosomal” and then “Display results”. After a short wait, you are confronted with an incomprehensible table with 2000 names and email addresses. Look specifically at the heading “Autosomal” in the center of the page and then at the column marked “Total cM”. These cM[1] values decrease as you move down the page. If the first few rows have cM values in the 1000s, you should know the person because he or she is your parent, grandparent, sibling, aunt/uncle or double first cousin! If the cM value is in the hundreds, the person is very likely your first or second cousin. If the cM value is below 100, as most of them are, the person may or may not be related to you. Back on the GEDMatch home page, you will find “One-to-one compare”. With it, you can compare your kit with that of another person to see a graphic depiction of which chromosomes, if any, you match on with segments longer than 7 cM. Note, however, that knowing which chromosome a match occurs on tells you nothing about whether or not you are related.
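The rule of thumb above for reading the “Total cM” column can be written as a short Python sketch. The function name is hypothetical, and the exact boundaries (1000 and 100) are an assumption drawn from the phrases “in the 1000s”, “in the hundreds” and “below 100”; this is a rough reading aid, not a GEDMatch feature.

```python
def interpret_total_cm(total_cm: float) -> str:
    """Rough reading of a Total cM value, following the article's rule of thumb."""
    if total_cm >= 1000:
        # Values in the 1000s: someone you should already know.
        return "close relative (parent, grandparent, sibling, aunt/uncle or double first cousin)"
    if total_cm >= 100:
        # Values in the hundreds.
        return "very likely a first or second cousin"
    # Values below 100, as most of them are.
    return "may or may not be related; check the paper trail"


for value in (1750, 420, 35):
    print(value, "cM:", interpret_total_cm(value))
```

The three sample values are made up for illustration; substitute the numbers from your own one-to-many table.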

More information can be obtained by clicking on “GEDCOM + DNA matches” under the heading “Genealogy” on the GEDMatch home page. Put in your GEDMatch kit number and click “search”. A very large table pops up, sorted with the largest shared cM at the top, with values ranging from 2000 down to 10. Click on the number under the heading “GEDCOM ID” and you can see the person’s pedigree or descendants. You can look through their pedigree for one of your ancestral names or use the “search” function to type in one of your ancestral names. As you can see, this is a long, laborious task because there are a lot of names and GEDCOM IDs in this table. If you do this, start at the top because the probability of a relationship with you decreases as you go down the list.

If you are using Family Tree DNA (“Family Finder”), you are given the opportunity to upload your paper trail (GEDCOM file) under the heading “Family Tree” as shown in the TEXT BOX. It is also very important to list your family surnames under your profile. If you go to your results page on Family Tree DNA, you will see why both of these are important. Click on “Matches” under “Family Finder” and a table of information comes up. On the left are names of possible matches. On the far right, you will find “ancestral surnames”. If you find one or more of your ancestral names there, this may be a clue that you are related by way of one or more of these families. To check further, click on the small blue icon that looks like a pedigree chart next to the person’s name to see their paper trail. Can you find a common ancestor with them? If so, you are related and you should be able to quickly find how you are related (1st cousin? 2nd cousin? 3rd or 4th cousin?). If you do not find a common ancestor, does this mean that you are not related? Not necessarily, because they may not have shared all of their information or you may be dealing with a very common name (e.g., such common names as Smith or Jones may come from many different DNA families that are not related within a genealogical time frame). You can also find other potential relatives by typing a family name into the search engine, labelled “Search name or ancestral surname”. This will bring all of the people on the list with that surname somewhere in their pedigree to the top of the “ancestral surname” column.

You may find it most frustrating if the “ancestral surnames” column is blank, which means the person did not share an ancestral surname list on their profile. Equally frustrating is when the small “pedigree icon” next to their name is grey, not blue. This means that they did not include their family tree. In either case, without an ancestral surname list or a family tree pedigree, you cannot be sure you are related, much less figure out how.
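If you keep your own surname list and copy a match’s list from the “ancestral surnames” column, the comparison described above amounts to finding the names the two lists have in common. Here is a minimal Python sketch; the function name and the sample lists are hypothetical, and it is not part of the Family Tree DNA website. A shared surname is only a clue, for the reasons given above.

```python
def shared_surnames(mine, theirs):
    """Return surnames that appear in both lists, ignoring case and stray spaces."""
    def normalize(names):
        return {name.strip().casefold() for name in names if name.strip()}

    return sorted(normalize(mine) & normalize(theirs))


my_names = ["Allred", "Smith", "Jones"]
match_names = ["SMITH ", "Brown", "allred"]   # made-up example list
print(shared_surnames(my_names, match_names))

# A match who left the surname list blank gives you nothing to compare.
print(shared_surnames(my_names, []))
```

The second call returns an empty list, which is the situation described in the paragraph above: with no shared information you can neither confirm nor rule out a relationship.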

How many 1st through 5th cousins do you have? How many of those cousins can you expect to find with autosomal DNA analysis? Since autosomal DNA is diluted by half each generation, the furthest back you can hope to find relationships is the 5th cousin level. If you had all of the information on your relatives, you could count how many cousins that would be, but almost all of us must rely on estimates. Simple calculations (https://www.quora.com/What-is-the-average-number-of-DNA-relatives-for-a-23andMe-customer) would show you have 729 relatives back to your 5th cousin if you assume that 3 children reproduce each generation. That number jumps to 4096 if you assume 4 children reproduce each generation. The real number is likely less than 4096 because families have gotten smaller during the last century. Before that, families were larger but fewer children survived long enough to reproduce.
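The two figures above can be reproduced with a one-line calculation. The sketch below assumes the estimate is simply the number of reproducing children raised to the sixth power, six being the number of generations back to the ancestors you share with a 5th cousin. That exponent is an assumption chosen because it matches the quoted numbers; the cited calculation may be framed differently.

```python
def relatives_through_fifth_cousin(children_per_generation: int,
                                   generations: int = 6) -> int:
    """Simple estimate: children per generation raised to the number of generations."""
    return children_per_generation ** generations


for children in (3, 4):
    print(children, "children per generation:",
          relatives_through_fifth_cousin(children))
```

With 3 children the result is 729 and with 4 children it is 4096, matching the numbers in the text.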

The International Society of Genetic Genealogy (see http://isogg.org/wiki/Cousin_statistics) used an even higher number of total 1st through 5th cousins but found that far fewer than that could actually be detected with the methods that the three major testing firms use. The “Chance of Detecting” column in Table 1 shows the chances of detecting[2] a cousin from autosomal analysis. Note that the chance of detection diminishes substantially by the 5th cousin. This table means that if you had 100 actual fifth cousins standing in front of you, your autosomal DNA would match only 15 of them! Thus, even if you have 4,700 5th cousins, the analysis would detect only about 700 of them.

The total detectable from 1st through 5th cousins would be 1,346 if all of your cousins were tested. But obviously not everyone has had their autosomal DNA analyzed. In fact, as of November 2016, only about 4 million people are in the databases of 23andMe (1,200,000), Family Tree DNA Family Finder (275,000) and AncestryDNA (2,500,000). This is 4 million out of a total of 326 million Americans, or about 1 in 80 people. If the Allred family is typical in the proportion of its members who have had autosomal DNA analysis (and there is no reason to think we are not), this means that you have only about 18 1st through 5th cousins who can be detected by autosomal analysis and have been tested. Both GEDMatch and Family Finder give you thousands of names, but the statistical probability is that you are related to only a few of those.
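The arithmetic in this paragraph can be checked with a short Python sketch using the “Detectable” figures from Table 1 and the 1-in-80 testing rate. The variable names are illustrative. One assumption is labeled in the code: to match the table’s last column, each row is rounded to the nearest whole cousin but never below 1.

```python
# (cousin level, detectable number of cousins) taken from Table 1
DETECTABLE = [("1st", 8), ("2nd", 38), ("3rd", 170), ("4th", 430), ("5th", 700)]

ONE_IN = 80  # about 4 million tested out of 326 million people

total_detectable = sum(count for _, count in DETECTABLE)

# Assumption: round each row, with a floor of 1, to mirror the table's last column.
tested_per_level = [max(1, round(count / ONE_IN)) for _, count in DETECTABLE]

print("Total detectable:", total_detectable)
print("Detectable and tested, by level:", tested_per_level)
print("Detectable and tested, total:", sum(tested_per_level))
print("Without per-row rounding:", round(total_detectable / ONE_IN, 1))
```

Summing the rounded rows gives the 18 quoted above; dividing the total of 1,346 by 80 directly gives a little under 17, so “about 18” is best read as a rough figure.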

Bottom line: Autosomal DNA analysis works best to confirm that you are related to a specific, known person. Finding relatives with autosomal DNA data is much more difficult, especially those who are more than three generations back. Autosomal DNA analysis can effectively detect a relative back to the 3rd cousin level, but recognize that most of your relatives have not even been tested. We can do little to affect the percentage of our family who get an autosomal DNA analysis, but we can improve the usefulness of the results of those who have. Remember, getting the DNA results is only Step 1. For those results to help us understand our genealogy, Step 2 must be to share your autosomal DNA results online, connected to your family tree.

Table 1

Cousins   Expected Number of Cousins   Chance of Detecting (%)   Detectable Number of Cousins   Detectable and Tested
1st       8                            100                       8                              1
2nd       38                           100                       38                             1
3rd       190                          90                        170                            2
4th       940                          46                        430                            5
5th       4,700                        15                        700                            9