November 24, 2017

Where Did I Come From? Understanding DNA Ethnicity Estimates

DNA ethnicity estimates are fun conversation-starters. But the “pie charts” become more meaningful genealogically when you can assign timelines to the places your ancestors were from. Here’s an update to our ongoing conversation about what DNA ethnicity results really mean.

Understanding DNA Ethnicity Estimates

Where did I come from? This is a fundamental human question, and it is driving millions of individuals all over the world to have their DNA tested. We genealogists would like to think that they are being tested to aid their family history efforts, or to connect with us, their cousins. But they aren’t. They are after that pretty pie chart that tells them what percentage of themselves came from where.

Now, I know you have heard me say that these kinds of results are just for fun and don’t hold much genealogical value. But thanks to some interesting developments in the world of DNA, my previous view that ethnic origins results are somehow second class to our match list might be changing.

Living DNA and DNA Ethnicity Results

A U.K. company called Living DNA launched their DNA product in the fall of 2016. Right now, all they are focusing on is reporting ethnic origins information. But they are doing it in a manner that changes the way we look at our DNA ethnicity results.

In addition to the standard map that you will see at any genetic genealogy company, Living DNA also offers a tool they call “Through History.” It literally takes you step-by-step back in time to show you how similar your DNA is to others on earth during 11 time periods ranging from 1,000 years ago to 80,000 years ago! In the images shown below, we see a glimpse into my earliest time period, a peek at the middle, and a view of the last. The intensity of the blue on the chart tells you how genetically similar I am to the people in that area.

In the first chart shown here, you can see that since I am 100% European, I share DNA with, well, people from Europe:

But, if we go back not very far, I am sharing DNA with people in the Middle East and Russia, as shown in the second map:

As my DNA marches further back in time I can see that I am sharing that DNA with people in a variety of locations until we get back to the beginning of man, and I am sharing DNA with literally everyone in the world.

DNA Ethnicity Estimates Over Time

So, how does this work from a DNA standpoint? Well, the fact is, not all DNA markers are created equal. Some markers have developed relatively recently on our timeline, making them helpful for determining recent relationships and modern populations. Others have been around longer, linking us to early settlers of Europe or even Asia. Still others link us together as a human race and help to track our origins back to a single time and place.

Part of the struggle that these DNA testing companies have is trying to figure out the time and place for each of the markers they test. Certainly part of the puzzle is the ability to look not just at modern day populations, but ancient populations.

You may have heard of some recent reports that scientists have completed DNA testing on ancient remains. One example came from Ireland, where they were able to determine that one body tested had ancestry in the Middle East, and another had roots in Russia. It is the combined efforts of both ancient DNA testing and your own modern samples that unite to help us improve our understanding of our own personal origins, as well as help us understand how humankind developed and evolved.

3 Ways to Better Understand Your DNA Ethnicity Estimates

To get the most out of your genetic genealogy populations report, you may want to:

  1. View your results in the context of a more historical timeline, as opposed to your own genealogical timeline.
  2. Try testing at multiple companies (you can transfer into Family Tree DNA from 23andMe or AncestryDNA for only $19. Click here to see recent updates to Family Tree DNA’s ethnicity categories.)
  3. Give the multiple population tools at Gedmatch a try, just to get a better feel for how different companies and tools can provide us a different look at the populations we are carrying around in our DNA. My quick guide for using Gedmatch, shown here, is available as a printed guide or digital download.

Keep Reading about DNA Ethnicity Estimates

“Results May Vary:” One Family’s DNA Ethnicity Percentages

Gedmatch: A Free Tool for Your DNA Results and Genealogy

New DNA Ethnicity Charts: Display Your Heritage

“Results May Vary:” One Family’s DNA Ethnicity Percentages

Four members of a family (mom, dad, daughter and son) tested with AncestryDNA. Their DNA ethnicity percentages vary. Why?

We talked recently about the limitations of the ethnicity results delivered by the DNA testing companies. We concluded that for the most part, these admixture results are like that short film before the actual feature presentation. That feature presentation in this case is your genealogical match list.

But, the short film is entertaining, and certainly keeps your attention for a while. It can be especially interesting when you have several members of the same family tested so you can compare their ethnicity results. Let’s consider the real AncestryDNA test results of a family we’ll call the “Reese family:”

[Image: DNA ethnicity results for the Reese family]

The Reese family has something to their great advantage: their family history indicates that they are mostly from Western Europe. This is a big advantage in the climate of today’s ethnicity results, as all of the testing companies have far more reference population data from Western Europe than from anywhere else. That means that in general, they are going to be better at telling you about your heritage from Ireland or England than they are at discovering that you are from China or India.

Looking at the ethnicity results for the Reese family, we are tempted to start applying our knowledge of DNA inheritance to the numbers we see. We know that each child should get half of their DNA from their mom, and half from their dad. So our initial reaction might be to look at the dad’s 40% Scandinavian and mom’s 39% Scandinavian and assume that the child would also be about 40% (20% from dad, 20% from mom). You can see that the daughter did in fact measure up to that expectation with 37% Scandinavian. But the son, with only 25%, seems to have fallen short. The temptation to consider the daughter as the far better example of familial inheritance is strong (especially for us daughters, who are so often exceeding expectations!), but of course inaccurate. It is actually very difficult to look at the parents’ numbers alone and estimate the percentages that a child will receive.
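To see why the parents’ numbers alone are a poor predictor, here is a toy simulation in Python. It is only a sketch under loud assumptions: each parent’s genome is modeled as two copies of 50 equal blocks, the labelled blocks are scattered at random, and the child receives one of the two copies of each block at random. The function names and the block count are invented for illustration. Real inheritance and real ethnicity estimation are far messier.

```python
import random

def make_parent(pct, n_blocks, rng):
    """Build two chromosome copies; True marks a block labelled with the region."""
    total = 2 * n_blocks
    labelled = round(total * pct / 100)
    blocks = [True] * labelled + [False] * (total - labelled)
    rng.shuffle(blocks)  # assumption: labelled blocks are scattered at random
    return blocks[:n_blocks], blocks[n_blocks:]

def pass_on(parent, rng):
    """Hand the child one of the parent's two copies of each block, chosen at random."""
    copy_a, copy_b = parent
    return [rng.choice((a, b)) for a, b in zip(copy_a, copy_b)]

def child_pct(dad_pct, mom_pct, n_blocks=50, rng=None):
    """Percentage of the child's genome labelled with the region (toy model)."""
    rng = rng or random.Random()
    from_dad = pass_on(make_parent(dad_pct, n_blocks, rng), rng)
    from_mom = pass_on(make_parent(mom_pct, n_blocks, rng), rng)
    return 100 * (sum(from_dad) + sum(from_mom)) / (2 * n_blocks)

def simulate(dad_pct, mom_pct, n_children=1000, seed=1):
    """Simulate many hypothetical children of the same two parents."""
    rng = random.Random(seed)
    return [child_pct(dad_pct, mom_pct, rng=rng) for _ in range(n_children)]

results = simulate(40, 39)  # the Reese parents' Scandinavian percentages
print("lowest:", min(results))
print("average:", sum(results) / len(results))
print("highest:", max(results))
```

In this toy model the average lands near the parents’ midpoint, but individual children are spread out around it. The model leaves out the testing company’s estimation errors entirely, so real siblings can differ by even more.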

Remember that these numbers are tied to actual small pieces of DNA we call SNPs (snips). Of the nearly 800,000 SNPs evaluated by your testing company, fewer than half are considered valuable for determining your ethnicity. The majority of the SNPs tested are working to estimate how closely you are related to your genealogical cousins. A good SNP for ethnicity purposes has to be ubiquitous enough to show up in many individuals from a given population, but unique enough to show up only in that population and not in any others. It is a difficult balance to strike.

But even when good SNPs are used, it is still difficult for the computer at your testing company to make accurate determinations about your ancestry. Take the Great Britain line in the Reese family data, for example. The dad has 21%, the mom 7%, so it would follow that the largest amount any child could have would be 28%. We see the daughter (of course!) falling well within that range at 10%, but the son is seven points above it at 35%. How does THAT happen?!
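The 28% ceiling can be worked out in a few lines of Python. This is a minimal sketch that assumes the simple picture used above: a parent passes on exactly half of their DNA, so a region covering a given percentage of the parent’s genome can add at most that many points to the child (capped at 50). The function name `child_range` is made up for this illustration.

```python
def child_range(dad_pct, mom_pct):
    """Return (lowest, highest) share of a region a child could carry, in percent."""
    def from_parent(pct):
        # A parent passes on half of their DNA: 50 of their 100 "points".
        low = max(0.0, pct - 50.0)  # labelled DNA that cannot be left out of that half
        high = min(50.0, pct)       # labelled DNA that can fit into that half
        return low, high

    dad_low, dad_high = from_parent(dad_pct)
    mom_low, mom_high = from_parent(mom_pct)
    return dad_low + mom_low, dad_high + mom_high

low, high = child_range(21, 7)  # the Reese parents' Great Britain percentages
son = 35
print("possible range:", low, "to", high)
print("son is above the ceiling by:", son - high)
```

Under these assumptions the son’s 35% should not be possible from the parents’ reported numbers, which points to the estimates themselves rather than to biology.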

Let’s go back to some basic biology. Remember that you have two copies of each chromosome, one from mom and one from dad. These chromosomes are made up of strings of letters denoting the DNA code. That means that at each SNP location you report two letters: one from mom and one from dad. As the testing company lines these letters up for comparison, it has to decide which letters go together, that is, which are from the same chromosome and came from one single source. To illustrate how this works, let’s say we are trying to write two sentences, “The brown dog ate the bone” and “A black cat scared a mouse,” where each word in each sentence represents a SNP. However, all the computer sees at each position are two words; it doesn’t know which word goes in which sentence. You can see in this example that it would be fairly easy to get it wrong. Mixing up a couple of the words creates entirely new sentences with very different meanings. The process of determining which set of values goes on which line is called phasing. Often the inconsistencies you see in your DNA test results, whether in the matching or in the ethnicity, are because of problems the company has with this very difficult process:

Phasing correct:

[Image: DNA phasing correct]

Phasing incorrect:

[Image: DNA phasing incorrect]
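The two-sentence analogy can also be played out in Python. The sketch below is purely illustrative and is not how any testing company phases DNA. It assumes the two words differ at every position, and it simply lists every way the unordered word pairs could be sorted into two sentences. The function name `possible_phasings` is invented for this example.

```python
from itertools import product

sentence_1 = "The brown dog ate the bone".split()
sentence_2 = "A black cat scared a mouse".split()

# What the computer sees: at each position, an unordered pair of words.
observed = [frozenset(pair) for pair in zip(sentence_1, sentence_2)]

def possible_phasings(observed_pairs):
    """Yield every way to split the unordered pairs into two sentences.

    Assumes the two words differ at every position.
    """
    ordered = [sorted(pair) for pair in observed_pairs]
    # Fix the first position so that swapping the two whole sentences
    # is not counted as a different answer.
    for flips in product([False, True], repeat=len(ordered) - 1):
        line_a, line_b = [ordered[0][0]], [ordered[0][1]]
        for (first, second), flip in zip(ordered[1:], flips):
            line_a.append(second if flip else first)
            line_b.append(first if flip else second)
        yield " ".join(line_a), " ".join(line_b)

phasings = list(possible_phasings(observed))
print(len(phasings), "possible ways to phase six words")
for line_a, line_b in phasings[:3]:
    print(line_a, "|", line_b)
```

Only one of the listed pairings is the correct one, and that is with just six words. A testing company faces the same kind of choice across hundreds of thousands of SNPs.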

While it is not valuable genealogically (for research purposes) to have children tested when both parents are tested, it is fun to compare notes to see who got what. And it can be an excellent way to involve members of the family who are curious about their genetics but not interested in genealogy, per se (like the “Reese” family’s son). We can also very tangibly see the fleeting nature of this DNA stuff. We can see that in one single generation this family has lost all traces of ancestry to several world regions. This really highlights the value of having the oldest generation of family members tested, to try to capture all that they have to offer in their DNA code.

Get more genetic genealogy help from one or more of my DNA Quick Guides. Try the “Getting Started” guide or the AncestryDNA guide (for autosomal tests like the Reeses’). Or save a bundle on the bundle pack: all 10 guides!

Come visit me at YourDNAGuide.com if you’re ready for a one-on-one consultation: consider me your travel guide for your DNA journey.

DNA Ethnicity Results: Exciting or Exasperating?


Are your DNA ethnicity results exciting, confusing, inconsistent, exasperating…or all of the above?

Recently Kate expressed on the Genealogy Gems Facebook page her frustration with her ethnicity results provided by AncestryDNA. She gets right to the point when she writes, “the way they refer to the results is confusing.”

Kate, you are not alone. Many genealogists have been lured into taking the autosomal DNA test at one of the three major DNA testing companies just to get this glimpse into their past. Remember that the autosomal DNA test can reveal information about both your mother’s side and your father’s side of your family tree. Many take the test hoping for confirmation of a particular ancestral heritage, others are just curious to see what the results will show. Though their purposes in initiating the testing may vary, the feeling of bewilderment and befuddlement upon receiving the results is fairly universal.

Kate has some specific questions about her results that I think most will share. Let’s take a look at a couple of them. First up, Kate wants to know if our family tree data in any way influences the ethnicity results provided. The answer is an unequivocal “no.” None of the testing companies look at your family tree in any way when determining your ethnicity results. However, the results are dependent on the family trees of the reference population. The reference populations are large numbers of people whose DNA has been tested and whose family history has been documented for many generations in that region. The testing companies compare your DNA to theirs, and that’s how they assign you to an ethnicity and a place of ancestral origin.

Next Kate asks, “Do they mean England when they report Great Britain?” Or to put it more broadly, how do these testing companies decide to divide up the world? All of the companies handle this a little bit differently. Let’s look at Ancestry as an example. When you log in to view your ethnicity results, you can click on the “show all regions” box below your results to get a list of all of the possible categories that your DNA could be placed in. These 26 categories include nine African regions, Native American, three Asian regions, eight European regions, two Pacific Island regions, two West Asian regions, and then Jewish, which is not a region, per se, but a genetically distinct group.

Clicking on each individual location in the left sidebar will bring up more information on the right about that region. For example, clicking on Great Britain tells us that DNA associated with this region is primarily found in England, Scotland, and Wales, but is also found in Ireland, France, Germany, Denmark, Belgium, Netherlands, Switzerland, Austria, and Italy. Basically, this is telling us that people with generations of ancestry in Great Britain are quite a genetic mix from many areas.

The first chart here shows that if we were to test the DNA of 100 natives of one of these primary regions (England, Scotland or Wales), then 50 of them would have the Great Britain “pattern” of DNA covering 60% or more of their entire genome, and 50 of them would have that pattern in less than 60% of their DNA. The fact that this halfway number is so low, only 60%, tells us that there is a lot of uncertainty in this ethnicity estimate because there is so much mixture in this region. Kate, for you that means that when you see Great Britain in your ethnicity estimate, it could mean England, or maybe it means Italy. Ancestry can’t be certain.

But that uncertainty isn’t the same for every region. Pictured here is also the ethnicity chart for Ireland. You can see that half the people who are native to Ireland will have 95% or more Irish DNA. Kate, for you this means that if you have Irish DNA in your results, you can be pretty certain it came from Ireland. From these tables you can see that your membership in some regions is more robust than in others, and Ancestry is using these tables to try to help us tell the difference.
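The “halfway number” in these charts is what statisticians call a median. Here is a small Python sketch of the idea. The two lists of percentages are invented purely for illustration and are not Ancestry’s data. They are chosen only so that their medians match the 60% and 95% figures quoted above.

```python
from statistics import median

# Hypothetical results for ten "natives" of each region (made-up numbers).
great_britain_natives = [22, 41, 48, 55, 60, 60, 67, 74, 83, 96]
ireland_natives = [71, 88, 92, 94, 95, 95, 97, 98, 99, 100]

def halfway_number(percentages):
    """Half the group scores at or above this value, half at or below it."""
    return median(percentages)

print("Great Britain halfway number:", halfway_number(great_britain_natives))
print("Ireland halfway number:", halfway_number(ireland_natives))
```

The lower the halfway number, the more often a true native of the region shows only a modest share of that region’s DNA pattern, and the less that label can tell you on its own.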


In the end, the ethnicity results reported by each DNA testing company are highly dependent on two factors: the reference populations they use to compare your DNA against, and the statistical algorithms they use to compute your similarities to these populations. Every company is doing both of these things just a little bit differently.

Kate, if you want to get another take on your ethnicity results, you can take your data over to Family Tree DNA, or you can be tested at 23andMe. A free option is to head over to Gedmatch and try out their various ethnicity tools. If you need help downloading and transferring, you can head over to my website: http://www.yourdnaguide.com/transferring. Most people have found after searching in multiple places that their “true” results are probably somewhere in the middle.

While these ethnicity results can be interesting and useful, for most they will just be a novelty; something interesting and exciting. I have found that their most useful application is acting like a fly on a fishing line. They attract our family members into DNA testing where we can then set the hook on the real goal: family history.

If you’re ready to bait your own hook, I recommend you check out my series of DNA quick guides. These laminated guides will help you choose the right DNA tests for your genetic genealogy questions. You’ll become a smart shopper, more prepared to choose the testing company that’s right for you. And you’ll be prepared to maximize your results from each company, rather than look at them blankly and wonder what the heck you just spent that money on. Click here to see all my DNA guides: I recommend the value-priced bundle!