Advances in DNA technology over the last decade or so have made it easier and more reliable to match people with their relatives. If you have taken a DNA test, you may have already discovered that you have hundreds of cousins at various degrees of separation. So what do first, second, third and fourth cousins mean? And why are some of them “once” or “twice” removed when you are building a family tree? In this article, I will break down the myths and facts of finding DNA relatives online once you receive your DNA results.
What are the chances of two people having the same DNA?
I think it is important to answer this question first, because it explains how DNA is shared between siblings and other members of the family. The short answer is that only identical twins can share the same DNA, and even that is not guaranteed. You typically inherit 50% of your autosomal DNA from each of your parents. You inherit mitochondrial DNA from your mother, and this is how your mitochondrial haplogroup is determined. If you are male, you have a Y chromosome, which is inherited from your father. DNA matches (in %) are calculated using the autosomal DNA inherited from your parents. So, as the figure below shows, the maximum you can inherit from either biological parent is 50%.
Recent studies have shown that even identical twins carry 10-15 mutations/variations that differ between them. This suggests that identical twins may not have exactly the same DNA, even though the same genetic information is passed on from their parents.
According to a study published by AncestryDNA, if you are of British descent, you are likely to have an average of 193,000 cousins (5 first cousins, 28 second cousins, 175 third cousins, 1,570 fourth cousins, 17,300 fifth cousins, and 174,000 sixth cousins). Definitely too many cousins! Here is another crazy data point: for 8th cousins, this number can go up to 590,000.
|Generation||DNA match (in %)|
|Parents||50|
|Grandparents||25|
|Great grandparents||12.5|
|Great Great grandparents||6.25|
|Great Great Great grandparents||3.125|
The bottom line: with every generation you go back, the share of DNA you carry from any single ancestor is halved. You inherit 50% of your genetic code from each parent, about 25% from each grandparent, and about 12.5% from each great grandparent. This brings us to the question of what percentage is shared between cousins. The matrix below gives the percentages of DNA shared between cousins, in case you are interested. You share ~0.2% of your DNA with a fourth cousin.
Having distant DNA relatives can help you find new genealogical information about your family tree. All cousins share a common ancestor, and that is the basis for determining a fourth cousin DNA match. But you must be wondering: how accurate are the percentages reported by genealogy companies? Are these really my DNA relatives? If you have taken a test from 23andMe, the chance of finding a given fourth cousin is around 45%. If you have taken a test from AncestryDNA, the chance of finding a given fourth cousin is 71%. As you can imagine, the more distant the relationship (the more generations back the common ancestor is), the less DNA you share and the lower the chance of detecting that relative.
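If you prefer to see the arithmetic, here is a minimal Python sketch of the halving rule. The function names `ancestor_share` and `cousin_share` are my own, and the cousin formula assumes full cousins who descend from one shared ancestral couple. These are expected averages only; the amount of DNA two real relatives share varies around these values.

```python
def ancestor_share(generations_back: int) -> float:
    """Expected percent of autosomal DNA inherited from one ancestor.

    generations_back is 1 for a parent, 2 for a grandparent, and so on.
    """
    if generations_back < 1:
        raise ValueError("generations_back must be at least 1")
    # Each generation halves the expected contribution of a single ancestor.
    return 100 / 2 ** generations_back


def cousin_share(degree: int) -> float:
    """Expected percent of autosomal DNA shared by two full cousins.

    degree is 1 for first cousins, 2 for second cousins, and so on.
    Assumption: the cousins share one ancestral couple (two common ancestors).
    """
    if degree < 1:
        raise ValueError("degree must be at least 1")
    # Two common ancestors, each reached through 2 * (degree + 1) halvings.
    return 100 / 2 ** (2 * degree + 1)


for n, label in enumerate(["parent", "grandparent", "great grandparent"], start=1):
    print(f"{label}: {ancestor_share(n)}%")

for d in range(1, 5):
    print(f"cousin of degree {d}: {cousin_share(d):.3f}%")
```

For fourth cousins the formula gives roughly 0.195%, which is where the ~0.2% figure comes from.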
Second cousin or Fourth cousin, once removed?
When your sibling has a child, you call them a nephew/niece. But when your first cousin has a child, technically that child becomes your first cousin once removed because now there is a one-generation difference between you and your first cousin’s child. Similarly, if there is a two-generation difference, then the relationship becomes first cousin twice removed.
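The naming rule can be written down as a small function. This is only a sketch: `cousin_label` is a hypothetical helper of mine, and it assumes you already know how many generations separate each person from the most recent common ancestor (2 for a grandchild, 3 for a great grandchild, and so on).

```python
def ordinal(n: int) -> str:
    """Return 'first', 'second', ... and fall back to '9th', '21st', etc."""
    words = {1: "first", 2: "second", 3: "third", 4: "fourth",
             5: "fifth", 6: "sixth", 7: "seventh", 8: "eighth"}
    if n in words:
        return words[n]
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def cousin_label(gens_a: int, gens_b: int) -> str:
    """Name the cousin relationship between two people.

    gens_a and gens_b count the generations from the most recent common
    ancestor down to each person (2 = grandchild of that ancestor).
    """
    if gens_a < 2 or gens_b < 2:
        raise ValueError("cousins are at least two generations below the shared ancestor")
    # The degree comes from whoever is closer to the common ancestor.
    degree = min(gens_a, gens_b) - 1
    # 'Removed' counts the generation gap between the two people.
    removed = abs(gens_a - gens_b)
    label = f"{ordinal(degree)} cousin"
    if removed:
        times = {1: "once", 2: "twice"}.get(removed, f"{removed} times")
        label += f" {times} removed"
    return label


print(cousin_label(2, 2))  # two grandchildren of the same couple
print(cousin_label(2, 3))  # you and your first cousin's child
print(cousin_label(5, 5))  # both are great-great-great grandchildren
```

The degree is set by whoever sits closer to the shared ancestor, and each generation of difference adds one "removed".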
How accurate are the DNA relationships shown by genetic genealogy testing companies?
To estimate the accuracy of the DNA relatives shown by the major DNA testing companies, it is important to understand two components:
- The match thresholds – Each testing company uses slightly different SNP (marker) content. You may have heard of 23andMe using v3, v4 and v5 chips. Ancestry DNA has its own versions of the chips. Each of these chips has different content and, based on that content, each company sets a match threshold for calling a “DNA relation” accurate. The match threshold is defined in terms of genetic distance.
- Genetic distance in centimorgans (cM) – In genetics, the length of a DNA segment is measured in centimorgans (cM). This is loosely comparable to the physical units we use to measure distance, although a centimorgan reflects how often recombination occurs rather than a fixed number of base pairs. When you inherit a segment from your parents, the length of that segment (an Identity By Descent, or IBD, segment) is expressed in centimorgans. Each testing company has tuned its algorithms with a slightly different threshold to predict a true match.
|DNA testing company||Match threshold|
|Family Tree DNA||9 centimorgans (cM) or more|
|23andMe||7 centimorgans (cM) and 700 SNPs for the first segment|
|Ancestry DNA||6 centimorgans (cM)|
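To make the table concrete, here is a rough Python sketch that checks a single shared segment against those thresholds. It is a simplification and not how any company's matching algorithm actually works: `THRESHOLDS` and `companies_reporting` are names I made up, the numbers are copied from the table above, and I am assuming the 23andMe "first segment" rule can be applied to the one segment being checked.

```python
from typing import List, Optional

# Thresholds copied from the table above; min_snps is None where none is listed.
THRESHOLDS = {
    "Family Tree DNA": {"min_cm": 9.0, "min_snps": None},
    "23andMe": {"min_cm": 7.0, "min_snps": 700},
    "Ancestry DNA": {"min_cm": 6.0, "min_snps": None},
}


def companies_reporting(segment_cm: float, snp_count: Optional[int] = None) -> List[str]:
    """List the companies whose threshold a single shared segment would meet."""
    reported = []
    for company, rule in THRESHOLDS.items():
        if segment_cm < rule["min_cm"]:
            continue  # segment is too short for this company
        if rule["min_snps"] is not None:
            # An SNP minimum applies; an unknown count is treated as not meeting it.
            if snp_count is None or snp_count < rule["min_snps"]:
                continue
        reported.append(company)
    return reported


print(companies_reporting(6.5))        # long enough only for the lowest threshold
print(companies_reporting(7.5, 800))   # also clears the 7 cM / 700 SNP rule
print(companies_reporting(10.0, 1000)) # clears every threshold in the table
```

The point of the sketch is that the same shared segment can appear as a match at one company and not at another, simply because the cut-offs differ.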
While it is hard to validate every match list produced by these companies, it is reasonable to say that, up to third cousins, most genetic testing companies have a >90% chance of detecting a match accurately.
Why are 4th cousin matches interesting?
My siblings and I can (with shared knowledge and parental input) identify relatives up to 3rd cousins fairly accurately. But looking beyond that becomes challenging, unless you happen to know your 4th cousins because they live in close proximity (the same village or neighbourhood, or as extended families sharing the same living spaces).
4th cousin matches are interesting because this is the point where you realize that your family can be quite large: 1,500+ fourth cousins. At the same time, when you talk to them, you can usually find something in common: a surname or family name, a set of great great grandparents, or some other nugget of family history. In summary, you can actually have a meaningful relationship with your fourth cousins if you are able to get hold of them through DNA testing and genetic tools.