As noted some time ago, I signed up for the genetic analysis company 23andMe and have got some interesting new connections as a result. 23andMe gives you a bewildering amount of information, not always hugely useful, but enough that you can do your own playing around with it.
One chunk of information that I have been browsing is the precise list of what bits of DNA I share with other 23andMe users. The site allows you to download data on your 1500 or so closest DNA relatives in its system, including the actual chromosome sequences that you share with those who have given you permission to share. (Which for me is 1200 of the 1500.)
Now, if those 1200 people were randomly chosen from my wider family tree, you would expect that there would not be a lot of variation in which bits of DNA are shared by how many. In fact there is a great deal of variation. 15% of my total genome is not shared with any of the other 23andMe users. Of course, very few of them are related to me more closely than fourth cousin, so it's entirely possible for whole strands of genealogical relationship to disappear completely from the genetic record.
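For anyone who wants to repeat that sort of sum, here is a rough Python sketch of how an unshared percentage like this could be worked out. It is not necessarily how I did it. It assumes the shared segments have already been boiled down to simple (chromosome, start, end) tuples and that you have a table of chromosome lengths in the same units; both of those, and the function name, are made up for illustration and are not the actual 23andMe download format.

```python
def unshared_fraction(segments, chromosome_lengths):
    """Fraction of the genome not covered by any shared segment.

    segments: iterable of (chromosome, start, end) tuples (hypothetical format).
    chromosome_lengths: dict mapping chromosome -> length, same units as the segments.
    Segments are assumed to lie within their chromosome; nothing is clipped.
    """
    # Group the segments by chromosome.
    by_chrom = {}
    for chrom, start, end in segments:
        by_chrom.setdefault(chrom, []).append((start, end))

    covered = 0
    for spans in by_chrom.values():
        spans.sort()
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start <= cur_end:
                # Overlaps (or touches) the current run: extend it.
                cur_end = max(cur_end, end)
            else:
                # Gap found: bank the finished run and start a new one.
                covered += cur_end - cur_start
                cur_start, cur_end = start, end
        covered += cur_end - cur_start

    total = sum(chromosome_lengths.values())
    return 1 - covered / total
```

The point of merging overlapping segments first is that a stretch shared with fifty people should only count once towards the covered total.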
I have identified precise relationship links with a handful of other 23andMe users, from quite a small number of lineages:
3 are descended from my 3x-great-grandparents Clarke McIlroy (1813-1891) and Mary Perry (1818-1880), my maternal grandmother's maternal grandfather's parents. Two of them are parent and child, and obviously the child's 0.41% DNA overlap with me is a subset of the parent's 0.84%; the other is from a separate line of descent and is linked to me by a separate chunk of DNA, a rather weak 0.28% overlap.
1 is descended from my paternal grandmother's maternal grandparents, my 2x-great-grandparents Samuel Morris Wickersham (1819-1894) and Frances Wyatt Belt (1837-1912). He is by far the most closely related to me, sharing 1.89% of my DNA in six different chunks.
1 is descended from my paternal grandmother's paternal grandparents, my 2x-great-grandparents William Charlton Hibbard (1814-1880) and Sarah Ann Smith (1815-1891). We have only 0.41% of our DNA in common.
4 are descended from the Whyte/Ryan connection on my paternal grandfather's side. This is a little complicated. One of them (sharing 1.04% of my DNA) is descended from my great-grandparents John Joseph Whyte (1826-1916) and Caroline Letitia Ryan (1843-1921). But they were first cousins once removed. The other three are descended from Caroline's parents, George Ryan (1791-1875) and Catharine Margaret Whyte (1818-1884). Catharine was John Joseph's first cousin, so presumably already shared some of his DNA. He was also more distantly related to George Ryan on his mother's side. Anyway, the other three Whyte/Ryan DNA connections share 1.46%, 1.27% and 0.49% of their DNA with me (the sibling of the second is the parent of the third).
That still leaves one pair of great-grandparents for whom I have not yet established a line of connection to anyone on 23andMe. And while all four pairs of 2x-great-grandparents on my father's side are accounted for, only one of the four on my mother's side is. It's not all that surprising; my father's mother was American, and Americans seem to be the biggest user base for 23andMe, and his father's family were a bit obsessional about keeping records so are easier to map.
To sum it up graphically, I hope:
Now, the next bit is something I haven't seen done before, and it's a work in progress. I have mapped out exactly how many of the 23andMe users share DNA with me at every point of the genome. There are some suggestive results. Here, for instance, is my Chromosome 3, showing the overlaps with three of the groups mentioned above.
The first green patch is shared with the McIlroy/Perry connection, the orange patch is the single chunk of Hibbard/Smith DNA, and the yellow is one of the many Wickersham/Belt patches.
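The counting behind these pictures, that is, how many users share DNA with me at each point, can be sketched in Python along the following lines. This is a minimal illustration rather than my actual working: the (chromosome, start, end) tuples and the function names are hypothetical, and it assumes each user contributes at most one segment at any given point, so that counting segments is the same as counting people.

```python
from collections import defaultdict


def sharing_depth(segments):
    """Return {chromosome: [(start, end, count), ...]} for every stretch
    where at least one segment is present.

    segments: iterable of (chromosome, start, end) tuples (hypothetical format).
    """
    # Turn each segment into a +1 event at its start and a -1 event at its end.
    events = defaultdict(list)
    for chrom, start, end in segments:
        events[chrom].append((start, 1))
        events[chrom].append((end, -1))

    depth = {}
    for chrom, evs in events.items():
        # At equal positions -1 sorts before +1, so segments that merely
        # touch end-to-start are not counted as overlapping.
        evs.sort()
        runs = []
        count = 0
        prev = None
        for pos, delta in evs:
            if prev is not None and pos > prev and count > 0:
                runs.append((prev, pos, count))
            count += delta
            prev = pos
        depth[chrom] = runs
    return depth


def peak(depth):
    """Return (chromosome, start, end, count) for the most-shared stretch."""
    return max(
        ((chrom, s, e, n) for chrom, runs in depth.items() for s, e, n in runs),
        key=lambda run: run[3],
    )
```

Calling `peak(sharing_depth(segments))` would then pick out the busiest stretch of the genome, and the runs for a single chromosome are what you would shade to draw a chart of it.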
We can maybe get a little further by shading DNA shared with users whose DNA overlaps with those whom I have firmly identified, as they are a little more likely than not to originate from the same source:
The best chromosome here is Chromosome 4:
A lot of this has direct overlaps - a chunk of Whyte/Ryan at the beginning, a chunk of Wickersham/Belt in the middle, then a chunk of McIlroy/Perry and finally another chunk of Wickersham/Belt. It does give you a sense of how DNA is transferred in discrete blocks.
As I said, 3 and 4 are the two best chromosomes for showing links; there are several where none of the DNA belongs to any of the four groups that I have identified. It's all rather pretty, but it doesn't leave me much the wiser.
Two particular mysteries have emerged for me. The pattern of matches on my X chromosome demonstrates very clearly just how chunky the process of DNA transfer between the generations can be, much more so than any of the others. To remind you, for us people with an X and a Y chromosome, we inherit only an X chromosome from our parent who had two of them (and a Y chromosome from the parent who had one); for those of you with two X chromosomes, the DNA on them is inherited from each parent like the other 22 pairs. I wonder if this more chunky pattern is typical of X chromosomes, given the slightly different path through which they are passed on? It's also pretty clear that there are plenty of 23andMe users on my mother's side, though none of the identified McIlroy/Perry DNA is here, and of course there is none from my father's side.
The other mystery is an exceptional patch on Chromosome 1. The peak of 57 people sharing a bit of X-Chromosome DNA with me is unusual; there are only two higher peaks, one on Chromosome 8 with 60 people, and then a real anomaly on Chromosome 1. (Chromosome 1 also has three separate patches of identifiable Whyte/Ryan DNA, the first two of which have a couple of other people overlapping both.)
So yes, there is a very clearly defined chunk of DNA there shared by me and over 160 other people - 13% of the 23andMe users who share any DNA with me at all share that bit. But none of them are people for whom I have identified a genealogical link. It's such a massive contrast that I assume it must be one of those bits of DNA that is actually very widely shared, possibly even giving some genetic advantage to those who possess it. None shares more than 0.43% of DNA with me, which suggests at least eight generations of difference, which becomes very difficult to track down.
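The arithmetic behind that "eight generations" can be sketched as follows, with some loud caveats. I am assuming that shared DNA halves, on average, with every parent-to-child step separating two people through a single common ancestor, and I am reading "generations of difference" as the number of such steps. Real inheritance is far lumpier than the average, as the charts above show, so this is a back-of-envelope guide only, and the function name is my own invention.

```python
import math


def estimated_separation(shared_fraction):
    """Rough number of parent-child steps implied by a shared fraction,
    assuming sharing halves at each step (expected value only)."""
    return math.log2(1 / shared_fraction)


# 0.43% shared comes out at a little under eight steps.
steps = estimated_separation(0.0043)
print(round(steps, 2))
```

On that assumption, anything at or below 0.43% sits at roughly eight steps or more, which is well beyond where my paper records reliably reach.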
But I am going to have to leave it there. My technical knowledge simply is not good enough to take this much further, and there's a limit to how far you can take the 23andMe evidence. Something to chew on for the future.