Ghost DNA

Image by Midjourney.com

Every person’s genome has a few stretches of DNA that stand out as different from most of humanity. Some are traces of ghost populations: ancient groups that became extinct, but not before contributing some of their own genes to other populations that survived.

Geneticists have identified some of these ghosts. Everyone shares some of these quirky sequences with Neanderthals or Denisovans, long-lost groups that persist today only within living people’s genomes. But other groups remain a mystery.

Ideas about ghost populations have become central to our understanding of human origins and of the evolution of other mammals. I’ve written about ghost hominins, ghost bonobos, and ghost gorillas.

Even so, the idea covers up a lot of uncertainty about how human groups were connected with each other in the past. Every “ghost” is a mathematical construct waiting for DNA from ancient skeletons to test whether it was real.

Ghost populations are deduced from statistics, with data from living people and from ancient skeletons where it exists. When two populations divide from each other for a long time and then come back together, telltale signs mark the genomes of their descendants, even after thousands of years.

Mutations that are near each other on the same chromosome tend to be inherited together. This genetic linkage results in barcode-like arrangements of mutations across thousands of base pairs known as haplotypes. If a population never changed in size and never interacted with distant relatives, then haplotype lengths, haplotype frequencies, and mutational diversity all would tend toward a predictable pattern.

But when a very different population enters the mix, it has its own unique set of haplotypes that have not remixed with those in the first population. They’re long, they have different mutations, and that makes them stick out.
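A rough back-of-the-envelope sketch shows why such haplotypes start out long and shrink over time. It assumes a single pulse of mixture and a uniform recombination rate of 1e-8 per base pair per generation, a commonly used round figure that is not taken from this article. Under those assumptions the average unbroken stretch is about 1 / (rate × generations). The function name is hypothetical, and real analyses use much more detailed models.

```python
def expected_tract_length_bp(generations, recomb_rate_per_bp=1e-8):
    """Approximate mean length (in base pairs) of an unbroken introduced haplotype.

    Assumption: one pulse of mixture `generations` ago and a uniform
    recombination rate, so breakpoints accumulate at a constant rate
    per base pair per generation. This is a simplification, not a
    full population-genetic model.
    """
    if generations <= 0:
        raise ValueError("generations must be positive")
    if recomb_rate_per_bp <= 0:
        raise ValueError("recomb_rate_per_bp must be positive")
    return 1.0 / (recomb_rate_per_bp * generations)


if __name__ == "__main__":
    # Illustrative time depths only; none of these come from real data.
    for g in (10, 100, 2000):
        length = expected_tract_length_bp(g)
        print(f"{g} generations: about {length:,.0f} bp")
```

Under these assumptions, a haplotype introduced 2,000 generations ago averages roughly 50,000 base pairs, while one introduced 10 generations ago would still span millions. Long stretches therefore point to mixture that is recent relative to how different the haplotype's mutations are.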

Geneticists started noticing such long, divergent haplotypes in living human populations during the 1990s. When I first entered human genetics research, I wondered whether some of these divergent haplotypes came from Neanderthals. I was far from alone. Several research groups in the early 2000s were taking a close look at genes with long, divergent haplotypes.

One of my collaborative projects focused on a gene called MCPH1, which had a long haplotype that didn’t seem to fit with the idea that humans had come from a single small ancestral group. We guessed it was likely Neanderthal in origin.

Another research team led by Michael Hammer focused on two regions of the X chromosome that likewise showed very deep haplotype diversity, both in East Asian populations. That was a bit puzzling. The possible source wasn’t clear at the time. No one used the term “ghost population” back then, but today we would call this a sign of one.

Meanwhile, a few groups of geneticists were beginning to develop statistics to handle thousands of different haplotypes across the whole genome. I remember being blown away by a groundbreaking paper in 2006 written by Vincent Plagnol and Jeffrey Wall. They estimated that 5% of European ancestry came from some kind of ghost population. To scientists like me, that conclusion made good sense: this ghost population must have been the Neanderthals.

What was harder to explain was Plagnol and Wall’s result looking at the genomes of West African people. They, too, had inherited around 5% of their genomes from some divergent ancestral group. Neanderthals didn’t seem likely—no one had ever found them in Africa. This was the first strong evidence pointing to a ghost population in human ancestry.

“[A]rchaic populations such as Neanderthals must have made a substantial contribution to the modern gene pool in Europe. We observe a similar pattern for West African populations even though a clear source population has not yet been found.”—Vincent Plagnol and Jeffrey Wall

In those early days, skeptical geneticists were hard to convince about such ancestors. Then in 2010 the Neanderthal genome appeared. Instantly it became clear that most people with heritage from Europe, Asia, the Americas, and Oceania had some Neanderthal ancestors, amounting to between 1% and 4% of their genomes.

Ancient DNA findings from Denisova Cave, also published in 2010, changed the game even further. In the lead-up to that discovery, some geneticists studying people’s genomes from Papua New Guinea had noticed what seemed like traces of mixture from an unknown group. Again they wondered, could it be Neanderthals?

The genome of Denisova 3 revealed the true answer: Denisova-like populations had once been widespread across East Asia and Southeast Asia.

In their studies of ancient and modern genomes, a team of geneticists led by David Reich and Nick Patterson developed new kinds of analyses that sometimes pointed to ghost populations: tests of population relationships with f-statistics. These tests are especially useful for ancient genomes because they work by simply counting shared genetic variations, rather than relying on the lengths of haplotypes, which are broken apart in fragmented DNA from ancient bones. The most informative is the f₄-statistic, which considers four individuals from four populations and assesses whether their pattern of shared genetic variations fits the tree connecting the four. Another test pioneered by the same team, called the D-statistic, relies on similar logic and was also widely used to examine mixture in past groups.
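The counting logic can be sketched in a few lines. This is a toy illustration under stated assumptions, not the team's software: each genome is reduced to a list of 0/1 alleles at the same sites, the outgroup is assumed to carry the ancestral allele, and the six sites are invented for the example. The function names are hypothetical. Real analyses use many thousands of sites and estimate standard errors, which this sketch omits.

```python
def f4(pop_a, pop_b, pop_c, pop_d):
    """Mean over sites of (a - b) * (c - d), the usual form of an f4-statistic.

    Inputs are equal-length sequences of allele frequencies (0/1 for
    single haploid genomes). A value near zero is consistent with a
    tree that separates (A, B) from (C, D).
    """
    n = len(pop_a)
    if n == 0 or not (len(pop_b) == len(pop_c) == len(pop_d) == n):
        raise ValueError("all inputs must be non-empty and the same length")
    total = sum((a - b) * (c - d)
                for a, b, c, d in zip(pop_a, pop_b, pop_c, pop_d))
    return total / n


def d_statistic(p1, p2, p3, outgroup):
    """ABBA-BABA count for haploid 0/1 alleles.

    ABBA: P2 and P3 share the non-outgroup allele.
    BABA: P1 and P3 share the non-outgroup allele.
    """
    abba = baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        if a == o and b == c and b != o:
            abba += 1
        elif b == o and a == c and a != o:
            baba += 1
    if abba + baba == 0:
        raise ValueError("no informative sites")
    return (abba - baba) / (abba + baba)


# Invented toy data: six sites, one haploid genome per population.
outgroup = [0, 0, 0, 0, 0, 0]
p1 = [0, 0, 1, 0, 0, 1]
p2 = [1, 1, 0, 1, 0, 1]
p3 = [1, 1, 1, 1, 0, 0]

if __name__ == "__main__":
    print("D  =", d_statistic(p1, p2, p3, outgroup))  # 3 ABBA vs 1 BABA -> 0.5
    print("f4 =", f4(p1, p2, p3, outgroup))           # -2/6, about -0.333
```

In this toy case P2 shares derived alleles with P3 more often than P1 does, so D is positive. The f₄ value for the same ordering comes out negative only because of how the differences are arranged. With so few sites neither number means anything; the point is that both tests are built from simple counts of shared variants.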

When these tests show unexpected patterns of shared variation, one explanation is that one of the genomes has some ancestors from yet another group. Often the source of this ancestry can be found by applying the test again and again to more groups.

But sometimes that search comes up empty. Then there may be a ghost in the genome.

A ghost population is no more nor less than a mathematical construct. When geneticists plug a set of DNA samples into a model to understand their connections, sometimes the model spits out an unknown component, a ghost.

“All models are wrong, but some models are useful.” What George Box wrote about statistical models holds true in this case.

Some models are easy to build and test. Easiest are models with populations that stay isolated after they split, except for the rare times of mixing between them. These models operate a bit like populations that lived on islands, and geneticists call them “island models”. When such models spit out a lineage, it looks like a ghost.

But humans across Africa and Eurasia during the Pleistocene did not live on a handful of islands. If their genomes have few signs of mixing over time, it’s not because none of them ever met. More likely, their long-term evolution was driven by growth in some regions much more than others. Places where different groups met were not often centers of any population’s growth.

Two groups on two islands, or different rates of growth and limits on mixture in different parts of a single population’s range: these are factors that geneticists call population structure.

For many years, some geneticists were skeptical of the idea that divergent haplotypes in living people might be a legacy of Neanderthal ancestors. They noted that certain kinds of population structure in ancient Africans might also result in such haplotypes. It took an array of DNA data from Neanderthals and better knowledge of haplotype variation in living humans to accept Neanderthal ancestors.

Ancestral population structure in Africa may be an alternative to the “ghost archaics”. Aaron Ragsdale and coworkers in 2023 took a critical look at the ghost population scenario for African ancestral groups. They found that a population structure model, with continuous mixing between populations, can also explain data from living African peoples. A similar argument was suggested by Tiago Ferraz and collaborators for the Population Y hypothesis in South America: they proposed population structure within a single founder group of populations in the Americas, not a second founder population.

Still, I’ve become a believer in ghosts. We know that every model is wrong in some ways. The ghost model is often a useful one. The point of talking about ghosts is to remind us of the past realities that we cannot see with today’s data.

And to be honest, unknown population structure looks an awful lot like a ghost. If African groups began to differentiate more than a million years ago, and today’s African people get 95% of their ancestry from one part of that early population and 5% from others, that sounds a lot like a ghost population. If early modern humans diversified rapidly across much of Africa when they emerged 250,000 years ago, but the groups that came to occupy one area are only visible as a small fraction of the genomes of today’s people, that sounds a lot like a ghost population.

So I’ve come to like the term “ghost population” quite a lot. There’s only one thing that I try to remind people. These ghosts are not dead. Theirs is a legacy of genetic persistence, of mixing into groups where their descendants survived. Ghost populations are ancestors we haven’t recognized yet.

John Hawks – Paleoanthropologist | Chair and Professor of Anthropology, University of Wisconsin–Madison

Read more @ https://www.johnhawks.net/p/ghost-populations-in-human-origins