Human migration and population structure 2

Published on March 18, 2015   35 min

Other Talks in the Series: Human Population Genetics II

0:04
So now let's turn towards some data. Thus far we've only been talking about classical models of population structure and thinking through, just conceptually, how migration, drift, and natural selection might be impacting allele frequencies. That gives us some intuition of how we might be able to infer things from data. But now let's really look at that problem directly.
0:27
So let's begin by looking at gene genealogies or gene trees that are inferred from sequence data.
0:35
What I'm showing in this slide is data from Australian grass finches, where there are two subspecies of long-tailed finch and one species of black-throated finch. The two long-tailed subspecies are more closely related to each other than either is to the black-throated finch. Now let's look at gene trees from these three taxa. By gene tree I mean a tree that is inferred to represent the relationships among sequences, each taken from a single DNA molecule, in this case one from each of the three taxa: acuticauda, hecki, and cincta. I'm showing 30 different gene trees from anonymous loci in the genome. The average sequence used to construct these gene trees is 553 base pairs, and only three individuals were needed, one per taxon, to construct them. What we see in A is a set of gene trees that support a relationship where acuticauda and hecki are more closely related to each other than either is to cincta. In B there is a set of gene trees in which acuticauda is more closely related to cincta than to hecki. In C we have a set of gene trees where hecki and cincta are more closely related to each other than either is to acuticauda. And D is the fourth and final case, where the sequence data don't actually allow resolution of the relationships. So this is, on the surface, a bit confusing. Why do we have different gene tree topologies when there's a single underlying species tree, and what can we maybe learn from this? To answer those questions, we need to think about the coalescent process.
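
As a rough illustration of why a single species tree can yield several gene tree topologies, here is a minimal Python sketch of the standard three-taxon coalescent result. It is not the analysis behind the slide. It assumes one sampled lineage per taxon, no gene flow after divergence, no recombination within a locus, and a species tree in which acuticauda and hecki are sisters. The internal branch length `t_internal` (in coalescent units) and all function names are hypothetical, and the value used in the usage note is not an estimate from the finch data. The unresolved case D is not modelled.

```python
import math
import random
from collections import Counter

# Labels for the three possible sister pairs (cases A, B and C on the slide).
CONCORDANT = "acuticauda+hecki"
DISCORDANT = ("acuticauda+cincta", "hecki+cincta")


def simulate_topology(t_internal, rng):
    """Simulate the sister pair of one gene tree.

    t_internal is the length of the branch ancestral to acuticauda and
    hecki only, in coalescent units (an assumed, illustrative parameter).
    """
    # Two lineages coalesce after an exponential waiting time with rate 1.
    if rng.expovariate(1.0) < t_internal:
        return CONCORDANT
    # Otherwise all three lineages reach the root population, where the
    # first pair to coalesce is equally likely to be any of the three.
    return rng.choice((CONCORDANT,) + DISCORDANT)


def expected_probabilities(t_internal):
    """Analytical topology probabilities under the same assumptions."""
    each_discordant = math.exp(-t_internal) / 3.0
    probs = {name: each_discordant for name in DISCORDANT}
    probs[CONCORDANT] = 1.0 - 2.0 * each_discordant
    return probs


def simulate_loci(n_loci, t_internal, seed=0):
    """Count gene tree topologies across n_loci independent loci."""
    rng = random.Random(seed)
    return Counter(simulate_topology(t_internal, rng) for _ in range(n_loci))
```

Calling `simulate_loci(30, 1.0)` draws 30 independent loci, mirroring the number of gene trees on the slide, and `expected_probabilities(1.0)` gives the matching analytical values. The shorter the internal branch, the closer the three topologies come to being equally frequent.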