Techniques to infer admixture using genome-wide autosomal DNA 1

Published on May 31, 2022   27 min


0:00
My name is Garrett Hellenthal, and I am at the UCL Genetics Institute at University College London. This tutorial will cover methods for detecting what are known as admixture events, which occur when two or more populations intermix.
0:15
Specifically, we'll discuss available approaches to detect admixture events when using genome-wide autosomal bi-allelic single-nucleotide polymorphism (SNP) data, that is, data at loci where each of an individual's two sequences can carry one of two values or allele types. This will not be an exhaustive look at all approaches, of which there are many, but we aim to cover several popular techniques. In particular, we will demonstrate applications of the program ADMIXTURE, of f-statistics approaches, and of the programs ALDER/MALDER, Globetrotter and Mosaic. Importantly, ADMIXTURE and f-statistics identify admixture and can be used to infer admixture proportions. Approaches 3-5 here (ALDER/MALDER, Globetrotter and Mosaic) can also date when admixture events occurred. For this tutorial, we will assume that you have access to a command-line terminal, e.g. on a Linux machine or a Mac. Here, I provide step-by-step code to run these programs and discuss some of the output. The idea is that you can pause the video and follow along step by step, waiting for the program to finish at each step. It shouldn't take too long, though some steps do take a bit longer than others. Overall, this tutorial is meant to give you an overview of the basics of running these programs. Please read their original documentation for further information, for example, regarding program options.
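To make the idea of bi-allelic SNP data concrete, here is a minimal sketch in Python. The 0/1 allele coding and the variable names are illustrative assumptions, not the input format of any of the programs named above. It shows how an individual's two sequences combine into a per-SNP genotype count.

```python
# Illustrative assumption: alleles are coded 0 and 1 at each bi-allelic SNP.
# An individual carries two sequences (haplotypes), one per chromosome copy.
haplotype_1 = [0, 1, 1, 0, 1]
haplotype_2 = [0, 0, 1, 1, 1]

# The genotype at each SNP is the number of copies of allele 1 (0, 1 or 2).
genotypes = [a + b for a, b in zip(haplotype_1, haplotype_2)]

# Frequency of allele 1 across this individual's two sequences, per SNP.
allele_1_fraction = [g / 2 for g in genotypes]

print(genotypes)
```

Each entry of `genotypes` can only be 0, 1 or 2, which is what bi-allelic means for an individual with two sequences.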
1:32
We will use two simulated examples to illustrate each approach. Each of these will consist of 20 simulated admixed "individuals" who each descend from admixture that occurred 30 generations ago. In the first simulation, 80% of the DNA is contributed by Yoruba individuals sampled in Nigeria, and the remaining 20% comes from French individuals. In the second simulation, 50% of the DNA is contributed by Brahui individuals sampled in Pakistan, with the remaining 50% from French individuals. These simulations were generated using the technique described in Price et al. 2009 and are the simulations described and used in Hellenthal et al. 2014. To make our simulation scenarios more realistic, we will assume that we did not sample the actual Brahui, French and Yoruba populations that were used to simulate the data. If the admixture occurred long ago, for example, 30 generations ago, which corresponds to about 800-900 years, the populations that mixed may not exist anymore or may be genetically somewhat different from the actual groups that did mix. Instead, we will use other proxy or surrogate populations that represent each admixing source. In particular, we will use 21 sampled Balochi individuals, who belong to another ethnic group in Pakistan, to represent the Brahui admixing source population. Similarly, we will use 23 sampled British and Irish individuals to represent the French admixing source and 22 sampled Senegalese Mandenka individuals to represent the Yoruba source. These groups are all imperfect, but somewhat genetically related, proxies for the true admixing source groups.
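As a rough illustration of what such a simulation does, the Python sketch below builds the ancestry mosaic of admixed haplotypes for the first scenario (80% from one source, 30 generations since admixture). It is a simplified sketch and not the code used in Price et al. 2009 or Hellenthal et al. 2014. The function names, the 2-Morgan chromosome length and the 28-years-per-generation figure are assumptions made for illustration. The sketch assumes that the distance between ancestry switch points is exponentially distributed with rate equal to the number of generations (in Morgans), and that the source of each segment is drawn according to the admixture proportion.

```python
import random


def simulate_ancestry_segments(prop_a, generations, chrom_length_morgans, rng):
    """Return (start, end, source) ancestry segments for one admixed haplotype."""
    segments = []
    pos = 0.0
    while pos < chrom_length_morgans:
        # Draw the source of this segment according to the admixture proportion.
        source = "A" if rng.random() < prop_a else "B"
        # Distance to the next break point: exponential with rate = generations.
        length = rng.expovariate(generations)
        end = min(pos + length, chrom_length_morgans)
        segments.append((pos, end, source))
        pos = end
    return segments


def ancestry_fraction(segments, source):
    """Fraction of the haplotype's genetic length assigned to `source`."""
    total = sum(end - start for start, end, _ in segments)
    covered = sum(end - start for start, end, s in segments if s == source)
    return covered / total


rng = random.Random(1)  # fixed seed so the sketch is reproducible
generations = 30
# 20 admixed individuals, two haplotypes each; chromosome length is an assumption.
haplotypes = [simulate_ancestry_segments(0.8, generations, 2.0, rng) for _ in range(40)]
mean_a = sum(ancestry_fraction(h, "A") for h in haplotypes) / len(haplotypes)

# Assumed generation time of 28 years, consistent with "about 800-900 years".
years_since_admixture = generations * 28

print(round(mean_a, 2), years_since_admixture)
```

In an actual simulation, each segment would then be filled with the alleles of a haplotype sampled from the corresponding source population. Older admixture (more generations) gives shorter segments, which is the signal that the dating methods rely on.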