0:00
My name is Garrett Hellenthal,
from the UCL Genetics Institute
at University College London.
This tutorial will
cover methods for
detecting what is known
as admixture events,
which is when two or more
populations intermix.
0:15
Specifically, we'll discuss
available approaches to
detect admixture events when
using genome-wide
autosomal bi-allelic
single-nucleotide
polymorphism (SNP) data.
SNPs are loci on an
individual's two sequences that
can each carry one of two
values or allele types.
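To make the idea of bi-allelic data concrete, here is a minimal Python sketch. The two haplotypes and the 0/1 allele labels are made-up toy values, not data from this tutorial; the point is only that each locus on each of the two sequences carries one of two allele types, so a genotype can be summarised as a count of 0, 1 or 2.

```python
# Hypothetical toy data: one individual's two sequences at five bi-allelic SNPs.
# 0 and 1 are arbitrary labels for the two allele types at each locus.
haplotype_a = [0, 1, 1, 0, 1]
haplotype_b = [0, 0, 1, 1, 1]


def genotype_counts(hap_a, hap_b):
    """Count the copies of allele 1 carried at each SNP (0, 1 or 2)."""
    if len(hap_a) != len(hap_b):
        raise ValueError("both haplotypes must cover the same SNPs")
    return [a + b for a, b in zip(hap_a, hap_b)]


genotypes = genotype_counts(haplotype_a, haplotype_b)
print(genotypes)  # [0, 1, 2, 1, 2]
```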
This will not be an exhaustive
look at all approaches,
of which there are many,
but we aim to cover several
popular techniques.
In particular, we
will demonstrate
applications of the
program ADMIXTURE,
f-statistics approaches,
and the programs ALDER/MALDER,
Globetrotter and Mosaic.
Importantly, ADMIXTURE
and f-statistics
identify admixture and can
be used to infer
admixture proportions.
Approaches 3-5 here
(ALDER/MALDER, Globetrotter
and Mosaic) can also date
when admixture events occurred.
For this tutorial, we
will assume that you
have access to a
command-line terminal,
e.g. on a Linux
machine or a Mac.
Here, I provide step-by-step
code to run
these programs and discuss
some of the output.
The idea is that you
can pause the video
and follow along step by step,
waiting for the program
to finish at each step.
It shouldn't take too long,
though some steps do take
a bit longer than others.
Overall, this tutorial
is meant to give you
an overview of the basics
of running these programs.
Please read their
original documentation
for further information,
for example, regarding
program options.
1:32
We will use two
simulated examples
to illustrate each approach.
Each of these will consist of
20 simulated admixed
"individuals"
that each descends
from admixture that
occurred 30 generations ago.
In the first simulation,
80% of the DNA is contributed by
Yoruba individuals
sampled in Nigeria,
and the remaining 20% comes
from French individuals.
In the second simulation,
50% of the DNA is
contributed by Brahui
individuals sampled in Pakistan,
with the remaining 50%
from French individuals.
These simulations were generated
using the technique described
in Price et al. 2009
and are simulations
described and used
in Hellenthal et al. 2014.
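As a rough illustration of what such a simulation involves, here is a minimal Python sketch. It is not the code from Price et al. 2009 or Hellenthal et al. 2014. It assumes that ancestry breakpoints along a chromosome follow a Poisson process with a rate equal to the number of generations since admixture per Morgan, and that each segment is assigned to source 1 with probability equal to that source's admixture proportion. The function name, the 2 Morgan chromosome length and the random seed are hypothetical choices.

```python
import random


def simulate_ancestry_segments(prop_source1, generations, length_morgans, rng):
    """Return (start, end, source) segments tiling one simulated chromosome.

    Assumption: distances between breakpoints are exponential with rate
    `generations` per Morgan; each segment is drawn from source 1 with
    probability `prop_source1`, otherwise from source 2. Neighbouring
    segments may therefore share the same source.
    """
    segments = []
    position = 0.0
    while position < length_morgans:
        # Draw the genetic distance to the next breakpoint, capped at the chromosome end.
        end = min(position + rng.expovariate(generations), length_morgans)
        source = 1 if rng.random() < prop_source1 else 2
        segments.append((position, end, source))
        position = end
    return segments


rng = random.Random(1)  # fixed seed so the sketch is repeatable
# First scenario: 80% from source 1, admixture 30 generations ago.
segments = simulate_ancestry_segments(0.8, 30, 2.0, rng)
frac = sum(end - start for start, end, src in segments if src == 1) / 2.0
print(len(segments), "segments; fraction from source 1:", round(frac, 2))
```

In a single chromosome the realised fraction from source 1 will vary around 0.8; averaging over many chromosomes and individuals brings it closer to the target proportion.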
To make our simulation
scenarios more realistic,
we will assume that we did
not sample the actual Brahui,
French and Yoruba populations
that were used to simulate the data.
If the admixture
occurred long ago,
for example 30 generations ago,
which corresponds to
about 800-900 years,
the populations that mixed
may no longer exist,
or their present-day descendants
may be genetically
somewhat different
from the groups
that actually mixed.
Instead, we will use
other proxy or surrogate
populations that represent
each admixing source.
In particular, we will use 21
sampled Balochi individuals,
which are another ethnic
group in Pakistan,
to represent the Brahui
admixing source population.
Similarly, we will use
23 sampled British and Irish
individuals to represent
the French admixing source
and 22 sampled Senegalese
Mandenka individuals
to represent the Yoruba source.
These groups are all imperfect,
but somewhat genetically
related, proxies
for the true admixing
source groups.
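The surrogate design and the generations-to-years conversion mentioned above can be written down as a short Python sketch. The surrogate groups and sample sizes are the ones stated in this tutorial; the generation time of 27 to 30 years is an assumption chosen here to match the quoted range of about 800-900 years, and the variable and function names are hypothetical.

```python
# True admixing source -> (surrogate group, number of sampled individuals),
# as stated in the tutorial.
surrogates = {
    "Brahui": ("Balochi", 21),
    "French": ("British and Irish", 23),
    "Yoruba": ("Mandenka", 22),
}

# Assumption: one generation lasts roughly 27 to 30 years.
YEARS_PER_GENERATION = (27, 30)


def generations_to_years(generations, years_per_generation=YEARS_PER_GENERATION):
    """Convert a number of generations into a (low, high) range of years."""
    low, high = years_per_generation
    return generations * low, generations * high


for source, (surrogate, n_samples) in surrogates.items():
    print(f"{source} source represented by {n_samples} {surrogate} individuals")

print(generations_to_years(30))  # (810, 900)
```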