An introduction to statistics for statistical genetics: models and techniques common in statistical genetics

O'Reilly, Paul

Topics Covered

Hidden Markov models
Imputation and genetic imputation
Principal Component Analysis (PCA)
Mixed Models
Shrinkage and regularisation methods

Links

Series:

Statistical Genetics

Talk Citation

O'Reilly, P. (2025, March 28). An introduction to statistics for statistical genetics: models and techniques common in statistical genetics [Video file]. In The Biomedical & Life Sciences Collection, Henry Stewart Talks. Retrieved July 1, 2026, from https://doi.org/10.69645/XROV3727.
Publication History

Published on July 31, 2017
Reviewed on March 28, 2025

Financial Disclosures

Dr. Paul O'Reilly has not informed HSTalks of any commercial/financial relationship that it is appropriate to disclose.

Dr. Paul O'Reilly – King's College London, UK

Published on July 31, 2017 Reviewed on March 28, 2025 18 min

Transcript

0:00

This lecture, An Introduction to Statistics for Statistical Genetics, is the first talk in the statistical genetics series. I'm Dr. Paul O'Reilly, a senior lecturer in statistical genetics, performing research at King's College London. This is Part Two: Models and Techniques Common in Statistical Genetics. In this section of the talk, I will give a basic overview of several models and techniques popular in statistical genetics, with the aim of providing an introductory-level understanding of each. If you plan to use any of these approaches, then you will need to obtain further details from other lectures in this series, from statistical textbooks, or online.

0:38

In part two of the talk, I'll introduce a number of statistical models and techniques that are often used in statistical genetics. First, I'll describe hidden Markov models. Then I'll explain the process of statistical imputation, as well as a method of imputation especially tailored for application to genetic data. Next, I'll explain principal component analysis, and then mixed models, and finally, I'll describe shrinkage and regularisation methods. For each of these, I'll give the general intuition of the model or technique and explain its relevance to statistical genetics using examples from the field.

1:13

A process has the Markov property if the next state, in space or time, is governed only by the present state. A useful way to think about this in a real-life context is to consider the weather. If it is raining now, it doesn't matter too much that it was dry and sunny yesterday; it's very likely to still be raining one minute from now. Strictly speaking, weather isn't Markovian, since the recent weather or present season is also informative about future weather, but over short periods of time weather is close enough to Markovian to be a useful analogy. If we model a system or process as having the Markov property, then it is called a Markov model. A Markov model is typically made up of some number of possible states, which in the case of weather might be raining, snowing, sunny, and overcast, along with transition probabilities of switching from one state to another. The transition probabilities may be different for different transitions. For example, going from overcast to snowing has higher probability than going from sunny to snowing. Markov models are extremely useful in analysing genetic data because the ancestral contributions to our DNA sequence are highly Markovian. Consider a chromosome that came from either your mother or father. This chromosome will be a mosaic of your grandparents' chromosomes as a result of recombination. It can be viewed as being made up of two states, either grandmother or grandfather, and if at a particular locus the sequence is from your grandmother, then at the very next locus it is most likely from your grandmother as well, but with some probability there will be a transition to sequence that came from your grandfather. Likewise, we can view our chromosomes as being a mosaic of our great-grandparents' chromosomes or of our ancestors from any number of generations ago.
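The Markov models described above can be sketched in a few lines of Python. This is only an illustration: the transition probabilities, the `switch_prob` parameter, and the function names are assumptions made up for the example and are not taken from the talk. The same simulation function is used for the weather analogy and for the grandparental mosaic along a chromosome.

```python
import random

# Hypothetical transition probabilities, chosen only for illustration.
# Each row gives the probabilities of the next state given the current one,
# and overcast -> snowing is more probable than sunny -> snowing.
WEATHER = {
    "raining":  {"raining": 0.90, "snowing": 0.02, "sunny": 0.02, "overcast": 0.06},
    "snowing":  {"raining": 0.03, "snowing": 0.90, "sunny": 0.01, "overcast": 0.06},
    "sunny":    {"raining": 0.01, "snowing": 0.001, "sunny": 0.939, "overcast": 0.05},
    "overcast": {"raining": 0.08, "snowing": 0.04, "sunny": 0.08, "overcast": 0.80},
}


def grandparent_transitions(switch_prob):
    """Two-state model of grandparental origin along a chromosome.

    switch_prob is an assumed per-locus probability of switching origin,
    standing in for recombination between adjacent loci.
    """
    stay = 1.0 - switch_prob
    return {
        "grandmother": {"grandmother": stay, "grandfather": switch_prob},
        "grandfather": {"grandfather": stay, "grandmother": switch_prob},
    }


def simulate_chain(transitions, start, n_steps, rng):
    """Return a path of n_steps + 1 states.

    Each new state is drawn using only the current state (the Markov property).
    """
    path = [start]
    for _ in range(n_steps):
        row = transitions[path[-1]]
        next_state = rng.choices(list(row), weights=list(row.values()))[0]
        path.append(next_state)
    return path


rng = random.Random(0)
weather_path = simulate_chain(WEATHER, "raining", 10, rng)
mosaic = simulate_chain(grandparent_transitions(0.05), "grandmother", 20, rng)
print(weather_path)
print(mosaic)
```

Raising `switch_prob` mimics a region with a high recombination rate, where the simulated mosaic changes origin often; setting it close to zero mimics a region with little recombination, where a single ancestral state tends to persist.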
Regions with high recombination rates will involve many transitions between these ancestral sequence states, whereas those with little recombination may correspond to only a single ancestral state. Because of the relatedness among all individuals, this means that a sample of individuals' chromosomes, and genetic variation data in general, can be well captured by Markov models. In practice, hidden Markov models, or HMMs for short, are usually employed, because the ancestral states are unknown but genotype data can be used to estimate them. Going back to the weather analogy, applying hidden Markov models is a bit like trying to estimate the state of the weather only from data on what clothes people are wearing or whether they're applying sun cream or holding umbrellas. In genetics, our observed data are usually genotypes, and we can use these to estimate different hypothetical ancestral sequences underlying the genotypes at a genomic locus in a sample of individuals. This has the effect of clustering the sample of DNA sequences into groups of similar sequences. HMMs are also used to estimate where there are transitions between different ancestral sequences along individual chromosomes. By capturing the structure of genetic variation data in samples of individuals in a way that reflects their present similarities and differences, and ancestral histories, hidden Markov models are extremely useful in statistical genetics and have been employed in a wide range of applications, including estimating haplotypes from genotypes, identifying copy number variants, characterising population admixture, and performing genetic imputation. The problem of missing data plagues medical and scientific research,
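As a rough illustration of the hidden Markov model idea in the weather analogy above, the sketch below infers the hidden weather from what people are seen carrying. The talk does not name a specific algorithm; Viterbi decoding is used here as one common way to recover the most probable sequence of hidden states. All probabilities and the names `START`, `TRANS`, `EMIT`, and `viterbi` are assumptions chosen for the example.

```python
import math

# Hypothetical model parameters, chosen only for illustration.
HIDDEN = ["raining", "sunny"]
START = {"raining": 0.5, "sunny": 0.5}
TRANS = {
    "raining": {"raining": 0.8, "sunny": 0.2},
    "sunny": {"raining": 0.2, "sunny": 0.8},
}
# Probability of each observation given the hidden weather state.
EMIT = {
    "raining": {"umbrella": 0.8, "sun cream": 0.05, "neither": 0.15},
    "sunny": {"umbrella": 0.05, "sun cream": 0.6, "neither": 0.35},
}


def viterbi(observations, states, start, trans, emit):
    """Return the most probable hidden-state path for the observations.

    Works in log space to avoid underflow on long sequences; assumes all
    probabilities used are strictly positive.
    """
    # best[s] is the log-probability of the best path so far that ends in s.
    best = {
        s: math.log(start[s]) + math.log(emit[s][observations[0]])
        for s in states
    }
    backpointers = []
    for obs in observations[1:]:
        prev_best, best, pointers = best, {}, {}
        for s in states:
            # Pick the predecessor that gives the best path into state s.
            prev = max(states, key=lambda p: prev_best[p] + math.log(trans[p][s]))
            best[s] = (
                prev_best[prev] + math.log(trans[prev][s]) + math.log(emit[s][obs])
            )
            pointers[s] = prev
        backpointers.append(pointers)
    # Trace back from the best final state to recover the whole path.
    path = [max(states, key=lambda s: best[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


observed = ["umbrella", "umbrella", "sun cream", "sun cream"]
decoded = viterbi(observed, HIDDEN, START, TRANS, EMIT)
print(decoded)
```

In a genetic setting the observations would be genotypes and the hidden states would be ancestral sequences, with far more states and with parameters estimated from data rather than fixed by hand as they are here.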
