Methods for large-scale genome-wide association studies

Bibliographic Details
Main Author: Kalantzis, G
Other Authors: Palamara, P
Format: Thesis
Language: English
Published: 2022
Description
Summary: Genome-wide association studies (GWAS) have led to the identification of thousands of associations between genetic polymorphisms and complex traits or diseases, facilitating several downstream applications such as genetic risk prediction and drug target prioritisation. Biobanks containing extensive genetic and phenotypic data continue to grow, creating new opportunities for the study of complex traits, such as the analysis of rare genomic variation across multiple populations. These opportunities are coupled with computational challenges, creating the need for the development of novel methodology.

This thesis develops computational tools to facilitate large-scale association studies of rare and common variation. First, we develop methods to improve the analysis of ultra-rare variants, leveraging the sharing of identical-by-descent (IBD) genomic regions within large biobanks. We compare ∼400k genotyped UK Biobank (UKBB) samples with 50k exome-sequenced samples and devise a score that quantifies the extent to which a genotyped individual shares IBD segments with carriers of rare loss-of-function mutations. Our approach detects several associations and replicates 11 of 14 loci of a pilot exome sequencing study. Second, we develop a linear mixed model framework, FMA, that builds on previous techniques and is suitable for scalable and robust association testing. We benchmark FMA and several state-of-the-art approaches using synthetic and UKBB data, evaluating computational performance, statistical power, and robustness to known confounders, such as cryptic relatedness and population stratification. Finally, we integrate FMA with recently developed methods for genealogical analysis of complex traits, enabling it to perform scalable genealogy-based estimation of narrow-sense heritability and association.