Statistical methodology for QTL mapping and genome-wide association studies

This work deals with statistical tests of association between genetic markers and disease phenotypes. The main criterion used for comparing the tests is statistical power. First we consider animal models and then data from association studies of humans. For the animal section, we analyse a...


Bibliographic Details
Main Author: Antonyuk, A
Format: Thesis
Published: 2009
collection OXFORD
description <p>This work deals with statistical tests of association between genetic markers and disease phenotypes. The main criterion used for comparing the tests is statistical power. First we consider animal models and then data from association studies of humans. For the animal section, we analyse a dataset from a prominent mouse experiment which developed a heterogeneous stock of mice via multiple crosses. This stock is characterised by small distances between recombinants, which allows fine mapping of genetic loci, but also by uncertainty in haplotypes. We start by highlighting the disadvantages of the currently used approach to dealing with this uncertainty and suggest a method that has greater statistical power and is computationally efficient. The method applies the EM algorithm to the broad class of exponential family distributions of phenotypes. We also develop a Bayesian version of the method, for which we extend the widely used IRLS algorithm to maximisation of the weighted posterior.</p> <p>Then we move on to genome-wide association studies (GWAS), where two situations are considered: known and unknown minor allele frequency (MAF). First we develop an innovative Bayesian model with the optimal prior for the known population MAF. We demonstrate not only that it is more powerful than any frequentist test considered (the size of the advantage depends on the prevalence of the disease and the MAF), but also that the frequentist tests change ranking in terms of power. A remarkable property of the frequentist tests, the advantage of discarding part of the data to gain power, is highlighted.</p> <p>The second chapter on GWAS considers the currently more common situation of unknown MAF, for which the Armitage test is known to be the most powerful frequentist method. We show that the suggested model is more powerful across the broad selection of settings considered, including the three different allele effect models: additive, dominant and recessive.</p> <p>For both the known and unknown MAF cases we point out that the parameters are constrained and demonstrate how to gain power by taking this constraint into account.</p>
id oxford-uuid:23393c76-b7ef-44c2-a06f-3b23e3a6d936
institution University of Oxford
spelling oxford-uuid:23393c76-b7ef-44c2-a06f-3b23e3a6d936 2022-03-26T11:43:09Z Statistical methodology for QTL mapping and genome-wide association studies Thesis http://purl.org/coar/resource_type/c_db06 uuid:23393c76-b7ef-44c2-a06f-3b23e3a6d936 Polonsky Theses Digitisation Project 2009 Antonyuk, A
title Statistical methodology for QTL mapping and genome-wide association studies
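For readers unfamiliar with the frequentist baseline named in the description, the following is a minimal sketch of the standard Cochran-Armitage trend test on a 2x3 case-control genotype table, assuming that is what "the Armitage test" refers to. It is not the thesis's Bayesian model. The function name `armitage_trend_test`, the `WEIGHTS` table, and the genotype scores for the additive, dominant and recessive models are illustrative assumptions (conventional choices), and the p-value relies on the usual normal approximation.

```python
import math

# Conventional genotype scores (assumed here, not taken from the thesis)
# for 0, 1 and 2 copies of the risk allele under each allele effect model.
WEIGHTS = {
    "additive": (0, 1, 2),
    "dominant": (0, 1, 1),
    "recessive": (0, 0, 1),
}


def armitage_trend_test(cases, controls, weights=WEIGHTS["additive"]):
    """Return (z, two_sided_p) for a 2x3 case-control genotype table.

    cases, controls: counts per genotype (0, 1, 2 risk alleles).
    weights: genotype scores defining the trend being tested.
    """
    col_totals = [r + s for r, s in zip(cases, controls)]
    n_cases, n_controls = sum(cases), sum(controls)
    n_total = n_cases + n_controls

    # Trend statistic: weighted sum of case/control imbalances per genotype.
    t_stat = sum(
        w * (r * n_controls - s * n_cases)
        for w, r, s in zip(weights, cases, controls)
    )

    # Variance of the trend statistic under the null of no association.
    spread = sum(w * w * c * (n_total - c) for w, c in zip(weights, col_totals))
    cross = sum(
        weights[i] * weights[j] * col_totals[i] * col_totals[j]
        for i in range(3)
        for j in range(i + 1, 3)
    )
    variance = (n_cases * n_controls / n_total) * (spread - 2 * cross)
    if variance <= 0:
        raise ValueError("table has no variation in the weighted genotype score")

    z = t_stat / math.sqrt(variance)
    # Two-sided p-value from the standard normal approximation.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Made-up counts, purely for illustration.
    for model, scores in WEIGHTS.items():
        z, p = armitage_trend_test((10, 30, 20), (30, 30, 10), scores)
        print(f"{model:9s} z={z:.3f} p={p:.4g}")
```

Swapping the `weights` argument is the only change needed to test the dominant or recessive model; rescaling the scores (for example 0, 2, 4 instead of 0, 1, 2) leaves the statistic unchanged.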