Design of experiments for fine-mapping quantitative trait loci in livestock populations

Bibliographic Details
Main Authors: Dörte Wittenburg, Sarah Bonk, Michael Doschoris, Henry Reyer
Format: Article
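Language: English
Published: BMC 2020-06-01
Series: BMC Genetics
Subjects: Single nucleotide polymorphism; Statistical power; Target region; SNP-BLUP; Linkage disequilibrium
Online Access: http://link.springer.com/article/10.1186/s12863-020-00871-1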
collection DOAJ
description Abstract
Background: Single nucleotide polymorphisms (SNPs) that have a significant impact on a trait can be identified with genome-wide association studies. High linkage disequilibrium (LD) among SNPs makes it difficult to identify causative variants correctly, so target regions are often reported instead of single SNPs. Sample size not only has a crucial impact on the precision of parameter estimates, it also determines whether a desired level of statistical power can be reached. We study the design of experiments for fine-mapping of signals of a quantitative trait locus in such a target region.
Methods: A multi-locus model makes it possible to identify causative variants simultaneously, to state their positions more precisely, and to account for existing dependencies. Based on the commonly applied SNP-BLUP approach, we determine the z-score statistic for locally testing non-zero SNP effects and investigate its distribution under the alternative hypothesis. This quantity employs the theoretical instead of the observed dependence between SNPs; it can be set up as a function of paternal and maternal LD for any given population structure.
Results: We simulated multiple paternal half-sib families and considered a target region of 1 Mbp. A bimodal distribution of estimated sample size was observed, particularly if more than two causative variants were assumed. The median of the estimates constituted the final proposal of optimal sample size; it was consistently smaller than the sample size estimated from a single-SNP investigation, which was used as a baseline approach. The second mode pointed to inflated sample sizes and could be explained by blocks of varying linkage phases leading to negative correlations between SNPs. Optimal sample size increased almost linearly with the number of signals to be identified but depended much more strongly on the assumed heritability. For instance, three times as many samples were required if heritability was 0.1 compared to 0.3. An R package is provided that comprises all required tools.
Conclusions: Our approach incorporates information about the population structure into the design of experiments. Compared to a conventional method, this leads to a reduced estimate of sample size, enabling the resource-saving design of future experiments for fine-mapping of candidate variants.
first_indexed 2024-04-14T05:19:53Z
id doaj.art-bb3d5b03773c40b1a3dbde2148d37933
institution Directory of Open Access Journals
issn 1471-2156
last_indexed 2024-04-14T05:19:53Z
spelling doaj.art-bb3d5b03773c40b1a3dbde2148d37933; 2022-12-22T02:10:15Z; eng; BMC; BMC Genetics; 1471-2156; 2020-06-01; volume 21, issue 1, pages 1-14; doi:10.1186/s12863-020-00871-1
Dörte Wittenburg (0), Sarah Bonk (1), Michael Doschoris (2), Henry Reyer (3)
0 Leibniz Institute for Farm Animal Biology, Institute of Genetics and Biometry
1 University Medicine Greifswald, Department of Psychiatry and Psychotherapy
2 Leibniz Institute for Farm Animal Biology, Institute of Genetics and Biometry
3 Leibniz Institute for Farm Animal Biology, Institute of Genome Biology
topic Single nucleotide polymorphism
Statistical power
Target region
SNP-BLUP
Linkage disequilibrium