On multi-marker tests for association in case-control studies

Genome-wide association studies (GWAs) have identified thousands of DNA loci associated with a variety of traits. Statistical inference is almost always based on single marker hypothesis tests of association and the respective p-values with Bonferroni correction. Since commercially available genomic...

Bibliographic Details
Main Authors: Margaret A Taub, Holger R Schwender, Samuel G Younkin, Thomas A Louis, Ingo Ruczinski
Format: Article
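Language: English
Published: Frontiers Media S.A. 2013-12-01
Series: Frontiers in Genetics
Subjects: Linkage Disequilibrium; genome-wide association studies; Multi-marker tests; Multiplicity adjustment; Single nucleotide polymorphisms
Online Access: http://journal.frontiersin.org/Journal/10.3389/fgene.2013.00252/full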
_version_ 1818550291998441472
author Margaret A Taub
Holger R Schwender
Samuel G Younkin
Thomas A Louis
Ingo Ruczinski
collection DOAJ
description Genome-wide association studies (GWAs) have identified thousands of DNA loci associated with a variety of traits. Statistical inference is almost always based on single-marker hypothesis tests of association and the respective p-values with Bonferroni correction. Since commercially available genomic arrays interrogate hundreds of thousands or even millions of loci simultaneously, many causal yet undetected loci are believed to exist because the conditional power to achieve a genome-wide significance level can be low, in particular for markers with small effect sizes and low minor allele frequencies and in studies with modest sample sizes. However, the correlation between neighboring markers in the human genome due to linkage disequilibrium (LD), which results in correlated marker test statistics, can be incorporated into multi-marker hypothesis tests, thereby increasing power to detect association. Herein, we quantify the maximum power achievable for multi-marker tests of association in case-control studies, achievable only when the causal marker is known. Using the fact that genotype correlations within an LD block translate into an asymptotically multivariate normal distribution for score test statistics, we develop a set of weights for the markers that maximize the non-centrality parameter, and assess the relative loss of power for other approaches. We find that the method of Conneely and Boehnke (2007), based on the maximum absolute test statistic observed in an LD block, is a practical and powerful method in a variety of settings. We also explore the effect on the power that prior biological or functional knowledge used to narrow down the locus of the causal marker can have, and conclude that this prior knowledge has to be very strong and specific for the power to approach the maximum achievable level, or even beat the power observed for methods such as the one proposed by Conneely and Boehnke (2007).
first_indexed 2024-12-12T08:44:30Z
format Article
id doaj.art-c121f9ecb57c4069879c3fdbd3b8fa1e
institution Directory Open Access Journal
issn 1664-8021
language English
last_indexed 2024-12-12T08:44:30Z
publishDate 2013-12-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Genetics
spelling doaj.art-c121f9ecb57c4069879c3fdbd3b8fa1e 2022-12-22T00:30:37Z eng Frontiers Media S.A. Frontiers in Genetics 1664-8021 2013-12-01 4 10.3389/fgene.2013.00252 62988 On multi-marker tests for association in case-control studies Margaret A Taub (Johns Hopkins University); Holger R Schwender (University of Duesseldorf); Samuel G Younkin (Johns Hopkins University); Thomas A Louis (Johns Hopkins University); Ingo Ruczinski (Johns Hopkins University) http://journal.frontiersin.org/Journal/10.3389/fgene.2013.00252/full
title On multi-marker tests for association in case-control studies
topic Linkage Disequilibrium
genome-wide association studies
Multi-marker tests
Multiplicity adjustment
Single nucleotide polymorphisms
url http://journal.frontiersin.org/Journal/10.3389/fgene.2013.00252/full