Efficient Bayesian mixed-model analysis increases association power in large cohorts
Linear mixed models are a powerful statistical tool for identifying genetic associations and avoiding confounding. However, existing methods are computationally intractable in large cohorts and may not optimize power. All existing methods require time cost O(MN²) (where N is the number of samples an...
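Main Authors: | Loh, Po-Ru; Bulik-Sullivan, Brendan K; Vilhjálmsson, Bjarni J; Salem, Rany M; Chasman, Daniel I; Ridker, Paul M; Neale, Benjamin M; Patterson, Nick; Price, Alkes L; Tucker, George Jay; Finucane, Hilary Kiyo; Berger Leighton, Bonnie |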
---|---|
Other Authors: | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory |
Format: | Article |
Language: | en_US |
Published: | Nature Publishing Group, 2017 |
Online Access: | http://hdl.handle.net/1721.1/110185 https://orcid.org/0000-0003-3864-9828 https://orcid.org/0000-0002-2724-7228 |
_version_ | 1826208367857631232 |
---|---|
author | Loh, Po-Ru Bulik-Sullivan, Brendan K Vilhjálmsson, Bjarni J Salem, Rany M Chasman, Daniel I Ridker, Paul M Neale, Benjamin M Patterson, Nick Price, Alkes L Tucker, George Jay Finucane, Hilary Kiyo Berger Leighton, Bonnie |
author2 | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory |
author_sort | Loh, Po-Ru |
collection | MIT |
description | Linear mixed models are a powerful statistical tool for identifying genetic associations and avoiding confounding. However, existing methods are computationally intractable in large cohorts and may not optimize power. All existing methods require time cost O(MN²) (where N is the number of samples and M is the number of SNPs) and implicitly assume an infinitesimal genetic architecture in which effect sizes are normally distributed, which can limit power. Here we present a far more efficient mixed-model association method, BOLT-LMM, which requires only a small number of O(MN) time iterations and increases power by modeling more realistic, non-infinitesimal genetic architectures via a Bayesian mixture prior on marker effect sizes. We applied BOLT-LMM to 9 quantitative traits in 23,294 samples from the Women's Genome Health Study (WGHS) and observed significant increases in power, consistent with simulations. Theory and simulations show that the boost in power increases with cohort size, making BOLT-LMM appealing for genome-wide association studies in large cohorts. |
first_indexed | 2024-09-23T14:04:39Z |
format | Article |
id | mit-1721.1/110185 |
institution | Massachusetts Institute of Technology |
language | en_US |
last_indexed | 2024-09-23T14:04:39Z |
publishDate | 2017 |
publisher | Nature Publishing Group |
record_format | dspace |
spelling | mit-1721.1/110185 2022-09-28T18:13:30Z Efficient Bayesian mixed-model analysis increases association power in large cohorts Loh, Po-Ru Bulik-Sullivan, Brendan K Vilhjálmsson, Bjarni J Salem, Rany M Chasman, Daniel I Ridker, Paul M Neale, Benjamin M Patterson, Nick Price, Alkes L Tucker, George Jay Finucane, Hilary Kiyo Berger Leighton, Bonnie Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology. Department of Mathematics Tucker, George Jay Finucane, Hilary Kiyo Berger Leighton, Bonnie Linear mixed models are a powerful statistical tool for identifying genetic associations and avoiding confounding. However, existing methods are computationally intractable in large cohorts and may not optimize power. All existing methods require time cost O(MN²) (where N is the number of samples and M is the number of SNPs) and implicitly assume an infinitesimal genetic architecture in which effect sizes are normally distributed, which can limit power. Here we present a far more efficient mixed-model association method, BOLT-LMM, which requires only a small number of O(MN) time iterations and increases power by modeling more realistic, non-infinitesimal genetic architectures via a Bayesian mixture prior on marker effect sizes. We applied BOLT-LMM to 9 quantitative traits in 23,294 samples from the Women's Genome Health Study (WGHS) and observed significant increases in power, consistent with simulations. Theory and simulations show that the boost in power increases with cohort size, making BOLT-LMM appealing for genome-wide association studies in large cohorts. National Institutes of Health (U.S.) (grant R01 HG006399) National Institutes of Health (U.S.) (fellowship F32 HG007805) Hertz Foundation 2017-06-22T21:32:59Z 2017-06-22T21:32:59Z 2015-03 2014-07 Article http://purl.org/eprint/type/JournalArticle 1061-4036 1546-1718 http://hdl.handle.net/1721.1/110185 Loh, Po-Ru, George Tucker, Brendan K Bulik-Sullivan, Bjarni J Vilhjálmsson, Hilary K Finucane, Rany M Salem, Daniel I Chasman, et al. “Efficient Bayesian Mixed-Model Analysis Increases Association Power in Large Cohorts.” Nat Genet 47, no. 3 (February 2, 2015): 284–290. © 2015 Nature America, Inc. https://orcid.org/0000-0003-3864-9828 https://orcid.org/0000-0002-2724-7228 en_US http://dx.doi.org/10.1038/ng.3190 Nature Genetics Creative Commons Attribution-Noncommercial-Share Alike http://creativecommons.org/licenses/by-nc-sa/4.0/ application/pdf Nature Publishing Group PMC |
title | Efficient Bayesian mixed-model analysis increases association power in large cohorts |
url | http://hdl.handle.net/1721.1/110185 https://orcid.org/0000-0003-3864-9828 https://orcid.org/0000-0002-2724-7228 |
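
The cost contrast stated in the description (O(MN²) for existing methods versus a small number of O(MN) iterations) can be illustrated with a minimal sketch. This is not the BOLT-LMM implementation. It only shows, under stated assumptions, how a linear system involving an N × N genetic relationship matrix can be solved iteratively without ever forming that matrix, because each matrix-vector product reduces to two passes over the N × M genotype matrix. The function names `matvec_implicit` and `conjugate_gradient`, the variance parameters `sigma_g2` and `sigma_e2`, and the random toy data are all hypothetical and chosen for illustration.

```python
import random


def matvec_implicit(X, v, sigma_g2, sigma_e2):
    """Return (sigma_g2 * X X^T / M + sigma_e2 * I) v in O(MN) time.

    X is an N x M list of lists (samples x SNPs). The N x N matrix
    X X^T is never built; forming it explicitly would cost O(M N^2).
    """
    n, m = len(X), len(X[0])
    # First pass: u = X^T v, a length-M vector, cost O(MN).
    u = [0.0] * m
    for i in range(n):
        vi = v[i]
        row = X[i]
        for j in range(m):
            u[j] += row[j] * vi
    # Second pass: X u / M plus the diagonal term, cost O(MN).
    return [
        sigma_g2 * sum(xij * uj for xij, uj in zip(X[i], u)) / m + sigma_e2 * v[i]
        for i in range(n)
    ]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def conjugate_gradient(matvec, b, tol=1e-10, max_iter=200):
    """Solve A x = b for symmetric positive-definite A given only v -> A v."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x, with x = 0 initially
    p = list(r)          # search direction
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs ** 0.5 < tol:
            break
        Ap = matvec(p)
        alpha = rs / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x


# Toy data (assumption): random "genotypes" and phenotype, not real SNP data.
random.seed(0)
N, M = 20, 50
X = [[random.gauss(0.0, 1.0) for _ in range(M)] for _ in range(N)]
y = [random.gauss(0.0, 1.0) for _ in range(N)]


def A(v):
    return matvec_implicit(X, v, 0.5, 0.5)


solution = conjugate_gradient(A, y)
residual = max(abs(a - b) for a, b in zip(A(solution), y))
print("max residual:", residual)
```

Each conjugate-gradient iteration calls `matvec_implicit` once, so the total cost is the iteration count times O(MN). Whether the iteration count stays small depends on the conditioning of the system, which the toy data here does not probe.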