A simulation study investigating power estimates in phenome-wide association studies

Bibliographic Details
Main Authors: Anurag Verma, Yuki Bradford, Scott Dudek, Anastasia M. Lucas, Shefali S. Verma, Sarah A. Pendergrass, Marylyn D. Ritchie
Format: Article
Language: English
Published: BMC 2018-04-01
Series: BMC Bioinformatics
Subjects: PheWAS, EHR, ICD-9 codes, Power analysis, Simulation study
Online Access: http://link.springer.com/article/10.1186/s12859-018-2135-0
author Anurag Verma
Yuki Bradford
Scott Dudek
Anastasia M. Lucas
Shefali S. Verma
Sarah A. Pendergrass
Marylyn D. Ritchie
collection DOAJ
description Abstract
Background: Phenome-wide association studies (PheWAS) are a high-throughput approach for evaluating associations between genetic variants and a wide range of phenotypic measures. In a PheWAS, sample sizes for quantitative traits and the numbers of cases and controls for binary traits vary across the many phenotypes of interest, which can affect the statistical power to detect associations. This study investigates the parameters that affect the estimation of statistical power in PheWAS, including sample size, case-control ratio, minor allele frequency, and disease penetrance.
Results: We performed a PheWAS simulation study in which we investigated how statistical power varies with overall sample size, number of cases, case-control ratio, minor allele frequency, and disease penetrance. The simulation was performed on both binary and quantitative phenotypic measures. Our simulation on binary traits suggests that the number of cases has more impact on statistical power than the case-to-control ratio; we also found that 200 or more cases maintain the statistical power to identify associations for common variants. For quantitative traits, a sample size of 1000 or more individuals performed best in the power calculations. We focused on common genetic variants (MAF > 0.01) in this study; in future studies, we will extend this effort to similar simulations on rare variants.
Conclusions: This study provides a series of PheWAS simulation analyses that can be used to estimate statistical power for some potential scenarios. These results can be used to provide guidelines for appropriate study design in future PheWAS analyses.
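The kind of binary-trait power simulation the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes Hardy-Weinberg genotype frequencies, a single variant, an allelic 1-df chi-square test, and an uncorrected significance threshold. The function names (`estimate_power` and its helpers) and all parameter values are hypothetical choices made for the example.

```python
import math
import random


def genotype_probs(maf):
    """Hardy-Weinberg frequencies for 0, 1, 2 copies of the minor allele."""
    q = maf
    p = 1.0 - q
    return [p * p, 2 * p * q, q * q]


def conditional_genotype_probs(maf, penetrance):
    """Genotype distributions among cases and among controls (Bayes' rule).

    penetrance[g] is the assumed probability of disease given g minor alleles.
    """
    prior = genotype_probs(maf)
    case_w = [g * f for g, f in zip(prior, penetrance)]
    ctrl_w = [g * (1.0 - f) for g, f in zip(prior, penetrance)]
    case_total, ctrl_total = sum(case_w), sum(ctrl_w)
    return ([w / case_total for w in case_w],
            [w / ctrl_total for w in ctrl_w])


def allelic_p_value(case_minor, n_cases, ctrl_minor, n_controls):
    """P-value of the 1-df chi-square test on the 2x2 allele-count table."""
    a, b = case_minor, 2 * n_cases - case_minor
    c, d = ctrl_minor, 2 * n_controls - ctrl_minor
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 1.0  # degenerate table: no evidence of association
    chi2 = (a + b + c + d) * (a * d - b * c) ** 2 / denom
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def estimate_power(n_cases, n_controls, maf, penetrance,
                   alpha=0.05, n_reps=500, seed=1):
    """Fraction of simulated datasets in which the variant is significant."""
    rng = random.Random(seed)
    case_p, ctrl_p = conditional_genotype_probs(maf, penetrance)
    hits = 0
    for _ in range(n_reps):
        case_minor = sum(rng.choices([0, 1, 2], weights=case_p, k=n_cases))
        ctrl_minor = sum(rng.choices([0, 1, 2], weights=ctrl_p, k=n_controls))
        if allelic_p_value(case_minor, n_cases, ctrl_minor, n_controls) < alpha:
            hits += 1
    return hits / n_reps


if __name__ == "__main__":
    pen = [0.05, 0.10, 0.20]  # illustrative penetrances, not from the paper
    for cases, controls in [(20, 200), (200, 200), (200, 2000)]:
        power = estimate_power(cases, controls, maf=0.3, penetrance=pen)
        print(cases, controls, power)
```

Varying `n_cases` while holding the case-to-control ratio fixed, and vice versa, is one way to explore the comparison the abstract reports; a real PheWAS would also need a multiple-testing-adjusted `alpha`.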
first_indexed 2024-04-14T02:43:37Z
format Article
id doaj.art-9504dcdb14c74f35a8705d8ec0ff5273
institution Directory Open Access Journal
issn 1471-2105
language English
last_indexed 2024-04-14T02:43:37Z
publishDate 2018-04-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-9504dcdb14c74f35a8705d8ec0ff5273
2022-12-22T02:16:39Z
eng
BMC
BMC Bioinformatics
1471-2105
2018-04-01
Volume 19, issue 1, pages 1-18
10.1186/s12859-018-2135-0
A simulation study investigating power estimates in phenome-wide association studies
Anurag Verma (Department of Genetics and Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine)
Yuki Bradford (Department of Genetics and Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine)
Scott Dudek (Department of Genetics and Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine)
Anastasia M. Lucas (Department of Genetics and Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine)
Shefali S. Verma (Department of Genetics and Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine)
Sarah A. Pendergrass (Biomedical and Translational Informatics)
Marylyn D. Ritchie (Department of Genetics and Institute for Biomedical Informatics, University of Pennsylvania, Perelman School of Medicine)
title A simulation study investigating power estimates in phenome-wide association studies
topic PheWAS
EHR
ICD-9 codes
Power analysis
Simulation study
url http://link.springer.com/article/10.1186/s12859-018-2135-0