A Prism Vote method for individualized risk prediction of traits in genotype data of Multi-population.
Multi-population cohorts offer unprecedented opportunities for profiling disease risk in large samples; however, heterogeneous risk effects underlying complex traits across populations make integrative prediction challenging. In this study, we propose a novel Bayesian probability framework, the Pris...
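Main Authors: | Xiaoxuan Xia, Yexian Zhang, Rui Sun, Yingying Wei, Qi Li, Marc Ka Chun Chong, William Ka Kei Wu, Benny Chung-Ying Zee, Hua Tang, Maggie Haitian Wang |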
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS) 2022-10-01 |
Series: | PLoS Genetics |
Online Access: | https://doi.org/10.1371/journal.pgen.1010443 |
_version_ | 1811155422121295872 |
---|---|
author | Xiaoxuan Xia Yexian Zhang Rui Sun Yingying Wei Qi Li Marc Ka Chun Chong William Ka Kei Wu Benny Chung-Ying Zee Hua Tang Maggie Haitian Wang |
author_facet | Xiaoxuan Xia Yexian Zhang Rui Sun Yingying Wei Qi Li Marc Ka Chun Chong William Ka Kei Wu Benny Chung-Ying Zee Hua Tang Maggie Haitian Wang |
author_sort | Xiaoxuan Xia |
collection | DOAJ |
description | Multi-population cohorts offer unprecedented opportunities for profiling disease risk in large samples; however, heterogeneous risk effects underlying complex traits across populations make integrative prediction challenging. In this study, we propose a novel Bayesian probability framework, the Prism Vote (PV), to construct risk predictions in heterogeneous genetic data. The PV views the trait of an individual as a composite risk from subpopulations, in which stratum-specific predictors can be formed in data of more homogeneous genetic structure. Since each individual is described by a composition of subpopulation memberships, the framework enables individualized risk characterization. Simulations demonstrated that the PV framework, applied with alternative prediction methods, significantly improved prediction accuracy in mixed and admixed populations. The advantage of PV grows as genetic heterogeneity and sample size increase. In two real genome-wide association datasets consisting of multiple populations, we showed that the framework considerably enhanced the prediction accuracy of the linear mixed model in five-group cross-validation. The proposed method offers a new perspective for analyzing an individual's disease risk and improves accuracy for predicting complex traits in genotype data. |
first_indexed | 2024-04-10T04:33:45Z |
format | Article |
id | doaj.art-4b6ab098b18f487a892fbfc60d539e65 |
institution | Directory Open Access Journal |
issn | 1553-7390 1553-7404 |
language | English |
last_indexed | 2024-04-10T04:33:45Z |
publishDate | 2022-10-01 |
publisher | Public Library of Science (PLoS) |
record_format | Article |
series | PLoS Genetics |
title | A Prism Vote method for individualized risk prediction of traits in genotype data of Multi-population. |
title_full | A Prism Vote method for individualized risk prediction of traits in genotype data of Multi-population. |
title_fullStr | A Prism Vote method for individualized risk prediction of traits in genotype data of Multi-population. |
title_full_unstemmed | A Prism Vote method for individualized risk prediction of traits in genotype data of Multi-population. |
title_short | A Prism Vote method for individualized risk prediction of traits in genotype data of Multi-population. |
title_sort | prism vote method for individualized risk prediction of traits in genotype data of multi population |
url | https://doi.org/10.1371/journal.pgen.1010443 |
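
The description above says that the PV treats an individual's trait as a composite risk over subpopulations, with each individual described by a composition of subpopulation memberships. The record does not give the formula, so the following is only a minimal sketch of one plausible reading: a membership-weighted average of stratum-specific predictions. The function name `composite_risk` and its arguments are hypothetical and are not taken from the paper or any published software.

```python
from typing import Sequence


def composite_risk(memberships: Sequence[float],
                   stratum_risks: Sequence[float]) -> float:
    """Combine stratum-specific risk predictions for one individual.

    ASSUMPTION: the composite is a membership-weighted average. This is an
    illustrative reading of the abstract, not the authors' exact estimator.

    memberships   -- non-negative weights, one per subpopulation
                     (normalized here so they sum to 1)
    stratum_risks -- predicted risk from each stratum-specific predictor
    """
    if len(memberships) != len(stratum_risks):
        raise ValueError("need one risk prediction per subpopulation")
    if any(m < 0 for m in memberships):
        raise ValueError("membership weights must be non-negative")
    total = sum(memberships)
    if total <= 0:
        raise ValueError("membership weights must not all be zero")
    # Weighted sum of the per-stratum predictions, divided by the total
    # weight so that unnormalized memberships are handled as well.
    weighted = sum(m * r for m, r in zip(memberships, stratum_risks))
    return weighted / total


if __name__ == "__main__":
    # Hypothetical individual: 70% membership in stratum A, 30% in stratum B.
    print(composite_risk([0.7, 0.3], [0.2, 0.6]))
```

Under this reading, an individual who belongs entirely to one subpopulation simply receives that stratum's prediction, while an admixed individual receives a blend weighted by membership.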