Investigation of heteroscedasticity in polygenic risk scores across 15 quantitative traits
The polygenic risk score (PRS) can be used to stratify individuals at high risk of disease and to predict complex traits of individuals in a population. Previous studies developed PRS-based prediction models using linear regression and evaluated the predictive performance of the models using the R2...
Main Authors: | Hyein Jung, Hae-Un Jung, Eun Ju Baek, Ju Yeon Chung, Shin Young Kwon, Ji-One Kang, Ji Eun Lim, Bermseok Oh |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2023-05-01 |
Series: | Frontiers in Genetics |
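Subjects: | polygenic risk score (PRS); linear regression model; quantitative trait; prediction accuracy; heteroscedasticity |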
Online Access: | https://www.frontiersin.org/articles/10.3389/fgene.2023.1150889/full |
_version_ | 1797830934870884352 |
---|---|
author | Hyein Jung; Hae-Un Jung; Eun Ju Baek; Ju Yeon Chung; Shin Young Kwon; Ji-One Kang; Ji Eun Lim; Bermseok Oh |
author_facet | Hyein Jung; Hae-Un Jung; Eun Ju Baek; Ju Yeon Chung; Shin Young Kwon; Ji-One Kang; Ji Eun Lim; Bermseok Oh |
author_sort | Hyein Jung |
collection | DOAJ |
description | The polygenic risk score (PRS) can be used to stratify individuals at high risk of disease and to predict complex traits of individuals in a population. Previous studies developed PRS-based prediction models using linear regression and evaluated the predictive performance of the models using the R² value. A key assumption of linear regression is homoscedasticity: the variance of the residuals should be constant at each level of the predictor variables. However, some studies show that PRS models exhibit heteroscedasticity between the PRS and traits. This study analyzes whether heteroscedasticity exists in PRS models of diverse disease-related traits and, if so, whether it affects the accuracy of PRS-based prediction in 354,761 Europeans from the UK Biobank. We constructed PRSs for 15 quantitative traits using LDpred2 and tested for heteroscedasticity between the PRSs and the 15 traits using three tests: the Breusch-Pagan (BP) test, the score test, and the F test. Thirteen of the fifteen traits showed significant heteroscedasticity. Further replication using new PRSs from the PGS Catalog and independent samples (N = 23,620) from the UK Biobank confirmed the heteroscedasticity in ten traits. As a result, ten of the fifteen quantitative traits showed statistically significant heteroscedasticity between the PRS and the trait. The variance of the residuals increased as the PRS increased, and the prediction accuracy at each level of the PRS tended to decrease as the variance of the residuals increased. In conclusion, heteroscedasticity was frequently observed in the PRS-based prediction models of quantitative traits, and the accuracy of the predictive model may differ according to PRS values. Therefore, prediction models using the PRS should be constructed with heteroscedasticity taken into account. |
first_indexed | 2024-04-09T13:44:55Z |
format | Article |
id | doaj.art-b039f4e167a74594bac6030410bc703d |
institution | Directory of Open Access Journals |
issn | 1664-8021 |
language | English |
last_indexed | 2024-04-09T13:44:55Z |
publishDate | 2023-05-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Genetics |
spelling | doaj.art-b039f4e167a74594bac6030410bc703d; 2023-05-09T05:38:04Z; eng; Frontiers Media S.A.; Frontiers in Genetics; 1664-8021; 2023-05-01; 14; 10.3389/fgene.2023.1150889; 1150889; Investigation of heteroscedasticity in polygenic risk scores across 15 quantitative traits; Hyein Jung [0]; Hae-Un Jung [1]; Eun Ju Baek [2]; Ju Yeon Chung [3]; Shin Young Kwon [4]; Ji-One Kang [5]; Ji Eun Lim [6]; Bermseok Oh [7, 8, 9]; [0, 1, 3, 4, 7] Department of Biomedical Science, Graduate School, Kyung Hee University, Seoul, Republic of Korea; [2, 8] Mendel, Seoul, Republic of Korea; [5, 6, 9] Department of Biochemistry and Molecular Biology, School of Medicine, Kyung Hee University, Seoul, Republic of Korea; https://www.frontiersin.org/articles/10.3389/fgene.2023.1150889/full; polygenic risk score (PRS); linear regression model; quantitative trait; prediction accuracy; heteroscedasticity |
title | Investigation of heteroscedasticity in polygenic risk scores across 15 quantitative traits |
topic | polygenic risk score (PRS); linear regression model; quantitative trait; prediction accuracy; heteroscedasticity |
url | https://www.frontiersin.org/articles/10.3389/fgene.2023.1150889/full |
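
The heteroscedasticity check named in the description can be illustrated with a small sketch. This is not the authors' pipeline: it uses simulated data instead of UK Biobank or LDpred2 output, a single predictor (the PRS), and the studentized (Koenker) form of the Breusch-Pagan statistic, n times the R² of regressing squared residuals on the PRS. Which BP variant the article used is not stated in this record, so that choice is an assumption. The function names (`ols_fit`, `breusch_pagan`, `simulate`) and the simulation parameters are hypothetical.

```python
import math
import random


def ols_fit(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(x, y):
    """Coefficient of determination of the simple regression y ~ x."""
    intercept, slope = ols_fit(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def breusch_pagan(prs, trait):
    """Studentized Breusch-Pagan test for trait ~ prs with one predictor.

    Returns (LM statistic, p-value). With one predictor the statistic is
    compared to a chi-square distribution with 1 degree of freedom, whose
    upper tail is erfc(sqrt(x / 2)).
    """
    intercept, slope = ols_fit(prs, trait)
    sq_resid = [(t - (intercept + slope * p)) ** 2 for p, t in zip(prs, trait)]
    # Guard against a tiny negative R^2 caused by floating-point rounding.
    lm = max(0.0, len(prs) * r_squared(prs, sq_resid))
    return lm, math.erfc(math.sqrt(lm / 2.0))


def simulate(n, hetero, seed=0):
    """Simulated PRS and trait; residual SD grows with the PRS if hetero is True."""
    rng = random.Random(seed)
    prs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    trait = [
        0.5 * p + rng.gauss(0.0, math.exp(0.3 * p) if hetero else 1.0)
        for p in prs
    ]
    return prs, trait


if __name__ == "__main__":
    for hetero in (False, True):
        prs, trait = simulate(5000, hetero)
        lm, p_value = breusch_pagan(prs, trait)
        print(f"hetero={hetero}: LM={lm:.2f}, p={p_value:.3g}")
```

A small p-value indicates that the residual variance changes with the PRS, which is the pattern the description reports for most of the fifteen traits.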