graph-GPA 2.0: improving multi-disease genetic analysis with integration of functional annotation data

Genome-wide association studies (GWAS) have successfully identified a large number of genetic variants associated with traits and diseases. However, it still remains challenging to fully understand the functional mechanisms underlying many associated variants. This is especially the case when we are...

Bibliographic Details
Main Authors: Qiaolan Deng, Arkobrato Gupta, Hyeongseon Jeon, Jin Hyun Nam, Ayse Selen Yilmaz, Won Chang, Maciej Pietrzak, Lang Li, Hang J. Kim, Dongjun Chung
Format: Article
Language: English
Published: Frontiers Media S.A. 2023-07-01
Series: Frontiers in Genetics
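Subjects: genome-wide association studies; GWAS summary statistics; complex traits; genetic correlation; functional annotation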
Online Access: https://www.frontiersin.org/articles/10.3389/fgene.2023.1079198/full
author Qiaolan Deng
Arkobrato Gupta
Hyeongseon Jeon
Jin Hyun Nam
Ayse Selen Yilmaz
Won Chang
Maciej Pietrzak
Lang Li
Hang J. Kim
Dongjun Chung
author_sort Qiaolan Deng
collection DOAJ
description Genome-wide association studies (GWAS) have successfully identified a large number of genetic variants associated with traits and diseases. However, it still remains challenging to fully understand the functional mechanisms underlying many associated variants. This is especially the case when we are interested in variants shared across multiple phenotypes. To address this challenge, we propose graph-GPA 2.0 (GGPA 2.0), a statistical framework to integrate GWAS datasets for multiple phenotypes and incorporate functional annotations within a unified framework. Our simulation studies showed that incorporating functional annotation data using GGPA 2.0 not only improves the detection of disease-associated variants, but also provides a more accurate estimation of relationships among diseases. Next, we analyzed five autoimmune diseases and five psychiatric disorders with the functional annotations derived from GenoSkyline and GenoSkyline-Plus, along with the prior disease graph generated by biomedical literature mining. For autoimmune diseases, GGPA 2.0 identified enrichment for blood-related epigenetic marks, especially B cells and regulatory T cells, across multiple diseases. Psychiatric disorders were enriched for brain-related epigenetic marks, especially the prefrontal cortex and the inferior temporal lobe for bipolar disorder and schizophrenia, respectively. In addition, the pleiotropy between bipolar disorder and schizophrenia was also detected. Finally, we found that GGPA 2.0 is robust to the use of irrelevant and/or incorrect functional annotations. These results demonstrate that GGPA 2.0 can be a powerful tool to identify genetic variants associated with each phenotype or those shared across multiple phenotypes, while also promoting an understanding of functional mechanisms underlying the associated variants.
first_indexed 2024-03-13T00:06:13Z
format Article
id doaj.art-d3df1ef2f8df4e77bedbd48f98b49c79
institution Directory Open Access Journal
issn 1664-8021
language English
last_indexed 2024-03-13T00:06:13Z
publishDate 2023-07-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Genetics
spelling doaj.art-d3df1ef2f8df4e77bedbd48f98b49c79
2023-07-13T03:44:00Z
eng
Frontiers Media S.A.
Frontiers in Genetics
1664-8021
2023-07-01
14
10.3389/fgene.2023.1079198
1079198
graph-GPA 2.0: improving multi-disease genetic analysis with integration of functional annotation data
Qiaolan Deng [0]
Arkobrato Gupta [1]
Hyeongseon Jeon [2]
Hyeongseon Jeon [3]
Jin Hyun Nam [4]
Ayse Selen Yilmaz [5]
Won Chang [6]
Maciej Pietrzak [7]
Lang Li [8]
Hang J. Kim [9]
Dongjun Chung [10]
Dongjun Chung [11]
[0] The Interdisciplinary PhD Program in Biostatistics, The Ohio State University, Columbus, OH, United States
[1] The Interdisciplinary PhD Program in Biostatistics, The Ohio State University, Columbus, OH, United States
[2] Department of Biomedical Informatics, The Ohio State University, Columbus, OH, United States
[3] Pelotonia Institute for Immuno-Oncology, The James Comprehensive Cancer Center, The Ohio State University, Columbus, OH, United States
[4] Division of Big Data Science, Korea University Sejong Campus, Sejong, Republic of Korea
[5] Department of Biomedical Informatics, The Ohio State University, Columbus, OH, United States
[6] Division of Statistics and Data Science, University of Cincinnati, Cincinnati, OH, United States
[7] Department of Biomedical Informatics, The Ohio State University, Columbus, OH, United States
[8] Department of Biomedical Informatics, The Ohio State University, Columbus, OH, United States
[9] Division of Statistics and Data Science, University of Cincinnati, Cincinnati, OH, United States
[10] Department of Biomedical Informatics, The Ohio State University, Columbus, OH, United States
[11] Pelotonia Institute for Immuno-Oncology, The James Comprehensive Cancer Center, The Ohio State University, Columbus, OH, United States
title graph-GPA 2.0: improving multi-disease genetic analysis with integration of functional annotation data
topic genome-wide association studies
GWAS summary statistics
complex traits
genetic correlation
functional annotation
url https://www.frontiersin.org/articles/10.3389/fgene.2023.1079198/full