Bayesian integrative analysis of epigenomic and transcriptomic data identifies Alzheimer's disease candidate genes and networks.
Biomedical research studies have generated large multi-omic datasets to study complex diseases like Alzheimer's disease (AD). An important aim of these studies is the identification of candidate genes that demonstrate congruent disease-related alterations across the different data types measure...
Main Authors: | Hans-Ulrich Klein, Martin Schäfer, David A Bennett, Holger Schwender, Philip L De Jager |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2020-04-01 |
Series: | PLoS Computational Biology |
Online Access: | https://doi.org/10.1371/journal.pcbi.1007771 |
_version_ | 1819260237190791168 |
---|---|
author | Hans-Ulrich Klein, Martin Schäfer, David A Bennett, Holger Schwender, Philip L De Jager |
author_sort | Hans-Ulrich Klein |
collection | DOAJ |
description | Biomedical research studies have generated large multi-omic datasets to study complex diseases like Alzheimer's disease (AD). An important aim of these studies is the identification of candidate genes that demonstrate congruent disease-related alterations across the different data types measured by the study. We developed a new method to detect such candidate genes in large multi-omic case-control studies that measure multiple data types in the same set of samples. The method is based on a gene-centric integrative coefficient quantifying to what degree consistent differences are observed in the different data types. For statistical inference, a Bayesian hierarchical model is used to study the distribution of the integrative coefficient. The model employs a conditional autoregressive prior to integrate a functional gene network and to share information between genes known to be functionally related. We applied the method to an AD dataset consisting of histone acetylation, DNA methylation, and RNA transcription data from human cortical tissue samples of 233 subjects, and we detected 816 genes with consistent differences between persons with AD and controls. The findings were validated in protein data and in RNA transcription data from two independent AD studies. Finally, we found three subnetworks of jointly dysregulated genes within the functional gene network which capture three distinct biological processes: myeloid cell differentiation, protein phosphorylation and synaptic signaling. Further investigation of the myeloid network indicated an upregulation of this network in early stages of AD prior to accumulation of hyperphosphorylated tau and suggested that increased CSF1 transcription in astrocytes may contribute to microglial activation in AD. 
Thus, we developed a method that integrates multiple data types and external knowledge of gene function to detect candidate genes, applied the method to an AD dataset, and identified several disease-related genes and processes, demonstrating the usefulness of the integrative approach. |
first_indexed | 2024-12-23T19:22:43Z |
format | Article |
id | doaj.art-800100dcaf2145998bc4aad6526f68fe |
institution | Directory Open Access Journal |
issn | 1553-734X, 1553-7358 |
language | English |
last_indexed | 2024-12-23T19:22:43Z |
publishDate | 2020-04-01 |
publisher | Public Library of Science (PLoS) |
record_format | Article |
series | PLoS Computational Biology |
spelling | doaj.art-800100dcaf2145998bc4aad6526f68fe; 2022-12-21T17:34:06Z; eng; Public Library of Science (PLoS); PLoS Computational Biology; 1553-734X; 1553-7358; 2020-04-01; volume 16, issue 4, e1007771; 10.1371/journal.pcbi.1007771; Bayesian integrative analysis of epigenomic and transcriptomic data identifies Alzheimer's disease candidate genes and networks.; Hans-Ulrich Klein, Martin Schäfer, David A Bennett, Holger Schwender, Philip L De Jager; https://doi.org/10.1371/journal.pcbi.1007771 |
title | Bayesian integrative analysis of epigenomic and transcriptomic data identifies Alzheimer's disease candidate genes and networks. |
url | https://doi.org/10.1371/journal.pcbi.1007771 |
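
The description mentions a conditional autoregressive (CAR) prior that shares information between functionally related genes. The record does not give the exact form used in the article, so the following is only a minimal sketch of a generic proper CAR prior, not the authors' model. The function names, the toy four-gene network, and the parameters `tau` (precision scale) and `rho` (strength of neighbour dependence) are illustrative assumptions.

```python
# Generic proper CAR prior on a gene network (illustrative sketch only;
# the exact prior used in the article is not specified in this record).

def car_precision(adjacency, tau=1.0, rho=0.9):
    """Return the precision matrix Q = tau * (D - rho * W).

    W is a symmetric 0/1 adjacency matrix with zero diagonal and
    D is the diagonal matrix of node degrees.
    """
    n = len(adjacency)
    degrees = [sum(row) for row in adjacency]
    return [
        [tau * ((degrees[i] if i == j else 0.0) - rho * adjacency[i][j])
         for j in range(n)]
        for i in range(n)
    ]

def car_conditional_mean(adjacency, phi, i, rho=0.9):
    """Prior mean of phi[i] given all other effects.

    Under the CAR prior this is rho times the average effect of the
    neighbours of gene i, which is how related genes borrow strength.
    """
    weights = adjacency[i]
    degree = sum(weights)
    if degree == 0:
        raise ValueError("gene has no neighbours; the prior is improper here")
    return rho * sum(w * x for w, x in zip(weights, phi)) / degree

# Hypothetical network of four genes connected in a chain: 0-1-2-3.
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
Q = car_precision(W)
phi = [1.0, 0.0, 2.0, -1.0]           # made-up gene-level effects
m1 = car_conditional_mean(W, phi, 1)  # pulled toward the mean of genes 0 and 2
```

In this sketch, a gene's prior mean is shrunk toward the average of its network neighbours, so a gene with weak evidence of its own can still be flagged when its functional partners show consistent differences.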