Bayesian integrative analysis of epigenomic and transcriptomic data identifies Alzheimer's disease candidate genes and networks.

Biomedical research studies have generated large multi-omic datasets to study complex diseases like Alzheimer's disease (AD). An important aim of these studies is the identification of candidate genes that demonstrate congruent disease-related alterations across the different data types measure...

Bibliographic Details
Main Authors: Hans-Ulrich Klein, Martin Schäfer, David A Bennett, Holger Schwender, Philip L De Jager
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2020-04-01
Series: PLoS Computational Biology
Online Access: https://doi.org/10.1371/journal.pcbi.1007771
author Hans-Ulrich Klein
Martin Schäfer
David A Bennett
Holger Schwender
Philip L De Jager
collection DOAJ
description Biomedical research studies have generated large multi-omic datasets to study complex diseases like Alzheimer's disease (AD). An important aim of these studies is the identification of candidate genes that demonstrate congruent disease-related alterations across the different data types measured by the study. We developed a new method to detect such candidate genes in large multi-omic case-control studies that measure multiple data types in the same set of samples. The method is based on a gene-centric integrative coefficient quantifying to what degree consistent differences are observed in the different data types. For statistical inference, a Bayesian hierarchical model is used to study the distribution of the integrative coefficient. The model employs a conditional autoregressive prior to integrate a functional gene network and to share information between genes known to be functionally related. We applied the method to an AD dataset consisting of histone acetylation, DNA methylation, and RNA transcription data from human cortical tissue samples of 233 subjects, and we detected 816 genes with consistent differences between persons with AD and controls. The findings were validated in protein data and in RNA transcription data from two independent AD studies. Finally, we found three subnetworks of jointly dysregulated genes within the functional gene network that capture three distinct biological processes: myeloid cell differentiation, protein phosphorylation, and synaptic signaling. Further investigation of the myeloid network indicated an upregulation of this network in early stages of AD, prior to accumulation of hyperphosphorylated tau, and suggested that increased CSF1 transcription in astrocytes may contribute to microglial activation in AD. Thus, we developed a method that integrates multiple data types and external knowledge of gene function to detect candidate genes, applied the method to an AD dataset, and identified several disease-related genes and processes, demonstrating the usefulness of the integrative approach.
first_indexed 2024-12-23T19:22:43Z
format Article
id doaj.art-800100dcaf2145998bc4aad6526f68fe
institution Directory Open Access Journal
issn 1553-734X
1553-7358
language English
last_indexed 2024-12-23T19:22:43Z
spelling doaj.art-800100dcaf2145998bc4aad6526f68fe; 2022-12-21T17:34:06Z; eng; Public Library of Science (PLoS); PLoS Computational Biology; 1553-734X; 1553-7358; 2020-04-01; volume 16, issue 4, e1007771; 10.1371/journal.pcbi.1007771; https://doi.org/10.1371/journal.pcbi.1007771
title Bayesian integrative analysis of epigenomic and transcriptomic data identifies Alzheimer's disease candidate genes and networks.
url https://doi.org/10.1371/journal.pcbi.1007771