A Bayesian data fusion based approach for learning genome-wide transcriptional regulatory networks

Abstract Background: Reverse engineering of transcriptional regulatory networks (TRN) from genomics data has always represented a computational challenge in Systems Biology. The major issue is modeling the complex crosstalk among transcription factors (TFs) and their target genes, with a method able t...

Bibliographic Details
Main Authors: Elisabetta Sauta, Andrea Demartini, Francesca Vitali, Alberto Riva, Riccardo Bellazzi
Format: Article
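Language: English
Published: BMC 2020-05-01
Series: BMC Bioinformatics
Subjects: Genomic transcriptional networks; omics-data fusion; Bayesian networks; Hybrid structure learning algorithm
Online Access: http://link.springer.com/article/10.1186/s12859-020-3510-1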
_version_ 1818482718099374080
author Elisabetta Sauta
Andrea Demartini
Francesca Vitali
Alberto Riva
Riccardo Bellazzi
author_sort Elisabetta Sauta
collection DOAJ
description Abstract. Background: Reverse engineering of transcriptional regulatory networks (TRNs) from genomics data has long been a computational challenge in Systems Biology. The major issue is modeling the complex crosstalk among transcription factors (TFs) and their target genes with a method able to handle both the large number of interacting variables and the noise in the available heterogeneous experimental sources of information. Results: In this work, we propose a data fusion approach that integrates complementary omics data as prior knowledge within a Bayesian framework in order to learn and model large-scale transcriptional networks. We develop a hybrid structure-learning algorithm that jointly combines TF ChIP-Sequencing data and gene expression compendia to reconstruct TRNs from a genome-wide perspective. Applying our method to high-throughput data, we verified its ability to deal with the complexity of a genomic TRN, providing a snapshot of the synergistic regulatory activity of TFs. Given the noisy nature of data-driven prior knowledge, which potentially contains incorrect information, we also tested the method’s robustness to false priors on a benchmark dataset, comparing the proposed approach to other regulatory network reconstruction algorithms. We demonstrated the effectiveness of our framework by evaluating structural commonalities between our learned genomic network and existing networks inferred by different methods based on DNA binding information. Conclusions: This Bayesian omics-data fusion methodology provides a genome-wide picture of the transcriptional interplay, helping to unravel key hierarchical transcriptional interactions that can subsequently be investigated. It represents a promising learning approach for multi-layered genomic data integration, given its robustness to noisy sources and its framework tailored to high-dimensional data.
first_indexed 2024-12-10T11:50:56Z
format Article
id doaj.art-adba881ae43a4eec94faa079be890550
institution Directory Open Access Journal
issn 1471-2105
language English
last_indexed 2024-12-10T11:50:56Z
publishDate 2020-05-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-adba881ae43a4eec94faa079be890550 2022-12-22T01:49:56Z eng BMC BMC Bioinformatics 1471-2105 2020-05-01 211128 10.1186/s12859-020-3510-1
A Bayesian data fusion based approach for learning genome-wide transcriptional regulatory networks
Elisabetta Sauta: Department of Electrical, Computer and Biomedical Engineering, University of Pavia
Andrea Demartini: Department of Electrical, Computer and Biomedical Engineering, University of Pavia
Francesca Vitali: Center for Biomedical Informatics and Biostatistics, Dept. of Medicine, The University of Arizona Health Sciences
Alberto Riva: Bioinformatics Core, Interdisciplinary Center for Biotechnology Research, University of Florida
Riccardo Bellazzi: Department of Electrical, Computer and Biomedical Engineering, University of Pavia
title A Bayesian data fusion based approach for learning genome-wide transcriptional regulatory networks
topic Genomic transcriptional networks
omics-data fusion
Bayesian networks
Hybrid structure learning algorithm
url http://link.springer.com/article/10.1186/s12859-020-3510-1
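
The description above speaks of using omics data as prior knowledge within a Bayesian framework. As a rough illustration of that general idea only, and not the authors' algorithm, the sketch below adds a log structure prior to a data-driven score for a candidate network. Everything in it is an assumption made for the example: the function names, the independent per-edge prior probabilities (imagined here as derived from ChIP-seq binding evidence), the gene and TF labels, and the numeric data scores.

```python
import math


def log_structure_prior(edges, prior_prob):
    """Log prior of a candidate network under independent edge priors.

    edges: set of (tf, gene) pairs present in the candidate network.
    prior_prob: dict mapping (tf, gene) to an assumed probability that the
        edge exists (hypothetical values, e.g. from binding evidence).
    Edges without an entry are treated as uninformative (p = 0.5); they add
    the same constant to every network over a fixed candidate set, so they
    are omitted here.
    """
    total = 0.0
    for edge, p in prior_prob.items():
        # A present edge contributes log p, an absent one log(1 - p).
        total += math.log(p) if edge in edges else math.log(1.0 - p)
    return total


def posterior_score(data_score, edges, prior_prob):
    """Unnormalized log posterior: data log score plus log structure prior."""
    return data_score + log_structure_prior(edges, prior_prob)


# Hypothetical prior: binding evidence supports TF1 -> geneA, not TF2 -> geneA.
prior = {("TF1", "geneA"): 0.9, ("TF2", "geneA"): 0.1}
net_a = {("TF1", "geneA")}
net_b = {("TF2", "geneA")}

# Made-up data scores: when the data barely separate the two networks,
# the prior decides in favor of net_a.
weak_a = posterior_score(-100.0, net_a, prior)
weak_b = posterior_score(-99.0, net_b, prior)

# When the data strongly favor net_b, they outweigh the prior.
strong_b = posterior_score(-90.0, net_b, prior)
```

In this toy setting the prior tips the balance only when the expression data are close to indifferent, while strong data support can still override a misleading prior. That is one simple way to picture robustness to false priors; the record does not say how the published method achieves it.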