The Reasonable Effectiveness of Randomness in Scalable and Integrative Gene Regulatory Network Inference and Beyond

Gene regulation is orchestrated by a vast number of molecules, including transcription factors and co-factors, chromatin regulators, as well as epigenetic mechanisms, and it has been shown that transcriptional misregulation, e.g., caused by mutations in regulatory sequences, is responsible for a ple...

Bibliographic Details
Main Authors: Michael Banf, Thomas Hartwig
Format: Article
Language: English
Published: MDPI AG 2021-12-01
Series: Computation
Subjects: scalable gene regulatory network inference; randomized algorithms; multi-omics data integration
Online Access: https://www.mdpi.com/2079-3197/9/12/146
_version_ 1797505728826572800
author Michael Banf
Thomas Hartwig
collection DOAJ
description Gene regulation is orchestrated by a vast number of molecules, including transcription factors and co-factors, chromatin regulators, as well as epigenetic mechanisms, and it has been shown that transcriptional misregulation, e.g., caused by mutations in regulatory sequences, is responsible for a plethora of diseases, including cancer and developmental or neurological disorders. As a consequence, decoding the architecture of gene regulatory networks has become one of the most important tasks in modern (computational) biology. However, to advance our understanding of the mechanisms involved in the transcriptional apparatus, we need scalable approaches that can deal with the increasing number of large-scale, high-resolution biological datasets. In particular, such approaches need to be capable of efficiently integrating and exploiting the biological and technological heterogeneity of such datasets in order to best infer the underlying, highly dynamic regulatory networks, often in the absence of sufficient ground truth data for model training or testing. With respect to scalability, randomized approaches have proven to be a promising alternative to deterministic methods in computational biology. As an example, one of the top-performing algorithms in a community challenge on gene regulatory network inference from transcriptomic data is based on a random forest regression model. In this concise survey, we aim to highlight how randomized methods may serve as a highly valuable tool, in particular as increasing amounts of large-scale biological experiments and datasets are being collected. Given the complexity and interdisciplinary nature of the gene regulatory network inference problem, we hope our survey may be helpful to both computational and biological scientists. It is our aim to provide a starting point for a dialogue about the concepts, benefits, and caveats of the toolbox of randomized methods, since unravelling the intricate web of highly dynamic regulatory events will be one fundamental step in understanding the mechanisms of life and eventually developing efficient therapies to treat and cure diseases.
first_indexed 2024-03-10T04:22:51Z
format Article
id doaj.art-6593ea0f9f37429888b3a2ce253cec55
institution Directory Open Access Journal
issn 2079-3197
language English
last_indexed 2024-03-10T04:22:51Z
publishDate 2021-12-01
publisher MDPI AG
record_format Article
series Computation
spelling doaj.art-6593ea0f9f37429888b3a2ce253cec55
2023-11-23T07:46:34Z
eng
MDPI AG
Computation, ISSN 2079-3197
2021-12-01, vol. 9, no. 12, art. 146
DOI 10.3390/computation9120146
Michael Banf (EducatedGuess.ai, 57290 Neunkirchen, Germany)
Thomas Hartwig (Max Planck Institute for Plant Breeding Research, 50829 Cologne, Germany)
title The Reasonable Effectiveness of Randomness in Scalable and Integrative Gene Regulatory Network Inference and Beyond
topic scalable gene regulatory network inference
randomized algorithms
multi-omics data integration
url https://www.mdpi.com/2079-3197/9/12/146