A reproducible approach to high-throughput biological data acquisition and integration

Modern biological research requires rapid, complex, and reproducible integration of multiple experimental results generated both internally and externally (e.g., from public repositories). Although large systematic meta-analyses are among the most effective approaches both for clinical biomarker dis...


Bibliographic Details
Main Authors: Daniela Börnigen, Yo Sup Moon, Gholamali Rahnavard, Levi Waldron, Lauren McIver, Afrah Shafquat, Eric A. Franzosa, Larissa Miropolsky, Christopher Sweeney, Xochitl C. Morgan, Wendy S. Garrett, Curtis Huttenhower
Format: Article
Language: English
Published: PeerJ Inc. 2015-03-01
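Series: PeerJ
Subjects: High-throughput data, Data integration, Data acquisition, Meta-analysis, Heterogeneous data, Reproducibility
Online Access: https://peerj.com/articles/791.pdf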
author Daniela Börnigen
Yo Sup Moon
Gholamali Rahnavard
Levi Waldron
Lauren McIver
Afrah Shafquat
Eric A. Franzosa
Larissa Miropolsky
Christopher Sweeney
Xochitl C. Morgan
Wendy S. Garrett
Curtis Huttenhower
collection DOAJ
description Modern biological research requires rapid, complex, and reproducible integration of multiple experimental results generated both internally and externally (e.g., from public repositories). Although large systematic meta-analyses are among the most effective approaches both for clinical biomarker discovery and for computational inference of biomolecular mechanisms, identifying, acquiring, and integrating relevant experimental results from multiple sources for a given study can be time-consuming and error-prone. To enable efficient and reproducible integration of diverse experimental results, we developed a novel approach for standardized acquisition and analysis of high-throughput and heterogeneous biological data. This allowed, first, novel biomolecular network reconstruction in human prostate cancer, which correctly recovered and extended the NFκB signaling pathway. Next, we investigated host-microbiome interactions. In less than an hour of analysis time, the system retrieved data and integrated six germ-free murine intestinal gene expression datasets to identify the genes most influenced by the gut microbiota, which comprised a set of immune-response and carbohydrate metabolism processes. Finally, we constructed integrated functional interaction networks to compare connectivity of peptide secretion pathways in the model organisms Escherichia coli, Bacillus subtilis, and Pseudomonas aeruginosa.
first_indexed 2024-03-09T07:59:22Z
format Article
id doaj.art-2af1b205feed4303977b54e615be3277
institution Directory of Open Access Journals
issn 2167-8359
language English
last_indexed 2024-03-09T07:59:22Z
publishDate 2015-03-01
publisher PeerJ Inc.
record_format Article
series PeerJ
spelling doaj.art-2af1b205feed4303977b54e615be3277 2023-12-03T00:49:02Z eng PeerJ Inc. PeerJ 2167-8359 2015-03-01 3 e791 10.7717/peerj.791 791
A reproducible approach to high-throughput biological data acquisition and integration
Daniela Börnigen (0): Biostatistics Department, Harvard School of Public Health, Boston, MA, USA
Yo Sup Moon (1): Biostatistics Department, Harvard School of Public Health, Boston, MA, USA
Gholamali Rahnavard (2): Biostatistics Department, Harvard School of Public Health, Boston, MA, USA
Levi Waldron (3): Biostatistics Department, Harvard School of Public Health, Boston, MA, USA
Lauren McIver (4): Biostatistics Department, Harvard School of Public Health, Boston, MA, USA
Afrah Shafquat (5): Biostatistics Department, Harvard School of Public Health, Boston, MA, USA
Eric A. Franzosa (6): Biostatistics Department, Harvard School of Public Health, Boston, MA, USA
Larissa Miropolsky (7): Biostatistics Department, Harvard School of Public Health, Boston, MA, USA
Christopher Sweeney (8): Dana-Farber Cancer Institute, Boston, MA, USA
Xochitl C. Morgan (9): Biostatistics Department, Harvard School of Public Health, Boston, MA, USA
Wendy S. Garrett (10): The Broad Institute of MIT and Harvard, Cambridge, MA, USA
Curtis Huttenhower (11): Biostatistics Department, Harvard School of Public Health, Boston, MA, USA
Modern biological research requires rapid, complex, and reproducible integration of multiple experimental results generated both internally and externally (e.g., from public repositories). Although large systematic meta-analyses are among the most effective approaches both for clinical biomarker discovery and for computational inference of biomolecular mechanisms, identifying, acquiring, and integrating relevant experimental results from multiple sources for a given study can be time-consuming and error-prone. To enable efficient and reproducible integration of diverse experimental results, we developed a novel approach for standardized acquisition and analysis of high-throughput and heterogeneous biological data. This allowed, first, novel biomolecular network reconstruction in human prostate cancer, which correctly recovered and extended the NFκB signaling pathway. Next, we investigated host-microbiome interactions. In less than an hour of analysis time, the system retrieved data and integrated six germ-free murine intestinal gene expression datasets to identify the genes most influenced by the gut microbiota, which comprised a set of immune-response and carbohydrate metabolism processes. Finally, we constructed integrated functional interaction networks to compare connectivity of peptide secretion pathways in the model organisms Escherichia coli, Bacillus subtilis, and Pseudomonas aeruginosa.
https://peerj.com/articles/791.pdf
High-throughput data
Data integration
Data acquisition
Meta-analysis
Heterogeneous data
Reproducibility
title A reproducible approach to high-throughput biological data acquisition and integration
topic High-throughput data
Data integration
Data acquisition
Meta-analysis
Heterogeneous data
Reproducibility
url https://peerj.com/articles/791.pdf