Improved biomarker discovery through a plot twist in transcriptomic data analysis

Bibliographic Details
Main Authors: Núria Sánchez-Baizán, Laia Ribas, Francesc Piferrer
Format: Article
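Language: English
Published: BMC 2022-09-01
Series: BMC Biology
Subjects: Gene expression analysis; Gene networks; Weighted gene co-expression network analysis (WGCNA); Sex determination and differentiation; Gonadal development; Biomarker discovery
Online Access: https://doi.org/10.1186/s12915-022-01398-w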
collection DOAJ
description Abstract
Background Transcriptomic analysis is crucial for understanding the functional elements of the genome. The classic method consists of screening transcriptomic datasets for differentially expressed genes (DEGs). Since 2005, weighted gene co-expression network analysis (WGCNA) has also emerged as a powerful method for exploring relationships between genes. However, an approach that combines both methods by first filtering the transcriptome dataset by DEGs or other criteria and then applying WGCNA (DEGs + WGCNA) has become common. This is of concern because such an approach can alter the underlying architecture of the network under analysis and lead to wrong conclusions. Here, we explore a plot twist in transcriptome data analysis: first applying WGCNA to the entire dataset, which leaves the topology of the network unaffected, and then drawing on the strength and relative simplicity of DEG analysis (WGCNA + DEGs). We tested WGCNA + DEGs against DEGs + WGCNA on publicly available transcriptomic data from one of the most transcriptomically complex tissues and delicate processes: vertebrate gonads undergoing sex differentiation. We further validated the general applicability of our approach by analyzing datasets from three distinct model systems: European sea bass, mouse, and human.
Results In all cases, WGCNA + DEGs clearly outperformed DEGs + WGCNA. First, network model fit, node connectivity measures, and other network statistics improved. Second, the gene lists filtered by each method differed, the number of modules associated with the trait of interest and the number of key genes retained both increased, and GO terms for biological processes provided a more nuanced representation of the biological question under consideration. Lastly, WGCNA + DEGs facilitated biomarker discovery.
Conclusions We propose that building a co-expression network from an entire dataset, and only thereafter filtering by DEGs, should be the method used in transcriptomic studies, regardless of the biological system, species, or question being considered.
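The difference between the two orderings described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code and it is not WGCNA itself (WGCNA is an R package). The soft-threshold power beta=6, the adjacency cutoff of 0.5, the Welch t cutoff of 4.0, the toy expression values, and all function names are assumptions made for illustration. Modules are approximated here as connected components of the thresholded adjacency rather than by topological overlap and hierarchical clustering, and DEGs are called with a bare t-statistic cutoff instead of a full differential expression test with multiple-testing correction.

```python
import math
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def adjacency(expr, beta=6):
    """Unsigned soft-thresholded adjacency: a_ij = |cor(i, j)| ** beta (beta is assumed)."""
    genes = list(expr)
    adj = {g: {} for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            a = abs(pearson(expr[g], expr[h])) ** beta
            adj[g][h] = adj[h][g] = a
    return adj


def modules(adj, cutoff=0.5):
    """Simplified module detection: connected components over edges with adjacency >= cutoff."""
    seen, result = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            g = stack.pop()
            component.add(g)
            for h, a in adj[g].items():
                if a >= cutoff and h not in seen:
                    seen.add(h)
                    stack.append(h)
        result.append(component)
    return result


def degs(expr, groups, t_cutoff=4.0):
    """Genes whose absolute Welch t statistic between group 0 and group 1 exceeds t_cutoff.

    Each group needs at least two samples, because stdev() requires two data points.
    """
    selected = set()
    for g, values in expr.items():
        a = [v for v, grp in zip(values, groups) if grp == 0]
        b = [v for v, grp in zip(values, groups) if grp == 1]
        se = math.sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
        if se > 0 and abs(mean(a) - mean(b)) / se > t_cutoff:
            selected.add(g)
    return selected


def degs_then_wgcna(expr, groups):
    """DEGs + WGCNA: filter first, so the network only ever sees the DEGs."""
    keep = degs(expr, groups)
    return modules(adjacency({g: expr[g] for g in expr if g in keep}))


def wgcna_then_degs(expr, groups):
    """WGCNA + DEGs: build modules from every gene, then mark the DEGs inside each module."""
    keep = degs(expr, groups)
    return [(module, module & keep) for module in modules(adjacency(expr))]


# Toy data (hypothetical): three genes measured in 4 + 4 samples of two groups.
toy_expr = {
    "g1": [1, 2, 3, 4, 10, 11, 12, 13],
    "g2": [2, 3, 4, 5, 11, 12, 13, 14],
    "g3": [5, 1, 4, 2, 3, 5, 1, 4],
}
toy_groups = [0, 0, 0, 0, 1, 1, 1, 1]

print("DEGs + WGCNA:", degs_then_wgcna(toy_expr, toy_groups))
print("WGCNA + DEGs:", wgcna_then_degs(toy_expr, toy_groups))
```

In the first ordering, genes that fail the DEG filter never enter the network, so their connections cannot shape any module. In the second, every gene contributes to the module structure and the DEG filter is applied only when reading the results.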
first_indexed 2024-04-14T07:22:19Z
id doaj.art-1b380dfaf83f44068e13126c61a55296
institution Directory of Open Access Journals
issn 1741-7007
last_indexed 2024-04-14T07:22:19Z
spelling doaj.art-1b380dfaf83f44068e13126c61a55296; 2022-12-22T02:06:07Z; eng; BMC; BMC Biology; 1741-7007; 2022-09-01; volume 20, issue 1, pages 1-26; 10.1186/s12915-022-01398-w; Improved biomarker discovery through a plot twist in transcriptomic data analysis; Núria Sánchez-Baizán, Laia Ribas, Francesc Piferrer (all: Institut de Ciències del Mar (ICM), Spanish National Research Council (CSIC), Barcelona)
title Improved biomarker discovery through a plot twist in transcriptomic data analysis
topic Gene expression analysis
Gene networks
Weighted gene co-expression network analysis (WGCNA)
Sex determination and differentiation
Gonadal development
Biomarker discovery