Improved biomarker discovery through a plot twist in transcriptomic data analysis
Abstract Background Transcriptomic analysis is crucial for understanding the functional elements of the genome, with the classic method consisting of screening transcriptomics datasets for differentially expressed genes (DEGs). Additionally, since 2005, weighted gene co-expression network analysis (...
Main Authors: | Núria Sánchez-Baizán, Laia Ribas, Francesc Piferrer |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2022-09-01 |
Series: | BMC Biology |
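Subjects: | Gene expression analysis; Gene networks; Weighted gene co-expression network analysis (WGCNA); Sex determination and differentiation; Gonadal development; Biomarker discovery |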
Online Access: | https://doi.org/10.1186/s12915-022-01398-w |
_version_ | 1818017091186327552 |
---|---|
author | Núria Sánchez-Baizán, Laia Ribas, Francesc Piferrer |
author_sort | Núria Sánchez-Baizán |
collection | DOAJ |
description | Abstract Background Transcriptomic analysis is crucial for understanding the functional elements of the genome, with the classic method consisting of screening transcriptomics datasets for differentially expressed genes (DEGs). Additionally, since 2005, weighted gene co-expression network analysis (WGCNA) has emerged as a powerful method to explore relationships between genes. However, an approach combining both methods, i.e., filtering the transcriptome dataset by DEGs or other criteria, followed by WGCNA (DEGs + WGCNA), has become common. This is of concern because such an approach can alter the underlying architecture of the network under analysis and lead to wrong conclusions. Here, we explore a plot twist in transcriptome data analysis: applying WGCNA first to exploit entire datasets without affecting the topology of the network, followed by the strength and relative simplicity of DEG analysis (WGCNA + DEGs). We tested WGCNA + DEGs against DEGs + WGCNA on publicly available transcriptomics data from one of the most transcriptomically complex tissues and delicate processes: vertebrate gonads undergoing sex differentiation. We further validated the general applicability of our approach through analysis of datasets from three distinct model systems: European sea bass, mouse, and human. Results In all cases, WGCNA + DEGs clearly outperformed DEGs + WGCNA. First, the network model fit, node connectivity measures, and other network statistics improved. Second, the gene lists filtered by each method were different, the number of modules associated with the trait of interest and of key genes retained increased, and GO terms of biological processes provided a more nuanced representation of the biological question under consideration. Lastly, WGCNA + DEGs facilitated biomarker discovery.
Conclusions We propose that building a co-expression network from an entire dataset, and only thereafter filtering by DEGs, should be the method to use in transcriptomic studies, regardless of biological system, species, or question being considered. |
first_indexed | 2024-04-14T07:22:19Z |
format | Article |
id | doaj.art-1b380dfaf83f44068e13126c61a55296 |
institution | Directory Open Access Journal |
issn | 1741-7007 |
language | English |
last_indexed | 2024-04-14T07:22:19Z |
publishDate | 2022-09-01 |
publisher | BMC |
record_format | Article |
series | BMC Biology |
spelling | doaj.art-1b380dfaf83f44068e13126c61a55296; 2022-12-22T02:06:07Z; eng; BMC; BMC Biology; 1741-7007; 2022-09-01; 201126; 10.1186/s12915-022-01398-w; Improved biomarker discovery through a plot twist in transcriptomic data analysis; Núria Sánchez-Baizán, Laia Ribas, Francesc Piferrer (all: Institut de Ciències del Mar (ICM), Spanish National Research Council (CSIC), Barcelona); https://doi.org/10.1186/s12915-022-01398-w |
title | Improved biomarker discovery through a plot twist in transcriptomic data analysis |
topic | Gene expression analysis; Gene networks; Weighted gene co-expression network analysis (WGCNA); Sex determination and differentiation; Gonadal development; Biomarker discovery |
url | https://doi.org/10.1186/s12915-022-01398-w |