A statistical network pre-processing method to improve relevance and significance of gene lists in microarray gene expression studies

Abstract Background Microarrays enable large-scale studies of differentially expressed genes (DEGs) and even single nucleotide polymorphisms (SNPs), screening thousands of genes simultaneously in a single experiment. However, DEGs and SNPs remain as enigmatic as the first sequence o...


Bibliographic Details
Main Authors: Giuseppe Agapito, Marianna Milano, Mario Cannataro
Format: Article
Language: English
Published: BMC 2022-09-01
Series: BMC Bioinformatics
Subjects:
Online Access: https://doi.org/10.1186/s12859-022-04936-z
_version_ 1818024776905523200
author Giuseppe Agapito
Marianna Milano
Mario Cannataro
author_facet Giuseppe Agapito
Marianna Milano
Mario Cannataro
author_sort Giuseppe Agapito
collection DOAJ
description Abstract Background Microarrays enable large-scale studies of differentially expressed genes (DEGs) and even single nucleotide polymorphisms (SNPs), screening thousands of genes simultaneously in a single experiment. However, DEGs and SNPs remain as enigmatic as the first sequence of the genome, because they are independent of the affected biological context. Pathway enrichment analysis (PEA) can overcome this obstacle by linking both DEGs and SNPs to the affected biological pathways and consequently to the underlying biological functions and processes. Results To improve the enrichment analysis results, we present a new statistical network pre-processing method that maps DEGs and SNPs onto a biological network, which can improve the relevance and significance of the DEGs or SNPs of interest and incorporates pathway topology information into the PEA. The proposed methodology improves the statistical significance of the PEA, in terms of the p value computed for each enriched pathway, and limits the number of enriched pathways. This helps reduce the number of relevant biological pathways with respect to a non-specific list of genes. Conclusion The proposed method provides a two-fold enhancement: network analysis yields fewer DEGs by selecting only the relevant ones, and the detected DEGs improve the statistical significance of the enriched pathways compared with simply using a general list of genes.
first_indexed 2024-12-10T04:05:36Z
format Article
id doaj.art-fdbed918f9db4609b8a53503325fb3e9
institution Directory Open Access Journal
issn 1471-2105
language English
last_indexed 2024-12-10T04:05:36Z
publishDate 2022-09-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-fdbed918f9db4609b8a53503325fb3e9 2022-12-22T02:02:51Z eng BMC BMC Bioinformatics 1471-2105 2022-09-01 23S6120 10.1186/s12859-022-04936-z
A statistical network pre-processing method to improve relevance and significance of gene lists in microarray gene expression studies
Giuseppe Agapito 0
Marianna Milano 1
Mario Cannataro 2
Department of Law, Economics and Sociology Sciences, University Magna Græcia
Data Analytics Research Center, University Magna Græcia
Data Analytics Research Center, University Magna Græcia
Abstract Background Microarrays enable large-scale studies of differentially expressed genes (DEGs) and even single nucleotide polymorphisms (SNPs), screening thousands of genes simultaneously in a single experiment. However, DEGs and SNPs remain as enigmatic as the first sequence of the genome, because they are independent of the affected biological context. Pathway enrichment analysis (PEA) can overcome this obstacle by linking both DEGs and SNPs to the affected biological pathways and consequently to the underlying biological functions and processes. Results To improve the enrichment analysis results, we present a new statistical network pre-processing method that maps DEGs and SNPs onto a biological network, which can improve the relevance and significance of the DEGs or SNPs of interest and incorporates pathway topology information into the PEA. The proposed methodology improves the statistical significance of the PEA, in terms of the p value computed for each enriched pathway, and limits the number of enriched pathways. This helps reduce the number of relevant biological pathways with respect to a non-specific list of genes. Conclusion The proposed method provides a two-fold enhancement: network analysis yields fewer DEGs by selecting only the relevant ones, and the detected DEGs improve the statistical significance of the enriched pathways compared with simply using a general list of genes.
https://doi.org/10.1186/s12859-022-04936-z
Biological pathways
Differential expressed genes
Pathway enrichment analysis
Statistical analysis
Data mining network
Network analysis
spellingShingle Giuseppe Agapito
Marianna Milano
Mario Cannataro
A statistical network pre-processing method to improve relevance and significance of gene lists in microarray gene expression studies
BMC Bioinformatics
Biological pathways
Differential expressed genes
Pathway enrichment analysis
Statistical analysis
Data mining network
Network analysis
title A statistical network pre-processing method to improve relevance and significance of gene lists in microarray gene expression studies
title_full A statistical network pre-processing method to improve relevance and significance of gene lists in microarray gene expression studies
title_fullStr A statistical network pre-processing method to improve relevance and significance of gene lists in microarray gene expression studies
title_full_unstemmed A statistical network pre-processing method to improve relevance and significance of gene lists in microarray gene expression studies
title_short A statistical network pre-processing method to improve relevance and significance of gene lists in microarray gene expression studies
title_sort statistical network pre processing method to improve relevance and significance of gene lists in microarray gene expression studies
topic Biological pathways
Differential expressed genes
Pathway enrichment analysis
Statistical analysis
Data mining network
Network analysis
url https://doi.org/10.1186/s12859-022-04936-z
work_keys_str_mv AT giuseppeagapito astatisticalnetworkpreprocessingmethodtoimproverelevanceandsignificanceofgenelistsinmicroarraygeneexpressionstudies
AT mariannamilano astatisticalnetworkpreprocessingmethodtoimproverelevanceandsignificanceofgenelistsinmicroarraygeneexpressionstudies
AT mariocannataro astatisticalnetworkpreprocessingmethodtoimproverelevanceandsignificanceofgenelistsinmicroarraygeneexpressionstudies
AT giuseppeagapito statisticalnetworkpreprocessingmethodtoimproverelevanceandsignificanceofgenelistsinmicroarraygeneexpressionstudies
AT mariannamilano statisticalnetworkpreprocessingmethodtoimproverelevanceandsignificanceofgenelistsinmicroarraygeneexpressionstudies
AT mariocannataro statisticalnetworkpreprocessingmethodtoimproverelevanceandsignificanceofgenelistsinmicroarraygeneexpressionstudies