A statistical network pre-processing method to improve relevance and significance of gene lists in microarray gene expression studies
Abstract Background Microarrays can perform large-scale studies of differentially expressed genes (DEGs) and even single nucleotide polymorphisms (SNPs), thereby screening thousands of genes simultaneously in a single experiment. However, DEGs and SNPs are still just as enigmatic as the first sequence o...
Main Authors: | Giuseppe Agapito, Marianna Milano, Mario Cannataro |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2022-09-01 |
Series: | BMC Bioinformatics |
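Subjects: | Biological pathways, Differential expressed genes, Pathway enrichment analysis, Statistical analysis, Data mining network, Network analysis |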
Online Access: | https://doi.org/10.1186/s12859-022-04936-z |
_version_ | 1818024776905523200 |
---|---|
author | Giuseppe Agapito Marianna Milano Mario Cannataro |
author_sort | Giuseppe Agapito |
collection | DOAJ |
description | Abstract Background Microarrays can perform large-scale studies of differentially expressed genes (DEGs) and even single nucleotide polymorphisms (SNPs), thereby screening thousands of genes simultaneously in a single experiment. However, DEGs and SNPs are still just as enigmatic as the first sequence of the genome, because they are independent of the affected biological context. Pathway enrichment analysis (PEA) can overcome this obstacle by linking both DEGs and SNPs to the affected biological pathways and consequently to the underlying biological functions and processes. Results To improve the enrichment analysis results, we present a new statistical network pre-processing method that maps DEGs and SNPs onto a biological network, improving the relevance and significance of the DEGs or SNPs of interest and incorporating pathway topology information into the PEA. The proposed methodology improves the statistical significance of the PEA in terms of the computed p value for each enriched pathway and limits the number of enriched pathways. This helps reduce the number of relevant biological pathways with respect to a non-specific list of genes. Conclusion The proposed method provides a two-fold enhancement: network analysis yields fewer DEGs by selecting only the relevant ones, and the selected DEGs improve the statistical significance of the enriched pathways compared with simply using a general list of genes. |
first_indexed | 2024-12-10T04:05:36Z |
format | Article |
id | doaj.art-fdbed918f9db4609b8a53503325fb3e9 |
institution | Directory Open Access Journal |
issn | 1471-2105 |
language | English |
last_indexed | 2024-12-10T04:05:36Z |
publishDate | 2022-09-01 |
publisher | BMC |
record_format | Article |
series | BMC Bioinformatics |
spelling | doaj.art-fdbed918f9db4609b8a53503325fb3e9; 2022-12-22T02:02:51Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2022-09-01; 23S6120; 10.1186/s12859-022-04936-z; A statistical network pre-processing method to improve relevance and significance of gene lists in microarray gene expression studies; Giuseppe Agapito (Department of Law, Economics and Sociology Sciences, University Magna Græcia); Marianna Milano (Data Analytics Research Center, University Magna Græcia); Mario Cannataro (Data Analytics Research Center, University Magna Græcia); https://doi.org/10.1186/s12859-022-04936-z; Biological pathways; Differential expressed genes; Pathway enrichment analysis; Statistical analysis; Data mining network; Network analysis |
title | A statistical network pre-processing method to improve relevance and significance of gene lists in microarray gene expression studies |
topic | Biological pathways Differential expressed genes Pathway enrichment analysis Statistical analysis Data mining network Network analysis |
url | https://doi.org/10.1186/s12859-022-04936-z |