A global Anopheles gambiae gene co-expression network constructed from hundreds of experimental conditions with missing values
Abstract Background Gene co-expression networks (GCNs) can be used to determine gene regulation and attribute gene function to biological processes. Different high throughput technologies, including one and two-channel microarrays and RNA-sequencing, allow evaluating thousands of gene expression dat...
Main Authors: | Junyao Kuang, Nicolas Buchon, Kristin Michel, Caterina Scoglio |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2022-05-01 |
Series: | BMC Bioinformatics |
Subjects: | Anopheles gambiae, Co-expression network, Missing value, Correlation |
Online Access: | https://doi.org/10.1186/s12859-022-04697-9 |
_version_ | 1811235512419090432 |
---|---|
author | Junyao Kuang, Nicolas Buchon, Kristin Michel, Caterina Scoglio |
author_sort | Junyao Kuang |
collection | DOAJ |
description | Abstract Background Gene co-expression networks (GCNs) can be used to determine gene regulation and attribute gene function to biological processes. Different high throughput technologies, including one and two-channel microarrays and RNA-sequencing, allow evaluating thousands of gene expression data simultaneously, but these methodologies provide results that cannot be directly compared. Thus, it is complex to analyze co-expression relations between genes, especially when there are missing values arising for experimental reasons. Networks are a helpful tool for studying gene co-expression, where nodes represent genes and edges represent co-expression of pairs of genes. Results In this paper, we establish a method for constructing a gene co-expression network for the Anopheles gambiae transcriptome from 257 unique studies obtained with different methodologies and experimental designs. We introduce the sliding threshold approach to select node pairs with high Pearson correlation coefficients. The resulting network, which we name AgGCN1.0, is robust to random removal of conditions and has similar characteristics to small-world and scale-free networks. Analysis of network sub-graphs revealed that the core is largely comprised of genes that encode components of the mitochondrial respiratory chain and the ribosome, while different communities are enriched for genes involved in distinct biological processes. Conclusion Analysis of the network reveals that both the architecture of the core sub-network and the network communities are based on gene function, supporting the power of the proposed method for GCN construction. Application of network science methodology reveals that the overall network structure is driven to maximize the integration of essential cellular functions, possibly allowing the flexibility to add novel functions. |
first_indexed | 2024-04-12T11:52:05Z |
format | Article |
id | doaj.art-e28a66fc8fec42b1b7438162a75cbb50 |
institution | Directory Open Access Journal |
issn | 1471-2105 |
language | English |
last_indexed | 2024-04-12T11:52:05Z |
publishDate | 2022-05-01 |
publisher | BMC |
record_format | Article |
series | BMC Bioinformatics |
spelling | doaj.art-e28a66fc8fec42b1b7438162a75cbb50; 2022-12-22T03:34:08Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2022-05-01; 231127; 10.1186/s12859-022-04697-9; A global Anopheles gambiae gene co-expression network constructed from hundreds of experimental conditions with missing values; Junyao Kuang (0), Nicolas Buchon (1), Kristin Michel (2), Caterina Scoglio (3); (0) Department of Electrical and Computer Engineering, Kansas State University; (1) Department of Entomology, Cornell Institute of Host-Microbe Interactions and Disease, Cornell University; (2) Division of Biology, Kansas State University; (3) Department of Electrical and Computer Engineering, Kansas State University; https://doi.org/10.1186/s12859-022-04697-9; Anopheles gambiae, Co-expression network, Missing value, Correlation |
title | A global Anopheles gambiae gene co-expression network constructed from hundreds of experimental conditions with missing values |
topic | Anopheles gambiae, Co-expression network, Missing value, Correlation |
url | https://doi.org/10.1186/s12859-022-04697-9 |