ECDEP: identifying essential proteins based on evolutionary community discovery and subcellular localization
Abstract Background In cellular activities, essential proteins play a vital role and are instrumental in comprehending fundamental biological necessities and identifying pathogenic genes. Current deep learning approaches for predicting essential proteins underutilize the potential of gene expression...
Main Authors: | Chen Ye, Qi Wu, Shuxia Chen, Xuemei Zhang, Wenwen Xu, Yunzhi Wu, Youhua Zhang, Yi Yue |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2024-01-01 |
Series: | BMC Genomics |
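Subjects: | Essential protein, Evolutionary community discovery, Protein–protein interaction network, Subcellular localization, Gene expression |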
Online Access: | https://doi.org/10.1186/s12864-024-10019-5 |
_version_ | 1827370020425957376 |
---|---|
author | Chen Ye Qi Wu Shuxia Chen Xuemei Zhang Wenwen Xu Yunzhi Wu Youhua Zhang Yi Yue |
author_facet | Chen Ye Qi Wu Shuxia Chen Xuemei Zhang Wenwen Xu Yunzhi Wu Youhua Zhang Yi Yue |
author_sort | Chen Ye |
collection | DOAJ |
description | Abstract Background In cellular activities, essential proteins play a vital role and are instrumental in comprehending fundamental biological necessities and identifying pathogenic genes. Current deep learning approaches for predicting essential proteins underutilize the potential of gene expression data and are inadequate for the exploration of dynamic networks with limited evaluation across diverse species. Results We introduce ECDEP, an essential protein identification model based on evolutionary community discovery. ECDEP integrates temporal gene expression data with a protein–protein interaction (PPI) network and employs the 3-Sigma rule to eliminate outliers at each time point, constructing a dynamic network. Next, we utilize edge birth and death information to establish an interaction streaming source to feed into the evolutionary community discovery algorithm and then identify overlapping communities during the evolution of the dynamic network. SVM recursive feature elimination (RFE) is applied to extract the most informative communities, which are combined with subcellular localization data for classification predictions. We assess the performance of ECDEP by comparing it against ten centrality methods, four shallow machine learning methods with RFE, and two deep learning methods that incorporate multiple biological data sources on Saccharomyces cerevisiae (S. cerevisiae), Homo sapiens (H. sapiens), Mus musculus, and Caenorhabditis elegans. ECDEP achieves an AP value of 0.86 on the H. sapiens dataset and the contribution ratio of community features in classification reaches 0.54 on the S. cerevisiae (Krogan) dataset. Conclusions Our proposed method adeptly integrates network dynamics and yields outstanding results across various datasets. Furthermore, the incorporation of evolutionary community discovery algorithms amplifies the capacity of gene expression data in classification. |
first_indexed | 2024-03-08T10:01:13Z |
format | Article |
id | doaj.art-d83859d6356b4607a6736b67fbcc3a72 |
institution | Directory Open Access Journal |
issn | 1471-2164 |
language | English |
last_indexed | 2024-03-08T10:01:13Z |
publishDate | 2024-01-01 |
publisher | BMC |
record_format | Article |
series | BMC Genomics |
spelling | doaj.art-d83859d6356b4607a6736b67fbcc3a72 2024-01-29T10:59:48Z eng BMC BMC Genomics 1471-2164 2024-01-01 251123 10.1186/s12864-024-10019-5 ECDEP: identifying essential proteins based on evolutionary community discovery and subcellular localization Chen Ye (0), Qi Wu (1), Shuxia Chen (2), Xuemei Zhang (3), Wenwen Xu (4), Yunzhi Wu (5), Youhua Zhang (6), Yi Yue (7) School of Information and Artificial Intelligence, Anhui Agricultural University; School of Information and Artificial Intelligence, Anhui Agricultural University; School of Information and Artificial Intelligence, Anhui Agricultural University; School of Information and Artificial Intelligence, Anhui Agricultural University; School of Information and Artificial Intelligence, Anhui Agricultural University; School of Information and Artificial Intelligence, Anhui Agricultural University; School of Information and Artificial Intelligence, Anhui Agricultural University; School of Information and Artificial Intelligence, Anhui Agricultural University Abstract Background In cellular activities, essential proteins play a vital role and are instrumental in comprehending fundamental biological necessities and identifying pathogenic genes. Current deep learning approaches for predicting essential proteins underutilize the potential of gene expression data and are inadequate for the exploration of dynamic networks with limited evaluation across diverse species. Results We introduce ECDEP, an essential protein identification model based on evolutionary community discovery. ECDEP integrates temporal gene expression data with a protein–protein interaction (PPI) network and employs the 3-Sigma rule to eliminate outliers at each time point, constructing a dynamic network. Next, we utilize edge birth and death information to establish an interaction streaming source to feed into the evolutionary community discovery algorithm and then identify overlapping communities during the evolution of the dynamic network. SVM recursive feature elimination (RFE) is applied to extract the most informative communities, which are combined with subcellular localization data for classification predictions. We assess the performance of ECDEP by comparing it against ten centrality methods, four shallow machine learning methods with RFE, and two deep learning methods that incorporate multiple biological data sources on Saccharomyces cerevisiae (S. cerevisiae), Homo sapiens (H. sapiens), Mus musculus, and Caenorhabditis elegans. ECDEP achieves an AP value of 0.86 on the H. sapiens dataset and the contribution ratio of community features in classification reaches 0.54 on the S. cerevisiae (Krogan) dataset. Conclusions Our proposed method adeptly integrates network dynamics and yields outstanding results across various datasets. Furthermore, the incorporation of evolutionary community discovery algorithms amplifies the capacity of gene expression data in classification. https://doi.org/10.1186/s12864-024-10019-5 Essential protein Evolutionary community discovery Protein–protein interaction network Subcellular localization Gene expression |
spellingShingle | Chen Ye Qi Wu Shuxia Chen Xuemei Zhang Wenwen Xu Yunzhi Wu Youhua Zhang Yi Yue ECDEP: identifying essential proteins based on evolutionary community discovery and subcellular localization BMC Genomics Essential protein Evolutionary community discovery Protein–protein interaction network Subcellular localization Gene expression |
title | ECDEP: identifying essential proteins based on evolutionary community discovery and subcellular localization |
title_full | ECDEP: identifying essential proteins based on evolutionary community discovery and subcellular localization |
title_fullStr | ECDEP: identifying essential proteins based on evolutionary community discovery and subcellular localization |
title_full_unstemmed | ECDEP: identifying essential proteins based on evolutionary community discovery and subcellular localization |
title_short | ECDEP: identifying essential proteins based on evolutionary community discovery and subcellular localization |
title_sort | ecdep identifying essential proteins based on evolutionary community discovery and subcellular localization |
topic | Essential protein Evolutionary community discovery Protein–protein interaction network Subcellular localization Gene expression |
url | https://doi.org/10.1186/s12864-024-10019-5 |
work_keys_str_mv | AT chenye ecdepidentifyingessentialproteinsbasedonevolutionarycommunitydiscoveryandsubcellularlocalization AT qiwu ecdepidentifyingessentialproteinsbasedonevolutionarycommunitydiscoveryandsubcellularlocalization AT shuxiachen ecdepidentifyingessentialproteinsbasedonevolutionarycommunitydiscoveryandsubcellularlocalization AT xuemeizhang ecdepidentifyingessentialproteinsbasedonevolutionarycommunitydiscoveryandsubcellularlocalization AT wenwenxu ecdepidentifyingessentialproteinsbasedonevolutionarycommunitydiscoveryandsubcellularlocalization AT yunzhiwu ecdepidentifyingessentialproteinsbasedonevolutionarycommunitydiscoveryandsubcellularlocalization AT youhuazhang ecdepidentifyingessentialproteinsbasedonevolutionarycommunitydiscoveryandsubcellularlocalization AT yiyue ecdepidentifyingessentialproteinsbasedonevolutionarycommunitydiscoveryandsubcellularlocalization |
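
The description above mentions building a dynamic network from temporal gene expression with the 3-Sigma rule, then turning edge birth and death information into an interaction stream. The record does not say exactly how the rule is applied, so the following is only a minimal sketch under one plausible reading: a protein counts as active at a time point when its expression exceeds its own mean plus `k` standard deviations, and a static PPI edge exists at that time point only if both endpoints are active. The function names, the `k` parameter, the `(time, "+"/"-", u, v)` event format, and the toy data are all illustrative assumptions, not the authors' implementation.

```python
from statistics import mean, pstdev


def active_times(expression, k=3.0):
    """Map each protein to the set of time indices where it is 'active'.

    Assumption: active means expression > mean + k * std, computed per
    protein over its own time series (k=3 gives a 3-sigma style threshold).
    """
    active = {}
    for protein, series in expression.items():
        threshold = mean(series) + k * pstdev(series)
        active[protein] = {t for t, x in enumerate(series) if x > threshold}
    return active


def snapshots(edges, active, n_times):
    """Keep a static PPI edge at time t only if both endpoints are active at t."""
    return [
        {
            tuple(sorted((u, v)))
            for u, v in edges
            if t in active.get(u, set()) and t in active.get(v, set())
        }
        for t in range(n_times)
    ]


def edge_stream(snaps):
    """Convert consecutive snapshots into edge birth ('+') and death ('-') events."""
    events = []
    previous = set()
    for t, current in enumerate(snaps):
        for u, v in sorted(current - previous):
            events.append((t, "+", u, v))  # edge appears at time t
        for u, v in sorted(previous - current):
            events.append((t, "-", u, v))  # edge disappears at time t
        previous = current
    return events


# Hypothetical toy data: three proteins, four time points, two static PPI edges.
expression = {"A": [1, 1, 5, 5], "B": [1, 5, 5, 1], "C": [1, 1, 1, 5]}
ppi_edges = [("A", "B"), ("A", "C")]

# k=0.5 is used here only because a four-point series cannot exceed 3 sigma.
active = active_times(expression, k=0.5)
stream = edge_stream(snapshots(ppi_edges, active, n_times=4))
print(stream)
```

A stream of this shape could then be fed to a streaming community discovery algorithm; which algorithm and which input format ECDEP actually uses is not stated in this record.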