An Iterative Method for Identifying Essential Proteins Based on Non-Negative Matrix Factorization

In recent years, with the development of high-throughput technologies, lots of computational methods for predicting essential proteins based on protein-protein interaction (PPI) networks and biological information of proteins have been proposed successively. However, due to the incompleteness of PPI...


Bibliographic Details
Main Authors: Jin Liu, Xiangyi Wang, Zhiping Chen, Yihong Tan, Xueyong Li, Zhen Zhang, Lei Wang
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects: Essential protein prediction; iteration method; non-negative matrix factorization
Online Access: https://ieeexplore.ieee.org/document/9301292/
_version_ 1828112023070703616
author Jin Liu
Xiangyi Wang
Zhiping Chen
Yihong Tan
Xueyong Li
Zhen Zhang
Lei Wang
author_sort Jin Liu
collection DOAJ
description In recent years, with the development of high-throughput technologies, many computational methods have been proposed for predicting essential proteins from protein-protein interaction (PPI) networks and biological information about proteins. However, because PPI networks are incomplete, the prediction accuracy of these methods is still unsatisfactory, and designing effective computational models to identify essential proteins remains challenging. This manuscript proposes a novel Prediction Model based on Non-negative Matrix Factorization (PMNMF). In PMNMF, an original PPI network is first constructed from PPIs downloaded from a given benchmark database. Then, based on topological features of the protein nodes, the original PPI network is converted to a weighted PPI network. To overcome the incompleteness of PPI networks, non-negative matrix factorization (NMF) is applied to the weighted PPI network to obtain a transition probability matrix. Next, by integrating biological information, including gene expression, homology, and subcellular localization information of proteins, a unique initial score is calculated and assigned to each protein node in the weighted PPI network. Based on these scores, an improved PageRank algorithm is designed to infer potential essential proteins. Finally, PMNMF is compared with 14 state-of-the-art prediction models, and the experimental results show that it achieves the best identification accuracy among them.
first_indexed 2024-04-11T11:44:10Z
format Article
id doaj.art-2a3c3cd99ffa4259a5972df21ea17ac7
institution Directory of Open Access Journals
issn 2169-3536
language English
last_indexed 2024-04-11T11:44:10Z
publishDate 2020-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-2a3c3cd99ffa4259a5972df21ea17ac7
Record timestamp: 2022-12-22T04:25:42Z
Language: eng
Publisher: IEEE
Journal: IEEE Access
ISSN: 2169-3536
Publication date: 2020-01-01
Volume: 8
Pages: 226685-226696
DOI: 10.1109/ACCESS.2020.3046254
Article number: 9301292
Title: An Iterative Method for Identifying Essential Proteins Based on Non-Negative Matrix Factorization
Authors:
Jin Liu, https://orcid.org/0000-0002-5768-3442
Xiangyi Wang, https://orcid.org/0000-0001-8634-3917
Zhiping Chen, https://orcid.org/0000-0003-4759-3774
Yihong Tan, https://orcid.org/0000-0001-7619-8090
Xueyong Li, https://orcid.org/0000-0002-9105-1764
Zhen Zhang, https://orcid.org/0000-0001-9629-9614
Lei Wang, https://orcid.org/0000-0002-5065-8447
Affiliation (all authors): College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
URL: https://ieeexplore.ieee.org/document/9301292/
Keywords: Essential protein prediction; iteration method; non-negative matrix factorization
title An Iterative Method for Identifying Essential Proteins Based on Non-Negative Matrix Factorization
title_sort iterative method for identifying essential proteins based on non negative matrix factorization
topic Essential protein prediction
iteration method
non-negative matrix factorization
url https://ieeexplore.ieee.org/document/9301292/
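
The pipeline outlined in the description field (NMF applied to a weighted PPI network, a transition probability matrix derived from the factorization, then a PageRank-style iteration started from per-protein initial scores) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record gives only the abstract, so the Lee-Seung multiplicative updates, the factorization rank, the row normalization, the damping factor `alpha`, and the toy weights and initial scores below are all assumptions chosen for illustration.

```python
import numpy as np

def nmf(A, rank, iters=200, eps=1e-9, seed=0):
    """Approximate A ~ W @ H with non-negative factors.
    Assumption: standard Lee-Seung multiplicative updates (Frobenius loss)."""
    rng = np.random.default_rng(seed)
    n, m = A.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(iters):
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H

def transition_matrix(A, rank):
    """Row-normalize the NMF reconstruction so each row sums to 1.
    The dense reconstruction assigns weight to unobserved edges, which is
    one way to read 'overcoming the incompleteness' of the network."""
    W, H = nmf(A, rank)
    R = W @ H
    row_sums = R.sum(axis=1, keepdims=True)
    return R / np.where(row_sums > 0, row_sums, 1.0)

def rank_proteins(P, s0, alpha=0.5, tol=1e-8, max_iter=1000):
    """PageRank-style iteration: s <- alpha * P^T s + (1 - alpha) * s0.
    alpha is a hypothetical damping factor, not a value from the paper."""
    s0 = s0 / s0.sum()
    s = s0.copy()
    for _ in range(max_iter):
        s_next = alpha * (P.T @ s) + (1 - alpha) * s0
        if np.abs(s_next - s).sum() < tol:
            return s_next
        s = s_next
    return s

# Toy weighted PPI network with 5 proteins (made-up edge weights).
A = np.array([
    [0.0, 0.8, 0.5, 0.0, 0.0],
    [0.8, 0.0, 0.6, 0.2, 0.0],
    [0.5, 0.6, 0.0, 0.7, 0.0],
    [0.0, 0.2, 0.7, 0.0, 0.9],
    [0.0, 0.0, 0.0, 0.9, 0.0],
])
# Made-up initial scores standing in for the integrated biological information.
s0 = np.array([0.3, 0.1, 0.4, 0.1, 0.1])

P = transition_matrix(A, rank=2)
scores = rank_proteins(P, s0)
ranking = np.argsort(-scores)  # protein indices, highest score first
print(ranking, scores)
```

In this sketch the proteins at the top of `ranking` would be the candidate essential proteins; in practice the initial scores would come from gene expression, homology, and subcellular localization data rather than hand-picked numbers.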