An Iterative Method for Identifying Essential Proteins Based on Non-Negative Matrix Factorization


Bibliographic Details
Main Authors: Jin Liu, Xiangyi Wang, Zhiping Chen, Yihong Tan, Xueyong Li, Zhen Zhang, Lei Wang
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/9301292/
Description
Summary: In recent years, the development of high-throughput technologies has led to many computational methods for predicting essential proteins from protein-protein interaction (PPI) networks and biological information about proteins. However, because PPI networks are incomplete, the prediction accuracy of these methods remains unsatisfactory, and designing effective computational models to identify essential proteins is still challenging. This manuscript proposes a novel Prediction Model based on Non-negative Matrix Factorization (PMNMF). PMNMF first constructs an original PPI network from PPIs downloaded from a given benchmark database. Based on topological features of the protein nodes, it then converts the original network into a weighted PPI network. To overcome the incompleteness of PPI networks, non-negative matrix factorization (NMF) is applied to the weighted PPI network to obtain a transition probability matrix. Next, by integrating biological information, including gene expression, homology, and subcellular localization information, a unique initial score is calculated and assigned to each protein node in the weighted PPI network. Based on these scores, an improved PageRank algorithm is designed to infer potential essential proteins. Finally, PMNMF is compared with 14 state-of-the-art prediction models, and the experimental results show that it achieves the best identification accuracy among them.
ISSN: 2169-3536
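
The pipeline outlined in the summary (weighted network, NMF, transition probability matrix, initial scores, PageRank-style iteration) can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not specify the edge-weighting scheme, the NMF variant, how the initial scores are combined, or what makes the PageRank algorithm "improved". The sketch below assumes standard Lee-Seung multiplicative updates for NMF, row normalization of the reconstructed matrix as the transition matrix, and a plain PageRank-with-restart iteration. The function names, the toy matrix `A`, and the initial scores `s0` are all hypothetical placeholders.

```python
import numpy as np


def nmf(A, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix A as W @ H.

    Assumption: generic Lee-Seung multiplicative updates, not
    necessarily the NMF variant used in the paper.
    """
    rng = np.random.default_rng(seed)
    n, m = A.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H


def transition_matrix(A_hat, eps=1e-12):
    """Row-normalize a non-negative matrix so each row sums to 1."""
    row_sums = A_hat.sum(axis=1, keepdims=True)
    return A_hat / np.maximum(row_sums, eps)


def rank_proteins(P, s0, alpha=0.85, tol=1e-10, max_iter=1000):
    """PageRank-style iteration that restarts at the initial scores s0."""
    s0 = s0 / s0.sum()
    s = s0.copy()
    for _ in range(max_iter):
        s_next = alpha * (P.T @ s) + (1 - alpha) * s0
        if np.abs(s_next - s).sum() < tol:
            return s_next
        s = s_next
    return s


# Toy weighted PPI network with 4 proteins (hypothetical weights).
A = np.array([
    [0.0, 0.8, 0.3, 0.0],
    [0.8, 0.0, 0.5, 0.2],
    [0.3, 0.5, 0.0, 0.6],
    [0.0, 0.2, 0.6, 0.0],
])

W, H = nmf(A, rank=2)
# The low-rank reconstruction W @ H is dense, so it assigns small
# weights to unobserved pairs; this is one way to soften incompleteness.
P = transition_matrix(W @ H)

# Hypothetical initial scores standing in for the integrated
# expression, homology, and localization information.
s0 = np.array([0.4, 0.3, 0.2, 0.1])
scores = rank_proteins(P, s0)

# Protein indices ordered from highest to lowest score.
ranking = np.argsort(-scores)
print(ranking, scores)
```

Restarting at `s0` rather than at a uniform vector is one simple way to let biological evidence bias the ranking; the paper's improved PageRank may differ.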