An efficient strategy for identifying essential proteins based on homology, subcellular location and protein-protein interaction information

High-throughput biological experiments are expensive and time-consuming. In recent years, many computational methods based on biological information have been proposed and widely used to understand the underlying biology. However, the processing of biological information data inevitably pr...


Bibliographic Details
Main Authors:	Zhihong Zhang, Yingchun Luo, Meiping Jiang, Dongjie Wu, Wang Zhang, Wei Yan, Bihai Zhao
Format:	Article
Language:	English
Published:	AIMS Press 2022-04-01
Series:	Mathematical Biosciences and Engineering
Subjects:	essential protein; protein-protein interaction; non-negative matrix symmetric tri-factorization; multiple biological information; subcellular location information; homology information
Online Access:	https://www.aimspress.com/article/doi/10.3934/mbe.2022296?viewType=HTML

_version_	1817981327142551552
author	Zhihong Zhang, Yingchun Luo, Meiping Jiang, Dongjie Wu, Wang Zhang, Wei Yan, Bihai Zhao
author_facet	Zhihong Zhang, Yingchun Luo, Meiping Jiang, Dongjie Wu, Wang Zhang, Wei Yan, Bihai Zhao
author_sort	Zhihong Zhang
collection	DOAJ
description	High-throughput biological experiments are expensive and time-consuming. In recent years, many computational methods based on biological information have been proposed and widely used to understand the underlying biology. However, processing biological data inevitably produces false positives and false negatives, such as the noise in Protein-Protein Interaction (PPI) networks and the noise generated by integrating several kinds of biological information. Handling this noise is key to essential protein prediction. This paper proposes IEPMSF, a model for Identifying Essential Proteins based on non-negative Matrix Symmetric tri-Factorization and multiple biological information. The model uses only the common-neighbor characteristics of proteins in the PPI network to build a weighted network, then applies non-negative matrix symmetric tri-factorization to find additional potential interactions between proteins and so optimize the weighted network. Next, the initial score of each protein is determined from subcellular location and lineal homology information, and a random walk with restart is run on the optimized network to score and rank each protein. The model was tested against current representative approaches using a public database. The experiments show that the new method identifies essential proteins efficiently, which indicates that it can effectively address the noise that exists in multi-source biological information itself and the noise caused by integrating it.
first_indexed	2024-04-13T23:04:27Z
format	Article
id	doaj.art-cfa69b86f3ab4e5fbc0f18e214788eb8
institution	Directory of Open Access Journals
issn	1551-0018
language	English
last_indexed	2024-04-13T23:04:27Z
publishDate	2022-04-01
publisher	AIMS Press
record_format	Article
series	Mathematical Biosciences and Engineering
spelling	doaj.art-cfa69b86f3ab4e5fbc0f18e214788eb8; 2022-12-22T02:25:43Z; eng; AIMS Press; Mathematical Biosciences and Engineering; 1551-0018; 2022-04-01; vol. 19, no. 6, pp. 6331-6343; 10.3934/mbe.2022296; An efficient strategy for identifying essential proteins based on homology, subcellular location and protein-protein interaction information; Zhihong Zhang (1), Yingchun Luo (2), Meiping Jiang (2), Dongjie Wu (3), Wang Zhang (4), Wei Yan (1), Bihai Zhao (1); 1. College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, Hunan 410022, China; 2. Department of Ultrasound, Hunan Provincial Maternal and Child Health Care Hospital, Changsha, Hunan 410008, China; 3. Department of Banking and Finance, Monash University, Clayton, Victoria 3168, Australia; 4. Department of Optoelectronic Engineering, Jinan University, Guangzhou, Guangdong 510632, China; High-throughput biological experiments are expensive and time-consuming. In recent years, many computational methods based on biological information have been proposed and widely used to understand the underlying biology. However, processing biological data inevitably produces false positives and false negatives, such as the noise in Protein-Protein Interaction (PPI) networks and the noise generated by integrating several kinds of biological information. Handling this noise is key to essential protein prediction. This paper proposes IEPMSF, a model for Identifying Essential Proteins based on non-negative Matrix Symmetric tri-Factorization and multiple biological information. The model uses only the common-neighbor characteristics of proteins in the PPI network to build a weighted network, then applies non-negative matrix symmetric tri-factorization to find additional potential interactions between proteins and so optimize the weighted network. Next, the initial score of each protein is determined from subcellular location and lineal homology information, and a random walk with restart is run on the optimized network to score and rank each protein. The model was tested against current representative approaches using a public database. The experiments show that the new method identifies essential proteins efficiently, which indicates that it can effectively address the noise that exists in multi-source biological information itself and the noise caused by integrating it.; https://www.aimspress.com/article/doi/10.3934/mbe.2022296?viewType=HTML; essential protein; protein-protein interaction; non-negative matrix symmetric tri-factorization; multiple biological information; subcellular location information; homology information
spellingShingle	Zhihong Zhang; Yingchun Luo; Meiping Jiang; Dongjie Wu; Wang Zhang; Wei Yan; Bihai Zhao; An efficient strategy for identifying essential proteins based on homology, subcellular location and protein-protein interaction information; Mathematical Biosciences and Engineering; essential protein; protein-protein interaction; non-negative matrix symmetric tri-factorization; multiple biological information; subcellular location information; homology information
title	An efficient strategy for identifying essential proteins based on homology, subcellular location and protein-protein interaction information
title_full	An efficient strategy for identifying essential proteins based on homology, subcellular location and protein-protein interaction information
title_fullStr	An efficient strategy for identifying essential proteins based on homology, subcellular location and protein-protein interaction information
title_full_unstemmed	An efficient strategy for identifying essential proteins based on homology, subcellular location and protein-protein interaction information
title_short	An efficient strategy for identifying essential proteins based on homology, subcellular location and protein-protein interaction information
title_sort	efficient strategy for identifying essential proteins based on homology subcellular location and protein protein interaction information
topic	essential protein; protein-protein interaction; non-negative matrix symmetric tri-factorization; multiple biological information; subcellular location information; homology information
url	https://www.aimspress.com/article/doi/10.3934/mbe.2022296?viewType=HTML
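
The description field in this record outlines a pipeline: weight the PPI network by common neighbors, then rank proteins with a random walk with restart seeded by initial scores. The following is a minimal sketch of those two steps only, not the authors' implementation. The Jaccard-style edge weight, the restart probability of 0.3, the toy network and the initial scores are all assumptions made for illustration, since the record does not give the paper's formulas, and the non-negative matrix symmetric tri-factorization step is omitted.

```python
def common_neighbor_weights(edges):
    """Weight each edge by the share of neighbors its endpoints have in common.

    Assumption: a Jaccard-style ratio. The record does not state the
    paper's actual weighting formula.
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    weights = {}
    for u, v in edges:
        shared = neighbors[u] & neighbors[v]
        union = (neighbors[u] | neighbors[v]) - {u, v}
        weights[(u, v)] = len(shared) / len(union) if union else 0.0
    return weights


def random_walk_with_restart(weights, start, restart=0.3, tol=1e-10, max_iter=1000):
    """Iterate p = restart * p0 + (1 - restart) * (one walk step from p).

    `start` maps every protein to a positive initial score (hypothetical
    values here, standing in for subcellular location and homology scores).
    """
    nodes = sorted(start)
    adj = {n: {} for n in nodes}
    for (u, v), w in weights.items():
        if w > 0:  # zero-weight edges carry no probability mass
            adj[u][v] = w
            adj[v][u] = w
    total = sum(start.values())
    p0 = {n: start[n] / total for n in nodes}  # restart distribution
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            out = sum(adj[u].values())
            if out == 0:
                # Isolated protein: send its mass back to the restart distribution.
                for n in nodes:
                    nxt[n] += (1 - restart) * p[u] * p0[n]
            else:
                for v, w in adj[u].items():
                    nxt[v] += (1 - restart) * p[u] * w / out
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Toy network and initial scores (both hypothetical).
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
initial = {"A": 1.0, "B": 0.5, "C": 0.5, "D": 0.2}

weights = common_neighbor_weights(edges)
scores = random_walk_with_restart(weights, initial)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this sketch the scores always sum to 1, so they can be read as a ranking over proteins; an edge whose endpoints share no neighbors gets weight 0 and is effectively pruned, which is one simple way to down-weight possibly spurious interactions.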

Similar Items