Centrality Combination Method Based on Feature Selection for Protein Interaction Networks

Bibliographic Details
Main Authors: Haoyue Wang, Li Pan, Jing Sun, Bin Li, Junqiang Jiang, Bo Yang, Wenbin Li
Format: Article
Language: English
Published: IEEE 2022-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/9926101/
Description
Summary: Essential proteins are important participants in various life activities and play a vital role in the survival and reproduction of life. Network-based centrality methods are a common way to identify essential proteins in protein interaction networks. Because the existing centrality methods differ from one another, combining them is a feasible way to improve the identification accuracy of essential proteins. In this paper, we propose a centrality combination method based on feature selection. First, the measure values of 14 classical centrality methods are treated as feature data. Then, a subset of relevant features is selected according to feature importance. Finally, the centrality methods corresponding to the selected features are combined using the geometric mean to identify essential proteins. To verify the effectiveness of the combination method, we apply it to the original static protein interaction network (SPIN), the dynamic protein interaction network (DPIN) and the refined dynamic protein interaction network (RDPIN), and compare the results with those of each single centrality method (LAC, DC, DMNC, NC, TP, CLC, BC, LC, CC, KC, CR, EC, PR, LR). The experimental results show that the combination method achieves better prediction performance than the 14 centrality methods in terms of prediction precision, sensitivity, specificity, positive predictive value, negative predictive value, F-measure and accuracy rate. These results indicate that the proposed method can help to identify essential proteins more accurately.
ISSN: 2169-3536
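
The final step described in the summary, combining the selected centrality measures by their geometric mean, can be illustrated with a minimal sketch. This is not the authors' implementation. The summary does not say how feature importance is computed, so the selected subset is simply passed in as a list. The function names (`geometric_mean_scores`, `top_k`), the protein labels and all score values are hypothetical, and the sketch assumes every centrality score is non-negative.

```python
import math


def geometric_mean_scores(centralities, selected):
    """Combine the selected centrality measures per protein by geometric mean.

    centralities: dict mapping a measure name (e.g. "DC") to a dict of
                  protein -> non-negative score (assumed input layout).
    selected:     names of the measures kept by feature selection
                  (assumed to be given; the selection step is not shown).
    """
    proteins = centralities[selected[0]].keys()
    combined = {}
    for protein in proteins:
        values = [centralities[name][protein] for name in selected]
        if any(v <= 0 for v in values):
            # A geometric mean is zero as soon as one factor is zero.
            combined[protein] = 0.0
        else:
            # Average in log space to avoid overflow for long products.
            log_mean = sum(math.log(v) for v in values) / len(values)
            combined[protein] = math.exp(log_mean)
    return combined


def top_k(scores, k):
    """Return the k highest-scoring proteins (ties broken by name)."""
    return sorted(scores, key=lambda p: (-scores[p], p))[:k]


# Hypothetical toy scores for three proteins and three measures.
centralities = {
    "DC": {"P1": 4.0, "P2": 1.0, "P3": 2.0},
    "NC": {"P1": 9.0, "P2": 4.0, "P3": 0.0},
    "BC": {"P1": 0.5, "P2": 0.1, "P3": 0.3},
}
selected = ["DC", "NC"]  # stand-in for the output of feature selection

combined = geometric_mean_scores(centralities, selected)
candidates = top_k(combined, 2)
print(candidates)
```

One property of this choice of combiner: rescaling any single measure by a positive constant multiplies every combined score by the same factor, so the ranking of proteins does not change. A protein scoring zero on any selected measure, however, is pushed to the bottom of the ranking.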