An improved two-stage label propagation algorithm based on LeaderRank

Abstract To solve the problems of poor stability and low modularity (Q) of community division results caused by the randomness of node selection and label update in the traditional label propagation algorithm, an improved two-stage label propagation algorithm based on LeaderRank was proposed in this...

Full description

Bibliographic Details
Main Authors: Miaomiao Liu, Jinyun Yang, Jingfeng Guo, Jing Chen, Yongsheng Zhang
Format: Article
Language:English
Published: PeerJ Inc. 2022-05-01
Series:PeerJ Computer Science
Subjects:
Online Access:https://peerj.com/articles/cs-981.pdf
_version_ 1811340162380070912
author Miaomiao Liu
Jinyun Yang
Jingfeng Guo
Jing Chen
Yongsheng Zhang
author_facet Miaomiao Liu
Jinyun Yang
Jingfeng Guo
Jing Chen
Yongsheng Zhang
author_sort Miaomiao Liu
collection DOAJ
description Abstract To solve the problems of poor stability and low modularity (Q) of community division results caused by the randomness of node selection and label update in the traditional label propagation algorithm, an improved two-stage label propagation algorithm based on LeaderRank was proposed in this study. In the first stage, the order of node updating was determined by the participation coefficient (PC). Then, a new similarity measure was defined to improve the label selection mechanism so as to solve the problem of label oscillation caused by multiple labels of the node with the most similarity to the node. Moreover, the influence of the nodes was comprehensively used to find the initial community structure. In the second stage, the rough communities obtained in the first stage were regarded as nodes, and their merging sequence was determined by the PC. Next, the non-weak community and the community with the largest number of connected edges were combined. Finally, the community structure was further optimized to improve the modularity so as to obtain the final partition result. Experiments were performed on nine classic realistic networks and 19 artificial datasets with different scales, complexities, and densities. The modularity and normalized mutual information (NMI) were used as evaluation indexes for comparing the improved algorithm with dozens of relevant classic algorithms. The results showed that the proposed algorithm yields superior performance, and the results of community partitioning obtained using the improved algorithm were stable and more accurate than those obtained using other algorithms. In addition, the proposed algorithm always performs well in nine large-scale artificial data sets with 6,000 to 50,000 nodes and three large realistic network datasets, which verifies its computational performance and utility in community detection for large-scale networks.
first_indexed 2024-04-13T18:37:58Z
format Article
id doaj.art-66617f5e3eb84d89a340c3476782ddec
institution Directory Open Access Journal
issn 2376-5992
language English
last_indexed 2024-04-13T18:37:58Z
publishDate 2022-05-01
publisher PeerJ Inc.
record_format Article
series PeerJ Computer Science
spelling doaj.art-66617f5e3eb84d89a340c3476782ddec2022-12-22T02:34:49ZengPeerJ Inc.PeerJ Computer Science2376-59922022-05-018e98110.7717/peerj-cs.981An improved two-stage label propagation algorithm based on LeaderRankMiaomiao Liu0Jinyun Yang1Jingfeng Guo2Jing Chen3Yongsheng Zhang4School of Computer and Information Technology, Northeast Petroleum University, Daqing, Heilongjiang, ChinaSchool of Computer and Information Technology, Northeast Petroleum University, Daqing, Heilongjiang, ChinaCollege of Information Science and Engineering, Yanshan University, Qinhuangdao, Hebei, ChinaCollege of Information Science and Engineering, Yanshan University, Qinhuangdao, Hebei, ChinaSchool of Computer and Information Technology, Northeast Petroleum University, Daqing, Heilongjiang, ChinaAbstract To solve the problems of poor stability and low modularity (Q) of community division results caused by the randomness of node selection and label update in the traditional label propagation algorithm, an improved two-stage label propagation algorithm based on LeaderRank was proposed in this study. In the first stage, the order of node updating was determined by the participation coefficient (PC). Then, a new similarity measure was defined to improve the label selection mechanism so as to solve the problem of label oscillation caused by multiple labels of the node with the most similarity to the node. Moreover, the influence of the nodes was comprehensively used to find the initial community structure. In the second stage, the rough communities obtained in the first stage were regarded as nodes, and their merging sequence was determined by the PC. Next, the non-weak community and the community with the largest number of connected edges were combined. Finally, the community structure was further optimized to improve the modularity so as to obtain the final partition result. Experiments were performed on nine classic realistic networks and 19 artificial datasets with different scales, complexities, and densities. The modularity and normalized mutual information (NMI) were used as evaluation indexes for comparing the improved algorithm with dozens of relevant classic algorithms. The results showed that the proposed algorithm yields superior performance, and the results of community partitioning obtained using the improved algorithm were stable and more accurate than those obtained using other algorithms. In addition, the proposed algorithm always performs well in nine large-scale artificial data sets with 6,000 to 50,000 nodes and three large realistic network datasets, which verifies its computational performance and utility in community detection for large-scale networks.https://peerj.com/articles/cs-981.pdfLabel propagationCommunity divisionLeaderRankNode influenceWeak communityModularity
spellingShingle Miaomiao Liu
Jinyun Yang
Jingfeng Guo
Jing Chen
Yongsheng Zhang
An improved two-stage label propagation algorithm based on LeaderRank
PeerJ Computer Science
Label propagation
Community division
LeaderRank
Node influence
Weak community
Modularity
title An improved two-stage label propagation algorithm based on LeaderRank
title_full An improved two-stage label propagation algorithm based on LeaderRank
title_fullStr An improved two-stage label propagation algorithm based on LeaderRank
title_full_unstemmed An improved two-stage label propagation algorithm based on LeaderRank
title_short An improved two-stage label propagation algorithm based on LeaderRank
title_sort improved two stage label propagation algorithm based on leaderrank
topic Label propagation
Community division
LeaderRank
Node influence
Weak community
Modularity
url https://peerj.com/articles/cs-981.pdf
work_keys_str_mv AT miaomiaoliu animprovedtwostagelabelpropagationalgorithmbasedonleaderrank
AT jinyunyang animprovedtwostagelabelpropagationalgorithmbasedonleaderrank
AT jingfengguo animprovedtwostagelabelpropagationalgorithmbasedonleaderrank
AT jingchen animprovedtwostagelabelpropagationalgorithmbasedonleaderrank
AT yongshengzhang animprovedtwostagelabelpropagationalgorithmbasedonleaderrank
AT miaomiaoliu improvedtwostagelabelpropagationalgorithmbasedonleaderrank
AT jinyunyang improvedtwostagelabelpropagationalgorithmbasedonleaderrank
AT jingfengguo improvedtwostagelabelpropagationalgorithmbasedonleaderrank
AT jingchen improvedtwostagelabelpropagationalgorithmbasedonleaderrank
AT yongshengzhang improvedtwostagelabelpropagationalgorithmbasedonleaderrank