Heterogeneous Network Embedding Based on Random Walks of Type and Inner Constraint

Bibliographic Details
Main Authors: Xiao Chen, Tong Hao, Li Han, Meng Leng, Jing Chen, Jingfeng Guo
Format: Article
Language: English
Published: MDPI AG 2022-07-01
Series: Mathematics
Online Access: https://www.mdpi.com/2227-7390/10/15/2623
Description
Summary: In heterogeneous networks, random walks based on meta-paths require prior knowledge and lack flexibility, while random walks that do not use meta-paths consider only the number of node types and ignore the influence of the schema and topology between node types in real networks. To address these problems, this paper proposes a novel model, HNE-RWTIC (Heterogeneous Network Embedding Based on Random Walks of Type and Inner Constraint). First, to make the walks flexible, we design a Type strategy, a node type selection strategy based on the co-occurrence probability of node types. Second, to sample nodes uniformly, we design an Inner strategy, a node selection strategy based on the adjacency relationship between nodes. Together, the Type and Inner strategies can reproduce random walks based on meta-paths, keep the walks flexible, and sample node types and nodes uniformly in proportion. Third, we construct a transition probability model from these strategies and obtain the node embeddings from the random walks and Skip-Gram. Finally, we conduct a thorough empirical evaluation of the method on classification and clustering tasks over three real heterogeneous networks. The experimental results show that HNE-RWTIC outperforms state-of-the-art approaches. In the classification task, HNE-RWTIC achieves the highest Micro-F1 and Macro-F1 values on DBLP, AMiner-Top, and Yelp, exceeding those of five other algorithms by 2.25% and 2.43%, 0.85% and 0.99%, and 3.77% and 5.02%, respectively. In the clustering task, the NMI value increases by up to 19.12%, 6.91%, and 0.04% on DBLP, AMiner-Top, and Yelp, respectively.
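The two-stage walk described in the summary can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record does not give the transition probability formulas, so the sketch assumes that the Type step weights each candidate node type by a global count of edges between type pairs, and that the Inner step picks uniformly among the neighbors of the chosen type. The function names (`type_cooccurrence`, `build_adjacency`, `walk`) and the toy graph are hypothetical.

```python
import random
from collections import defaultdict


def type_cooccurrence(edges, node_type):
    """Count, for each ordered pair of node types, the edges joining them.

    Assumption: this edge count stands in for the paper's
    co-occurrence probability of node types.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for u, v in edges:
        counts[node_type[u]][node_type[v]] += 1
        counts[node_type[v]][node_type[u]] += 1
    return counts


def build_adjacency(edges):
    """Build an undirected adjacency list from an edge list."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def walk(start, length, adj, node_type, cooc, rng):
    """Generate one walk of at most `length` nodes starting at `start`."""
    path = [start]
    current = start
    while len(path) < length:
        # Group the current node's neighbors by their type.
        by_type = defaultdict(list)
        for neighbor in adj[current]:
            by_type[node_type[neighbor]].append(neighbor)
        if not by_type:
            break  # dead end: no neighbors to move to
        # Type step (assumed form): choose the next node type with
        # probability proportional to the type-pair edge count.
        types = sorted(by_type)
        weights = [cooc[node_type[current]][t] for t in types]
        next_type = rng.choices(types, weights=weights, k=1)[0]
        # Inner step (assumed form): choose uniformly among the
        # neighbors that have the selected type.
        current = rng.choice(by_type[next_type])
        path.append(current)
    return path


# Toy author (A) / paper (P) / venue (V) network, for illustration only.
node_type = {"a1": "A", "a2": "A", "p1": "P", "p2": "P", "v1": "V"}
edges = [("a1", "p1"), ("a2", "p1"), ("a2", "p2"), ("p1", "v1"), ("p2", "v1")]
adj = build_adjacency(edges)
cooc = type_cooccurrence(edges, node_type)
example_walk = walk("a1", 5, adj, node_type, cooc, random.Random(0))
```

Walks generated this way could then be passed as "sentences" to any Skip-Gram implementation to obtain node embeddings, which is the role Skip-Gram plays in the summary.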
ISSN: 2227-7390