Heterogeneous Graph Purification Network: Purifying Noisy Heterogeneity without Metapaths

Bibliographic Details
Main Authors: Sirui Shen, Daobin Zhang, Shuchao Li, Pengcheng Dong, Qing Liu, Xiaoyu Li, Zequn Zhang
Format: Article
Language: English
Published: MDPI AG 2023-03-01
Series: Applied Sciences
Online Access: https://www.mdpi.com/2076-3417/13/6/3989
Description
Summary: Heterogeneous graph neural networks (HGNNs) can model many complex real-world systems by embedding the rich structural and semantic information of a heterogeneous graph into low-dimensional representations. However, existing HGNNs struggle to avoid hand-crafted metapaths while also resisting structural and informational noise in a heterogeneous graph. In this paper, we propose a novel framework called Heterogeneous Graph Purification Network (HGPN), which aims to resolve this dilemma by adaptively purifying the noisy heterogeneity. Specifically, instead of relying on hand-crafted metapaths, HGPN models heterogeneity by subgraph decomposition and adopts inter-subgraph and intra-subgraph aggregation methods. HGPN learns to purify noisy edges based on semantic information with a parallel heterogeneous structure purification mechanism. In addition, we design a neighborhood-related dynamic residual update method, a type-specific normalization module, and a cluster-aware loss to help all types of nodes achieve high-quality representations and maintain their feature distributions while preventing feature over-mixing. Extensive experiments on four common heterogeneous graph datasets show that our approach outperforms existing methods and consistently achieves state-of-the-art performance across all datasets.
ISSN: 2076-3417
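
The summary describes modeling heterogeneity through subgraph decomposition followed by intra-subgraph and inter-subgraph aggregation. As a rough illustration of that general idea only, the sketch below splits a toy typed edge list into one subgraph per relation, averages neighbor features inside each subgraph, and then averages the results across subgraphs. It is not HGPN itself: the function names, the toy graph, and the use of plain unweighted means in place of learned aggregation are all assumptions made for this example, and the purification, residual update, normalization, and loss components are omitted.

```python
from collections import defaultdict


def decompose_by_relation(edges):
    """Split typed edges (src, relation, dst) into one edge list per relation."""
    subgraphs = defaultdict(list)
    for src, relation, dst in edges:
        subgraphs[relation].append((src, dst))
    return dict(subgraphs)


def intra_subgraph_mean(features, sub_edges):
    """Within one subgraph, average the source features arriving at each node."""
    incoming = defaultdict(list)
    for src, dst in sub_edges:
        incoming[dst].append(features[src])
    return {
        dst: [sum(column) / len(vectors) for column in zip(*vectors)]
        for dst, vectors in incoming.items()
    }


def inter_subgraph_mean(per_relation):
    """Across subgraphs, average the per-relation messages each node received."""
    collected = defaultdict(list)
    for messages in per_relation.values():
        for node, vector in messages.items():
            collected[node].append(vector)
    return {
        node: [sum(column) / len(vectors) for column in zip(*vectors)]
        for node, vectors in collected.items()
    }


# Hypothetical toy graph: two authors and one venue linked to one paper.
# All node types are assumed to share the same feature dimension.
edges = [
    ("a1", "writes", "p1"),
    ("a2", "writes", "p1"),
    ("v1", "publishes", "p1"),
]
features = {"a1": [1.0, 0.0], "a2": [3.0, 2.0], "v1": [0.0, 4.0]}

subgraphs = decompose_by_relation(edges)
per_relation = {
    relation: intra_subgraph_mean(features, sub_edges)
    for relation, sub_edges in subgraphs.items()
}
result = inter_subgraph_mean(per_relation)
print(result)
```

Keeping each relation in its own subgraph means no metapath has to be specified by hand; a learned model would replace the two mean operations with trainable, possibly attention-weighted, aggregators.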