ESA-GCN: An Enhanced Graph-Based Node Classification Method for Class Imbalance Using ENN-SMOTE Sampling and an Attention Mechanism

Bibliographic Details
Main Authors: Liying Zhang, Haihang Sun
Format: Article
Language: English
Published: MDPI AG 2023-12-01
Series: Applied Sciences
Online Access: https://www.mdpi.com/2076-3417/14/1/111
Description
Summary: In recent years, graph neural networks (GNNs) have achieved great success in node classification tasks. However, as data grows explosively across industries, class imbalance is becoming increasingly severe. When class distributions are imbalanced, traditional GNNs tend to prioritize majority class nodes and fail to adequately capture the features of minority class nodes, which makes classification considerably more difficult. To address inaccurate edge generation during graph data oversampling, insufficient representation of minority classes, and the presence of noisy samples, this paper proposes the ESA-GCN model. Its advantages are as follows: (i) it employs the combined ENN-SMOTE sampling method to balance the dataset by reducing majority class nodes and increasing minority class nodes; (ii) the ENN algorithm removes low-quality and noisy data, which reduces the classifier’s error rate and improves performance stability; (iii) an attention mechanism is introduced during edge generation between new nodes and original nodes, which accounts for the mutual relationships between nodes and concentrates on a subset of key, highly weighted information, thereby significantly improving classification accuracy while reducing model parameters and computational complexity. Experiments on three public datasets (Cora, Citeseer, and PubMed) demonstrate that ESA-GCN achieves significant results in class-imbalanced node classification tasks.
ISSN: 2076-3417
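
Illustration: The three steps named in the summary (ENN cleaning, SMOTE oversampling, attention-weighted edge generation) can be sketched on plain node feature vectors as below. This is a minimal sketch under stated assumptions, not the ESA-GCN implementation: the function names (`pairwise_sq_dists`, `enn_filter`, `smote`, `attention_edges`), the parameters `k` and `top_k`, the choice to run ENN before SMOTE, and the use of scaled dot-product attention are all hypothetical, since the record does not specify them. The toy data is synthetic and stands in for a real graph dataset.

```python
import numpy as np

def pairwise_sq_dists(a, b):
    # Squared Euclidean distance between every row of a and every row of b.
    return ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)

def enn_filter(x, y, majority_label, k=3):
    """Edited nearest neighbours (assumed variant): drop majority nodes
    unless a strict majority of their k nearest neighbours share their label."""
    d = pairwise_sq_dists(x, x)
    np.fill_diagonal(d, np.inf)            # a node is not its own neighbour
    nn = np.argsort(d, axis=1)[:, :k]      # indices of the k nearest neighbours
    agree = (y[nn] == y[:, None]).sum(axis=1)
    keep = (y != majority_label) | (agree * 2 > k)
    return x[keep], y[keep], keep

def smote(x_min, n_new, k=3, rng=None):
    """SMOTE: each synthetic node is a random interpolation between a minority
    node and one of its k nearest minority neighbours (needs >= 2 nodes)."""
    rng = np.random.default_rng(rng)
    d = pairwise_sq_dists(x_min, x_min)
    np.fill_diagonal(d, np.inf)
    k = min(k, len(x_min) - 1)
    nn = np.argsort(d, axis=1)[:, :k]
    base = rng.integers(0, len(x_min), size=n_new)
    neigh = nn[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))           # interpolation factor in [0, 1)
    return x_min[base] + gap * (x_min[neigh] - x_min[base])

def attention_edges(x_new, x_orig, top_k=2):
    """Assumed scaled dot-product attention from each synthetic node to the
    original nodes; only the top_k highest-weight edges are kept."""
    scores = x_new @ x_orig.T / np.sqrt(x_orig.shape[1])
    scores -= scores.max(axis=1, keepdims=True)   # numerically stable softmax
    w = np.exp(scores)
    w /= w.sum(axis=1, keepdims=True)
    top = np.argsort(-w, axis=1)[:, :top_k]
    return top, np.take_along_axis(w, top, axis=1)

# Toy imbalanced data: 20 majority nodes (label 0), 5 minority nodes (label 1).
rng = np.random.default_rng(0)
x = np.vstack([rng.normal(0, 1, (20, 4)), rng.normal(3, 1, (5, 4))])
y = np.array([0] * 20 + [1] * 5)

x_kept, y_kept, keep = enn_filter(x, y, majority_label=0)
x_min = x_kept[y_kept == 1]
n_new = max(0, int((y_kept == 0).sum() - (y_kept == 1).sum()))
x_new = smote(x_min, n_new, rng=0)
edges, weights = attention_edges(x_new, x_kept)
```

Here `edges[i]` lists the indices (into `x_kept`) of the original nodes that synthetic node `i` would connect to, and `weights[i]` holds the corresponding attention weights. Keeping only a few high-weight edges per synthetic node mirrors the summary's point about concentrating on a subset of key information.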