Adaptive micro- and macro-knowledge incorporation for hierarchical text classification

Bibliographic Details
Main Authors: Feng, Zijian; Mao, Kezhi; Zhou, Hanzhang
Other Authors: School of Electrical and Electronic Engineering
Format: Journal Article
Language: English
Published: 2024
Online Access: https://hdl.handle.net/10356/175778
Description
Summary: Hierarchical text classification (HTC) aims to classify a text into multiple categories organized in a hierarchical structure. State-of-the-art HTC methods usually employ graph networks, in which label graphs are constructed and label representations are learned to interact with text representations for classification. In general, label graphs are built on the intrinsic label hierarchy, label semantic similarity, or label co-occurrence. Such graphs have proven effective, but they exploit only knowledge from the training data or simple label descriptions and ignore the vast external knowledge available in open sources. External knowledge from open sources can supply complementary information that enhances the label graph's representation power. Motivated by these considerations, we explore the use of external knowledge for improving HTC. We categorize knowledge into micro-knowledge and macro-knowledge, defined respectively as the fundamental concepts related to a single class label and the correlations among class labels. For tailor-made incorporation of the two types of knowledge into representation learning and classification, we propose the Adaptive Micro- and Macro-Knowledge Incorporation for Hierarchical Text Classification (AMKI-HTC) model. Micro-knowledge incorporation helps capture class-relevant keywords in the text and hence produce discriminative representations, while macro-knowledge incorporation improves the accuracy of label graphs. Finally, a confidence maximization fusion strategy is developed for adaptive aggregation of multi-view features. Extensive experiments on three benchmark HTC datasets demonstrate that AMKI-HTC consistently outperforms state-of-the-art models.