Multi-Label Classification of Chinese Rural Poverty Governance Texts Based on XLNet and Bi-LSTM Fused Hierarchical Attention Mechanism

Hierarchical multi-label text classification (HMTC) is a highly relevant and widely discussed topic in the era of big data, particularly for efficiently classifying extensive amounts of text data. This study proposes the HTMC-PGT framework for poverty governance’s single-path hierarchical multi-labe...

Bibliographic Details
Main Authors: Xin Wang, Leifeng Guo
Format: Article
Language: English
Published: MDPI AG 2023-06-01
Series: Applied Sciences
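Subjects: HMTC; XLNet; hierarchical attention mechanism; Bi-LSTM; transfer learning; rural poverty governance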
Online Access: https://www.mdpi.com/2076-3417/13/13/7377
_version_ 1797592195231907840
author Xin Wang
Leifeng Guo
author_facet Xin Wang
Leifeng Guo
author_sort Xin Wang
collection DOAJ
description Hierarchical multi-label text classification (HMTC) is a widely studied problem in the era of big data, particularly for classifying large volumes of text efficiently. This study proposes the HTMC-PGT framework for the single-path hierarchical multi-label classification problem in poverty governance. The framework reduces the HMTC problem to training and combining the multi-class classifiers in a classifier tree. Each independent classifier uses an XLNet pretrained model to extract char-level semantic embeddings of the text and a hierarchical attention mechanism integrated with Bi-LSTM (BiLSTM + HA) to extract document-level semantic embeddings for classification. The study also proposes using transfer learning (TL) between classifiers in the classifier tree. The experimental results show that the proposed XLNet + BiLSTM + HA + FC + TL model achieves micro-P, micro-R, and micro-F1 values of 96.1%, which is 7.5–38.1% higher than those of the baseline models. By combining XLNet, BiLSTM + HA, and TL between classifier tree nodes, the HTMC-PGT framework addresses the hierarchical multi-label classification problem for poverty governance text (PGT) and offers a new approach to the traditional HMTC problem.
first_indexed 2024-03-11T01:48:01Z
format Article
id doaj.art-56154e70f56f4f73a526f3cfd57e82f4
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-11T01:48:01Z
publishDate 2023-06-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-56154e70f56f4f73a526f3cfd57e82f4
2023-11-18T16:05:32Z
eng
MDPI AG
Applied Sciences
2076-3417
2023-06-01
13
13
7377
10.3390/app13137377
Multi-Label Classification of Chinese Rural Poverty Governance Texts Based on XLNet and Bi-LSTM Fused Hierarchical Attention Mechanism
Xin Wang (Agricultural Information Institute of Chinese Academy of Agricultural Sciences, Beijing 100081, China)
Leifeng Guo (Agricultural Information Institute of Chinese Academy of Agricultural Sciences, Beijing 100081, China)
https://www.mdpi.com/2076-3417/13/13/7377
HMTC
XLNet
hierarchical attention mechanism
Bi-LSTM
transfer learning
rural poverty governance
title Multi-Label Classification of Chinese Rural Poverty Governance Texts Based on XLNet and Bi-LSTM Fused Hierarchical Attention Mechanism
topic HMTC
XLNet
hierarchical attention mechanism
Bi-LSTM
transfer learning
rural poverty governance
url https://www.mdpi.com/2076-3417/13/13/7377