Hierarchical Keyword Generation Method for Low-Resource Social Media Text

The exponential growth of social media text information presents a challenging issue in terms of retrieving valuable information efficiently. Utilizing deep learning models, we can automatically generate keywords that express core content and topics of social media text, thereby facilitating the ret...


Bibliographic Details
Main Authors: Xinyi Guan, Shun Long
Format: Article
Language: English
Published: MDPI AG 2023-11-01
Series: Information
Subjects: keyword generation, social media text, transfer learning, attention mechanism
Online Access: https://www.mdpi.com/2078-2489/14/11/615
author Xinyi Guan
Shun Long
collection DOAJ
description The exponential growth of social media text information presents a challenging issue in terms of retrieving valuable information efficiently. Utilizing deep learning models, we can automatically generate keywords that express core content and topics of social media text, thereby facilitating the retrieval of critical information. However, the performance of deep learning models is limited by the scarcity of labeled text data in the social media domain. To address this problem, this paper presents a hierarchical keyword generation method for low-resource social media text. Specifically, the text segment is introduced as a hierarchical unit of social media text; it is used both to construct a hierarchical model structure and to design a text segment recovery task for self-supervised training of the model. This not only improves the model's ability to extract features from social media text, but also reduces the keyword generation model's dependence on labeled data in the social media domain. Experimental results on publicly available social media datasets demonstrate that the proposed method improves keyword generation performance even with limited labeled social media data. Further discussion shows that the self-supervised training stage based on the text segment recovery task helps the model adapt to the social media text domain.
first_indexed 2024-03-09T16:44:27Z
format Article
id doaj.art-e4d90e4b8e324b4dbba81d7cf34b3d5f
institution Directory of Open Access Journals
issn 2078-2489
language English
last_indexed 2024-03-09T16:44:27Z
publishDate 2023-11-01
publisher MDPI AG
record_format Article
series Information
spelling doaj.art-e4d90e4b8e324b4dbba81d7cf34b3d5f; 2023-11-24T14:48:17Z; eng; MDPI AG; Information; 2078-2489; 2023-11-01; vol. 14, no. 11, art. 615; doi:10.3390/info14110615; Hierarchical Keyword Generation Method for Low-Resource Social Media Text; Xinyi Guan (0), Shun Long (1); Department of Computer Science, Jinan University, Guangzhou 510632, China (both authors); https://www.mdpi.com/2078-2489/14/11/615; keyword generation; social media text; transfer learning; attention mechanism
title Hierarchical Keyword Generation Method for Low-Resource Social Media Text
topic keyword generation
social media text
transfer learning
attention mechanism
url https://www.mdpi.com/2078-2489/14/11/615