Multistage Mixed-Attention Unsupervised Keyword Extraction for Summary Generation
Summary generation is an important research direction in natural language processing. To address the difficulty of handling redundant information and the inability of existing models to generate high-quality summaries from long text, this work takes BART as the backbone model, constructs an N + 1 coarse–fine-grained multistage summary generation framework, and proposes a multistage mixed-attention unsupervised keyword extraction model for summary generation (MSMAUKE-Summ^N). In the N coarse-grained summary generation stages, a sentence filtering layer (PureText) removes redundant information from long text, and a mixed-attention unsupervised approach iteratively extracts keywords that assist summary inference and enrich the global semantic information of the coarse-grained summaries. In the single fine-grained summary generation stage, a self-attentive keyword selection module (KeywordSelect) picks the keywords with the highest weights to enhance the local semantic representation of the fine-grained summary. Chaining the N coarse-grained stages with the one fine-grained stage yields long-text summaries through multistage generation. Experimental results show that, on summarization datasets such as AMI, ICSI, and QMSum, the model improves ROUGE-1, ROUGE-2, and ROUGE-L by at least 0.75%, 1.48%, and 1.25%, respectively, over the HMNET, TextRank, HAT-BART, DDAMS, and Summ^N models.
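The abstract describes an algorithmic pipeline, so a minimal runnable sketch of the N + 1 staging idea may help. Everything below is a simplified stand-in under stated assumptions, not the authors' published code: frequency counting replaces mixed-attention keyword extraction, near-duplicate filtering replaces PureText, and an extractive sentence picker replaces both the BART backbone and KeywordSelect; all function names are hypothetical.

```python
"""Minimal sketch of the N + 1 coarse-to-fine summarization pipeline.

Assumptions: frequency-based keyword scoring stands in for mixed-attention
extraction; duplicate filtering stands in for PureText; an extractive
sentence picker stands in for the abstractive BART stages and KeywordSelect.
"""
import re
from collections import Counter


def pure_text(sentences: list[str]) -> list[str]:
    """Stand-in for the PureText layer: drop near-duplicate sentences."""
    seen, kept = set(), []
    for s in sentences:
        key = " ".join(sorted(set(re.findall(r"\w+", s.lower()))))
        if key not in seen:
            seen.add(key)
            kept.append(s)
    return kept


def extract_keywords(sentences: list[str], k: int = 10) -> list[str]:
    """Stand-in for mixed-attention unsupervised keyword extraction:
    score words by frequency instead of attention weights."""
    words = re.findall(r"\w+", " ".join(sentences).lower())
    return [w for w, _ in Counter(w for w in words if len(w) > 3).most_common(k)]


def summarize(sentences: list[str], keywords: list[str], budget: int = 5) -> list[str]:
    """Stand-in for keyword-assisted BART inference: keep the sentences
    covering the most keywords (an extractive proxy for generation)."""
    scored = sorted(sentences, key=lambda s: -sum(kw in s.lower() for kw in keywords))
    return scored[:budget]


def msmauke_summ_n(document: str, n_stages: int = 2) -> str:
    sentences = re.split(r"(?<=[.!?])\s+", document)
    for _ in range(n_stages):                      # N coarse-grained stages
        sentences = pure_text(sentences)           # remove redundant information
        kws = extract_keywords(sentences)          # iterative keyword extraction
        sentences = summarize(sentences, kws)      # keyword-assisted coarse summary
    top_kws = extract_keywords(sentences, k=5)     # stand-in for KeywordSelect
    return " ".join(summarize(sentences, top_kws, budget=2))  # 1 fine-grained stage
```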
Main Authors: | Di Wu, Peng Cheng, Yuying Zheng |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2024-03-01 |
Series: | Applied Sciences |
Subjects: | summary generation; multistage; mixed-attention; unsupervised; keyword extraction |
Online Access: | https://www.mdpi.com/2076-3417/14/6/2435 |
Affiliations: | School of Information and Electronic Engineering, Hebei University of Engineering, No. 19 Taiji Road, Handan 056000, China (all three authors) |
ISSN: | 2076-3417 |
DOI: | 10.3390/app14062435 |
Citation: | Applied Sciences, vol. 14, no. 6, article 2435 (2024-03-01) |