Multistage Mixed-Attention Unsupervised Keyword Extraction for Summary Generation
Summary generation is an important research direction in natural language processing. To address the difficulty of handling redundant information and the inability of existing models to generate high-quality summaries from long text, this work takes BART as the backbone model, constructs an N + 1 coarse–fine-grained multistage summary generation framework, and proposes a multistage mixed-attention unsupervised keyword extraction model for summary generation (MSMAUKE-Summ^N). In the N coarse-grained summary generation stages, a sentence filtering layer (PureText) removes redundant information from long text, and a mixed-attention unsupervised approach iteratively extracts keywords that assist summary inference and enrich the global semantic information of the coarse-grained summaries. In the single fine-grained summary generation stage, a self-attentive keyword selection module (KeywordSelect) picks the keywords with the highest weights to enhance the local semantic representation of the fine-grained summary. Chaining the N coarse-grained stages with the one fine-grained stage yields long-text summaries through multistage generation. Experimental results show that, on summarization datasets such as AMI, ICSI, and QMSum, the model improves ROUGE-1, ROUGE-2, and ROUGE-L by at least 0.75%, 1.48%, and 1.25%, respectively, over the HMNET, TextRank, HAT-BART, DDAMS, and Summ^N models.
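The abstract describes an algorithmic pipeline, so a minimal runnable sketch of the N + 1 staging idea may help. Everything below is a simplified stand-in under stated assumptions, not the authors' published code: frequency counting replaces mixed-attention keyword extraction, near-duplicate filtering replaces PureText, and an extractive sentence picker replaces both the BART backbone and KeywordSelect; all function names are hypothetical.

```python
"""Minimal sketch of the N + 1 coarse-to-fine summarization pipeline.

Assumptions: frequency-based keyword scoring stands in for mixed-attention
extraction; duplicate filtering stands in for PureText; an extractive
sentence picker stands in for the abstractive BART stages and KeywordSelect.
"""
import re
from collections import Counter


def pure_text(sentences: list[str]) -> list[str]:
    """Stand-in for the PureText layer: drop near-duplicate sentences."""
    seen, kept = set(), []
    for s in sentences:
        key = " ".join(sorted(set(re.findall(r"\w+", s.lower()))))
        if key not in seen:
            seen.add(key)
            kept.append(s)
    return kept


def extract_keywords(sentences: list[str], k: int = 10) -> list[str]:
    """Stand-in for mixed-attention unsupervised keyword extraction:
    score words by frequency instead of attention weights."""
    words = re.findall(r"\w+", " ".join(sentences).lower())
    return [w for w, _ in Counter(w for w in words if len(w) > 3).most_common(k)]


def summarize(sentences: list[str], keywords: list[str], budget: int = 5) -> list[str]:
    """Stand-in for keyword-assisted BART inference: keep the sentences
    covering the most keywords (an extractive proxy for generation)."""
    scored = sorted(sentences, key=lambda s: -sum(kw in s.lower() for kw in keywords))
    return scored[:budget]


def msmauke_summ_n(document: str, n_stages: int = 2) -> str:
    sentences = re.split(r"(?<=[.!?])\s+", document)
    for _ in range(n_stages):                      # N coarse-grained stages
        sentences = pure_text(sentences)           # remove redundant information
        kws = extract_keywords(sentences)          # iterative keyword extraction
        sentences = summarize(sentences, kws)      # keyword-assisted coarse summary
    top_kws = extract_keywords(sentences, k=5)     # stand-in for KeywordSelect
    return " ".join(summarize(sentences, top_kws, budget=2))  # 1 fine-grained stage
```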
Main Authors: | Di Wu, Peng Cheng, Yuying Zheng |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2024-03-01 |
Series: | Applied Sciences |
Subjects: | summary generation; multistage; mixed-attention; unsupervised; keyword extraction |
Online Access: | https://www.mdpi.com/2076-3417/14/6/2435 |
Affiliations: | School of Information and Electronic Engineering, Hebei University of Engineering, No. 19 Taiji Road, Handan 056000, China (all three authors) |
ISSN: | 2076-3417 |
DOI: | 10.3390/app14062435 |
Citation: | Applied Sciences, vol. 14, no. 6, article 2435 (2024-03-01) |