Exploiting Multi-layer Interactive Attention for Abstractive Text Summarization

Attention-based encoder-decoder models have been widely used in text summarization, machine translation, and other sequence-to-sequence tasks. In the deep learning framework, a multi-layer neural network can learn different feature representations of the input data, so the performance of a conventional encoder-decoder model is usually improved by stacking multiple decoder layers. However, existing models attend only to the output of the last encoder layer during decoding and ignore the information carried by the other layers. In view of this, this paper proposes a novel abstractive text summarization model based on a recurrent neural network and a multi-layer interactive attention mechanism. The multi-layer interactive attention mechanism extracts contextual information from different levels of the encoder to guide the generation of the summary. To deal with the information redundancy introduced by these multi-level contexts, a variational information bottleneck is adopted to compress data noise. Experiments on the Gigaword and DUC2004 datasets show that the proposed method achieves state-of-the-art performance.
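
The two components named in the abstract can be pictured with a minimal PyTorch-style sketch, shown below: a stacked recurrent encoder whose per-layer outputs are attended to separately and then fused, followed by a Gaussian bottleneck on the fused context. The class names, the bilinear attention scoring, and the Gaussian form of the bottleneck are illustrative assumptions of this sketch, not the authors' published implementation.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MultiLayerInteractiveAttention(nn.Module):
        # Attends over the outputs of every encoder layer instead of only the last one,
        # then fuses the per-layer contexts into a single vector for the decoder.
        def __init__(self, hidden_size, num_layers):
            super().__init__()
            # One bilinear scoring matrix per encoder layer (an assumption of this sketch).
            self.score = nn.ModuleList(
                [nn.Linear(hidden_size, hidden_size, bias=False) for _ in range(num_layers)]
            )
            self.fuse = nn.Linear(num_layers * hidden_size, hidden_size)

        def forward(self, dec_state, enc_outputs_per_layer):
            # dec_state: (batch, hidden); enc_outputs_per_layer: list of (batch, src_len, hidden)
            contexts = []
            for layer, enc_out in enumerate(enc_outputs_per_layer):
                scores = torch.bmm(enc_out, self.score[layer](dec_state).unsqueeze(2))
                alpha = F.softmax(scores, dim=1)                        # attention over source positions
                contexts.append(torch.bmm(alpha.transpose(1, 2), enc_out).squeeze(1))
            return torch.tanh(self.fuse(torch.cat(contexts, dim=-1)))   # fused multi-layer context

    class VariationalInformationBottleneck(nn.Module):
        # Compresses the fused context into a stochastic code; the KL term discourages
        # the decoder from relying on redundant or noisy information.
        def __init__(self, hidden_size, bottleneck_size):
            super().__init__()
            self.mu = nn.Linear(hidden_size, bottleneck_size)
            self.logvar = nn.Linear(hidden_size, bottleneck_size)

        def forward(self, context):
            mu, logvar = self.mu(context), self.logvar(context)
            z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)     # reparameterization trick
            kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1).mean()
            return z, kl                                                # kl is added to the training loss

Under these assumptions, z would stand in for the usual single attention context when predicting the next summary word at each decoding step, and kl would be added to the cross-entropy objective with a small weight.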


Bibliographic Details
Main Authors: HUANG Yuxin, YU Zhengtao, XIANG Yan, GAO Shengxiang, GUO Junjun
Affiliations: 1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; 2. Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming 650500, China
Format: Article
Language: Chinese (zho)
Published: Journal of Computer Engineering and Applications Beijing Co., Ltd., Science Press, 2020-10-01
Series: Jisuanji kexue yu tansuo, Vol. 14, No. 10, pp. 1681-1692
ISSN: 1673-9418
DOI: 10.3778/j.issn.1673-9418.1909008
Subjects: text summarization; encoding and decoding model; multi-layer interactive attention; variational information bottleneck
Online Access: http://fcst.ceaj.org/CN/abstract/abstract2402.shtml