Exploiting Multi-layer Interactive Attention for Abstractive Text Summarization
Attention-based encoder-decoder models have been widely used in text summarization, machine translation, and other sequence-to-sequence tasks. In the deep learning framework, a multi-layer neural network can obtain different feature representations of the input data, so in conventional encoder-decoder models, performance is usually improved by stacking multiple encoder and decoder layers. However, existing models attend only to the output of the encoder's last layer when decoding and ignore the information in the other layers. In view of this, this paper proposes a novel abstractive text summarization model based on a recurrent neural network and a multi-layer interactive attention mechanism. The multi-layer interactive attention mechanism extracts contextual information from different encoder layers to guide the generation of the summary. To deal with the information redundancy caused by introducing these different levels of context, a variational information bottleneck is adopted to compress data noise. Finally, experiments on the Gigaword and DUC2004 datasets show that the proposed method achieves state-of-the-art performance.
Main Authors: | HUANG Yuxin, YU Zhengtao, XIANG Yan, GAO Shengxiang, GUO Junjun |
Affiliations: | 1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; 2. Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming 650500, China |
Format: | Article |
Language: | Chinese (zho) |
Published: | Journal of Computer Engineering and Applications Beijing Co., Ltd., Science Press, 2020-10-01 |
Series: | Jisuanji kexue yu tansuo, Vol. 14, No. 10 (2020), pp. 1681-1692 |
DOI: | 10.3778/j.issn.1673-9418.1909008 |
ISSN: | 1673-9418 |
Subjects: | text summarization; encoding and decoding model; multi-layer interactive attention; variational information bottleneck |
Online Access: | http://fcst.ceaj.org/CN/abstract/abstract2402.shtml |
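
The multi-layer interactive attention described in the abstract can be illustrated with a short sketch. The following is a minimal PyTorch example of the general idea (a decoder state attends separately over every encoder layer's outputs, and the per-layer contexts are fused into one context vector); it is not the authors' exact model, and the class name `MultiLayerInteractiveAttention`, the scoring function, and all dimensions are hypothetical.

```python
# Minimal sketch (not the paper's architecture): attend over each encoder
# layer's outputs, then fuse the per-layer contexts into a single vector.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiLayerInteractiveAttention(nn.Module):  # hypothetical name
    def __init__(self, hidden_size, num_enc_layers):
        super().__init__()
        self.score = nn.Linear(hidden_size * 2, 1)  # additive-style scorer
        self.fuse = nn.Linear(hidden_size * num_enc_layers, hidden_size)

    def attend(self, dec_state, enc_outputs):
        # dec_state: (batch, hidden); enc_outputs: (batch, src_len, hidden)
        src_len = enc_outputs.size(1)
        expanded = dec_state.unsqueeze(1).expand(-1, src_len, -1)
        scores = self.score(torch.cat([expanded, enc_outputs], dim=-1)).squeeze(-1)
        weights = F.softmax(scores, dim=-1)  # attention over source positions
        return torch.bmm(weights.unsqueeze(1), enc_outputs).squeeze(1)

    def forward(self, dec_state, all_layer_outputs):
        # all_layer_outputs: list of (batch, src_len, hidden), one per encoder layer
        contexts = [self.attend(dec_state, h) for h in all_layer_outputs]
        return torch.tanh(self.fuse(torch.cat(contexts, dim=-1)))

# Usage with dummy tensors
batch, src_len, hidden, layers = 2, 7, 16, 3
attn = MultiLayerInteractiveAttention(hidden, layers)
enc_layers = [torch.randn(batch, src_len, hidden) for _ in range(layers)]
context = attn(torch.randn(batch, hidden), enc_layers)
print(context.shape)  # torch.Size([2, 16])
```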
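The variational information bottleneck the abstract uses to compress redundant context can be sketched in the same spirit: the fused context is encoded as a diagonal Gaussian, a latent code is sampled with the reparameterization trick, and a KL term toward a standard normal prior penalizes information the decoder does not need. Again this is a minimal illustration under common assumptions, with hypothetical names, not the paper's implementation.

```python
# Minimal sketch of a variational information bottleneck over a context vector.
import torch
import torch.nn as nn

class VariationalBottleneck(nn.Module):  # hypothetical name
    def __init__(self, hidden_size, latent_size):
        super().__init__()
        self.mu = nn.Linear(hidden_size, latent_size)
        self.logvar = nn.Linear(hidden_size, latent_size)

    def forward(self, context):
        mu, logvar = self.mu(context), self.logvar(context)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        # KL( N(mu, sigma^2) || N(0, I) ), averaged over the batch
        kl = -0.5 * torch.mean(torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1))
        return z, kl

bottleneck = VariationalBottleneck(16, 8)
z, kl = bottleneck(torch.randn(2, 16))
print(z.shape, kl.item())  # torch.Size([2, 8]) and a scalar KL penalty
```

In training, the KL term would be added to the summarization loss with a small weight, trading off how much of the multi-layer context the latent code is allowed to carry.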