Exploiting Multi-layer Interactive Attention for Abstractive Text Summarization

Bibliographic Details
Main Authors: HUANG Yuxin, YU Zhengtao, XIANG Yan, GAO Shengxiang, GUO Junjun
Format: Article
Language: Chinese (zho)
Published: Journal of Computer Engineering and Applications Beijing Co., Ltd., Science Press 2020-10-01
Series: Jisuanji kexue yu tansuo
Online Access: http://fcst.ceaj.org/CN/abstract/abstract2402.shtml
Description
Summary: Attention-based encoder-decoder models have been widely used in text summarization, machine translation, and other sequence-to-sequence tasks. In deep learning frameworks, multi-layer neural networks can obtain different feature representations of the input data, so conventional encoder-decoder models usually improve performance by stacking multiple decoder layers. However, existing models attend only to the output of the encoder's last layer when decoding and ignore the information in the other layers. In view of this, this paper proposes a novel abstractive text summarization model based on recurrent neural networks and a multi-layer interactive attention mechanism. The multi-layer interactive attention mechanism extracts contextual information from different levels of the encoder to guide summary generation. To handle the information redundancy introduced by contexts from different levels, a variational information bottleneck is adopted to compress data noise. Finally, experiments on the Gigaword and DUC2004 datasets show that the proposed method achieves state-of-the-art performance.
ISSN: 1673-9418
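
The multi-layer interactive attention described in the abstract can be pictured as attending over every encoder layer rather than only the last one. The sketch below is a minimal, hypothetical illustration in Python/NumPy: it assumes simple dot-product attention and fuses the per-layer contexts with a second attention step. It is not the paper's exact formulation, and the variational information bottleneck is omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(query, keys):
    """Dot-product attention: one context vector over `keys` (T, d) for a single `query` (d,)."""
    scores = keys @ query            # (T,)
    weights = softmax(scores)        # (T,)
    return weights @ keys            # (d,)

def multi_layer_interactive_context(decoder_state, encoder_layers):
    """
    Illustrative two-step attention over all encoder layers (an assumption, not the paper's exact scheme):
    1) attend within each encoder layer to get one context per layer;
    2) attend over the per-layer contexts to fuse them into a single vector.
    `encoder_layers` is a list of (T, d) arrays, one per layer.
    """
    layer_contexts = np.stack(
        [attention(decoder_state, h) for h in encoder_layers]  # (L, d)
    )
    return attention(decoder_state, layer_contexts)             # (d,)

# Toy usage: 3 encoder layers, 5 source tokens, hidden size 8.
rng = np.random.default_rng(0)
layers = [rng.normal(size=(5, 8)) for _ in range(3)]
s_t = rng.normal(size=8)
context = multi_layer_interactive_context(s_t, layers)
print(context.shape)  # (8,)
```

In a full model, such a fused context would feed the decoder at each generation step; per the abstract, a variational information bottleneck would then compress the redundancy introduced by combining contexts from multiple layers.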