Abstractive summarization model considering hybrid lexical features

In order to use lexical features (including n-gram and part of speech information) to identify more key vocabulary content in the summarization generation process to further improve the quality of the summarization, an algorithm based on sequence-to-sequence (Seq2Seq) structure and attention mechani...

Bibliographic Details
Main Authors: Yuehua JIANG, Lei DING, Jiaoe LI, Haoxuan DU, Kai GAO
Format: Article
Language: zho
Published: Hebei University of Science and Technology 2019-04-01
Series: Journal of Hebei University of Science and Technology
Subjects: natural language processing; text summarization; attention mechanism; LSTM; CNN
Online Access: http://xuebao.hebust.edu.cn/hbkjdx/ch/reader/create_pdf.aspx?file_no=b201902009&flag=1&journal_
_version_ 1828848488594538496
author Yuehua JIANG
Lei DING
Jiaoe LI
Haoxuan DU
Kai GAO
author_sort Yuehua JIANG
collection DOAJ
description To use lexical features (including n-gram and part-of-speech information) to identify more key vocabulary during summary generation and thereby improve summary quality, an algorithm is proposed that is based on the sequence-to-sequence (Seq2Seq) structure with an attention mechanism and that incorporates lexical features. The input layer of the algorithm combines the part-of-speech vector with the word vector, and the result is the input to the encoder layer. The encoder layer is a bi-directional LSTM, and the context vector is composed of the encoder output and the lexical feature vector extracted by a convolutional neural network. In the model, the convolutional neural network layer handles lexical information, the bi-directional LSTM handles sentence information, and the decoder layer uses a unidirectional LSTM to decode the context vector and generate the summary. Experiments on a public dataset and a self-collected dataset show that the summary generation model that considers lexical features performs better than the comparison model. The ROUGE-1, ROUGE-2 and ROUGE-L scores on the public dataset are improved by 0.024, 0.033 and 0.030, respectively. Summary generation is therefore related not only to the semantics and themes of the article but also to its lexical features. The proposed model offers a reference for research on generating summaries that integrate key information.
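The data flow described above (word vector combined with part-of-speech vector, n-gram features from a convolutional layer, context vector built from the encoder output plus those features) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the record does not say how the vectors are combined, so plain concatenation is used; the function names are hypothetical; the bi-directional LSTM encoder is not implemented and its output is passed in as a ready-made list; and the convolution is a simple ReLU filter with max-pooling over time, which may differ from the authors' layer.

```python
def combine_word_and_pos(word_vec, pos_vec):
    """Input layer sketch: join a word vector and a part-of-speech vector.

    Assumption: "combines" is modeled as concatenation.
    """
    return list(word_vec) + list(pos_vec)


def ngram_conv_features(token_vecs, filters, width):
    """Lexical feature sketch: 1-D convolution over windows of `width` tokens.

    Each filter is a flat list of length width * len(token_vecs[0]).
    Returns one max-pooled ReLU activation per filter.
    """
    if len(token_vecs) < width:
        raise ValueError("sequence is shorter than the n-gram width")
    # Flatten every window of `width` consecutive token vectors.
    windows = [
        [x for vec in token_vecs[i:i + width] for x in vec]
        for i in range(len(token_vecs) - width + 1)
    ]
    features = []
    for filt in filters:
        activations = [
            max(0.0, sum(w * x for w, x in zip(filt, window)))  # ReLU
            for window in windows
        ]
        features.append(max(activations))  # max-pooling over time
    return features


def build_context(encoder_output, lexical_features):
    """Context vector sketch: encoder output joined with the CNN features.

    Assumption: the two parts are concatenated; `encoder_output` stands in
    for the bi-directional LSTM output, which is not computed here.
    """
    return list(encoder_output) + list(lexical_features)


# Toy usage with 2-dimensional word vectors and 1-dimensional POS vectors.
word_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pos_vecs = [[1.0], [0.0], [1.0]]
inputs = [combine_word_and_pos(w, p) for w, p in zip(word_vecs, pos_vecs)]
bigram_filters = [[1.0] * 6, [-1.0] * 6]  # two hypothetical bigram filters
lexical = ngram_conv_features(inputs, bigram_filters, width=2)
context = build_context([0.5, -0.5], lexical)
```

A unidirectional LSTM decoder would then consume `context`; that step is omitted because the record gives no detail on it.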
first_indexed 2024-12-12T22:31:07Z
format Article
id doaj.art-e54d53649e3742658c532cf055698970
institution Directory Open Access Journal
issn 1008-1542
language zho
last_indexed 2024-12-12T22:31:07Z
publishDate 2019-04-01
publisher Hebei University of Science and Technology
record_format Article
series Journal of Hebei University of Science and Technology
spelling doaj.art-e54d53649e3742658c532cf055698970
2022-12-22T00:09:36Z
zho
Hebei University of Science and Technology
Journal of Hebei University of Science and Technology
1008-1542
2019-04-01
Vol. 40, No. 2, pp. 152-158
10.7535/hbkd.2019yx02009
b201902009
Abstractive summarization model considering hybrid lexical features
Yuehua JIANG (School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang, Hebei 050018, China)
Lei DING (Information Center of Shijiazhuang Public Security Bureau, Shijiazhuang, Hebei 050021, China)
Jiaoe LI (School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang, Hebei 050018, China)
Haoxuan DU (Xi'dian University, Xi'an, Shaanxi 710126, China)
Kai GAO (School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang, Hebei 050018, China)
To use lexical features (including n-gram and part-of-speech information) to identify more key vocabulary during summary generation and thereby improve summary quality, an algorithm is proposed that is based on the sequence-to-sequence (Seq2Seq) structure with an attention mechanism and that incorporates lexical features. The input layer of the algorithm combines the part-of-speech vector with the word vector, and the result is the input to the encoder layer. The encoder layer is a bi-directional LSTM, and the context vector is composed of the encoder output and the lexical feature vector extracted by a convolutional neural network. In the model, the convolutional neural network layer handles lexical information, the bi-directional LSTM handles sentence information, and the decoder layer uses a unidirectional LSTM to decode the context vector and generate the summary. Experiments on a public dataset and a self-collected dataset show that the summary generation model that considers lexical features performs better than the comparison model. The ROUGE-1, ROUGE-2 and ROUGE-L scores on the public dataset are improved by 0.024, 0.033 and 0.030, respectively. Summary generation is therefore related not only to the semantics and themes of the article but also to its lexical features. The proposed model offers a reference for research on generating summaries that integrate key information.
http://xuebao.hebust.edu.cn/hbkjdx/ch/reader/create_pdf.aspx?file_no=b201902009&flag=1&journal_
natural language processing
text summarization
attention mechanism
LSTM
CNN
title Abstractive summarization model considering hybrid lexical features
title_sort abstractive summarization model considering hybrid lexical features
topic natural language processing
text summarization
attention mechanism
LSTM
CNN
url http://xuebao.hebust.edu.cn/hbkjdx/ch/reader/create_pdf.aspx?file_no=b201902009&flag=1&journal_
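The description reports gains of 0.024, 0.033 and 0.030 in ROUGE-1, ROUGE-2 and ROUGE-L. As a reminder of what these metrics measure, here is a minimal sketch of n-gram overlap and longest-common-subsequence scoring. It assumes the recall variant and whitespace-separated tokens; the record does not state which ROUGE variant, tokenization, or toolkit the authors used, and the function names and example sentences are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def rouge_n_recall(candidate, reference, n):
    """Clipped n-gram overlap divided by the number of reference n-grams."""
    ref_counts = Counter(ngrams(reference, n))
    cand_counts = Counter(ngrams(candidate, n))
    total = sum(ref_counts.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand_counts[gram]) for gram, count in ref_counts.items())
    return overlap / total


def lcs_length(a, b):
    """Length of the longest common subsequence (row-by-row dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def rouge_l_recall(candidate, reference):
    """LCS length divided by the reference length."""
    if not reference:
        return 0.0
    return lcs_length(candidate, reference) / len(reference)


# Hypothetical example summaries.
reference = "the cat sat on the mat".split()
candidate = "the cat lay on the mat".split()
r1 = rouge_n_recall(candidate, reference, 1)  # 5 of 6 unigrams matched
r2 = rouge_n_recall(candidate, reference, 2)  # 3 of 5 bigrams matched
rl = rouge_l_recall(candidate, reference)     # LCS "the cat on the mat" has length 5
```

Published evaluations often report the F-measure rather than recall alone, so the numbers from this sketch are not directly comparable to the scores in the record.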