Warm-Starting for Improving the Novelty of Abstractive Summarization

Abstractive summarization is distinguished by using novel phrases that are not found in the source text. However, most previous research ignores this feature in favour of enhancing syntactical similarity with the reference. To improve novelty aspects, we have used multiple warm-started models with v...

Full description

Bibliographic Details
Main Authors: Ayham Alomari, Ahmad Sami Al-Shamayleh, Norisma Idris, Aznul Qalid Md Sabri, Izzat Alsmadi, Danah Omary
Format: Article
Language:English
Published: IEEE 2023-01-01
Series:IEEE Access
Subjects:
Online Access:https://ieeexplore.ieee.org/document/10272590/
Description
Summary:Abstractive summarization is distinguished by using novel phrases that are not found in the source text. However, most previous research ignores this feature in favour of enhancing syntactical similarity with the reference. To improve novelty aspects, we have used multiple warm-started models with varying encoder and decoder checkpoints and vocabulary. These models are then adapted to the paraphrasing task and the sampling decoding strategy to further boost the levels of novelty and quality. In addition, to avoid relying only on the syntactical similarity assessment, two additional abstractive summarization metrics are introduced: 1) NovScore: a new novelty metric that delivers a summary novelty score; and 2) NSSF: a new comprehensive metric that ensembles Novelty, Syntactic, Semantic, and Faithfulness features into a single score to simulate human assessment in providing a reliable evaluation. Finally, we compare our models to the state-of-the-art sequence-to-sequence models using the current and the proposed metrics. As a result, warm-starting, sampling, and paraphrasing improve novelty degrees by 2%, 5%, and 14%, respectively, while maintaining comparable scores on other metrics.
ISSN:2169-3536