Supervised Copy Mechanism for Grammatical Error Correction
AI has introduced a new reform direction for traditional education, such as automating Grammatical Error Correction (GEC) to reduce teachers' workload and improve efficiency. However, current GEC models still fall short because human language is highly variable and the available labeled datasets are often too small for models to learn everything automatically...
Main Authors: | Kamal Al-Sabahi, Kang Yang |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2023-01-01 |
Series: | IEEE Access |
Subjects: | Supervised attention; supervised copy mechanism; grammatical error correction; sequence-to-sequence |
Online Access: | https://ieeexplore.ieee.org/document/10181308/ |
_version_ | 1797774860913475584 |
---|---|
author | Kamal Al-Sabahi; Kang Yang |
author_facet | Kamal Al-Sabahi; Kang Yang |
author_sort | Kamal Al-Sabahi |
collection | DOAJ |
description | AI has introduced a new reform direction for traditional education, such as automating Grammatical Error Correction (GEC) to reduce teachers' workload and improve efficiency. However, current GEC models still fall short because human language is highly variable and the available labeled datasets are often too small for models to learn everything automatically. A key principle of GEC is to preserve the correct parts of the input text while correcting grammatical errors. However, previous sequence-to-sequence (Seq2Seq) models are prone to over-correction because they generate corrections from scratch. Over-correction occurs when a grammatically correct sentence is incorrectly flagged as containing errors, leading to unnecessary edits that can change the meaning or structure of the original sentence. This can significantly reduce the accuracy and usefulness of GEC systems, highlighting the need for approaches that reduce over-correction and produce more accurate, natural corrections. Recently, sequence tagging-based models have been used to mitigate this issue by predicting only the edit operations that convert the source sentence into a corrected one. Despite their good performance on datasets with minimal edits, they struggle to restore texts that require drastic changes: the tag set artificially restricts the types of changes that can be made to a sentence and does not reflect the edits native speakers need for a sentence to sound fluent and natural. Moreover, sequence tagging-based models are usually conditioned on human-designed, language-specific tagging labels, which hinders generalization and fails to capture the real error distribution produced by diverse learners of different nationalities. In this work, we introduce a novel Seq2Seq-based approach that can handle a wide variety of grammatical errors on a low-fluency dataset. Our approach enhances the Seq2Seq architecture with a novel copy mechanism based on supervised attention: instead of merely predicting the next token in context, the model predicts additional correctness-related information for each token. This auxiliary objective propagates into the model's weights during training and requires no extra labels at test time. Experimental results on benchmark datasets show that our model achieves competitive performance compared to state-of-the-art (SOTA) models. |
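The description sketches the core mechanism: a copy-augmented Seq2Seq decoder whose attention is trained with an auxiliary per-token correctness signal. Below is a minimal PyTorch sketch of how such a decoder step and training objective might be wired; the module and names (`CopyDecoderStep`, `correct_head`), the dimensions, and the loss weighting are all illustrative assumptions inferred from the abstract, not the authors' actual implementation.

```python
# Minimal sketch of a copy-augmented decoder step with an auxiliary
# per-token correctness objective. All names, shapes, and the loss
# weighting are illustrative assumptions, not the paper's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CopyDecoderStep(nn.Module):
    def __init__(self, hidden: int, vocab: int):
        super().__init__()
        self.gen_proj = nn.Linear(hidden, vocab)    # generate-from-vocabulary head
        self.copy_gate = nn.Linear(hidden * 2, 1)   # mixes copy vs. generate
        self.attn_proj = nn.Linear(hidden, hidden)  # bilinear attention projection
        self.correct_head = nn.Linear(hidden, 1)    # per-token "is this correct?" head

    def forward(self, dec_h, enc_h, src_ids):
        # dec_h: (B, H) decoder state; enc_h: (B, S, H) encoder states;
        # src_ids: (B, S) source token ids.
        scores = torch.bmm(enc_h, self.attn_proj(dec_h).unsqueeze(-1)).squeeze(-1)
        attn = F.softmax(scores, dim=-1)                           # (B, S)
        context = torch.bmm(attn.unsqueeze(1), enc_h).squeeze(1)   # (B, H)

        p_gen = F.softmax(self.gen_proj(dec_h + context), dim=-1)  # (B, V)
        gate = torch.sigmoid(self.copy_gate(torch.cat([dec_h, context], dim=-1)))

        # Copy distribution: scatter attention mass onto the source token ids,
        # so correct source tokens can be copied instead of regenerated.
        p_copy = torch.zeros_like(p_gen).scatter_add(1, src_ids, attn)
        p_final = gate * p_gen + (1 - gate) * p_copy               # (B, V)

        # Auxiliary correctness prediction; supervised during training only,
        # so no extra labels are needed at test time.
        correctness = torch.sigmoid(self.correct_head(enc_h)).squeeze(-1)  # (B, S)
        return p_final, correctness

# Illustrative training step: token NLL plus a weighted BCE against
# hypothetical per-token correctness labels (1 = keep, 0 = needs editing).
step = CopyDecoderStep(hidden=256, vocab=1000)
dec_h, enc_h = torch.randn(2, 256), torch.randn(2, 7, 256)
src_ids = torch.randint(0, 1000, (2, 7))
keep_labels = torch.randint(0, 2, (2, 7)).float()
target = torch.randint(0, 1000, (2,))

p_final, correctness = step(dec_h, enc_h, src_ids)
nll = F.nll_loss(torch.log(p_final + 1e-9), target)
aux = F.binary_cross_entropy(correctness, keep_labels)
loss = nll + 0.5 * aux  # 0.5 is an arbitrary weight, not taken from the paper
```

Under these assumptions, the copy path gives the model a cheap way to preserve correct spans verbatim, while the auxiliary head pushes the encoder states, and through them the attention, toward distinguishing tokens that should be kept from those that need editing.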
first_indexed | 2024-03-12T22:27:22Z |
format | Article |
id | doaj.art-1b113e8cb1564bf981e32308baaa950a |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-03-12T22:27:22Z |
publishDate | 2023-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-1b113e8cb1564bf981e32308baaa950a 2023-07-21T23:01:02Z eng IEEE, IEEE Access, ISSN 2169-3536, 2023-01-01, vol. 11, pp. 72374-72383, doi:10.1109/ACCESS.2023.3294979, article 10181308. Supervised Copy Mechanism for Grammatical Error Correction. Kamal Al-Sabahi (https://orcid.org/0000-0001-5554-9533), College of Computing and Information Sciences, University of Technology and Applied Sciences-Ibra, Ibra, Oman; Kang Yang, Key Laboratory of Software Engineering for Complex Systems, College of Computer, National University of Defense Technology, Changsha, China. https://ieeexplore.ieee.org/document/10181308/ Supervised attention; supervised copy mechanism; grammatical error correction; sequence-to-sequence |
spellingShingle | Kamal Al-Sabahi Kang Yang Supervised Copy Mechanism for Grammatical Error Correction IEEE Access Supervised attention supervised copy mechanism grammatical error correction sequence-to-sequence |
title | Supervised Copy Mechanism for Grammatical Error Correction |
title_full | Supervised Copy Mechanism for Grammatical Error Correction |
title_fullStr | Supervised Copy Mechanism for Grammatical Error Correction |
title_full_unstemmed | Supervised Copy Mechanism for Grammatical Error Correction |
title_short | Supervised Copy Mechanism for Grammatical Error Correction |
title_sort | supervised copy mechanism for grammatical error correction |
topic | Supervised attention; supervised copy mechanism; grammatical error correction; sequence-to-sequence |
url | https://ieeexplore.ieee.org/document/10181308/ |
work_keys_str_mv | AT kamalalsabahi supervisedcopymechanismforgrammaticalerrorcorrection AT kangyang supervisedcopymechanismforgrammaticalerrorcorrection |