Supervised Copy Mechanism for Grammatical Error Correction

Bibliographic Details
Main Authors: Kamal Al-Sabahi, Kang Yang
Format: Article
Language: English
Published: IEEE, 2023-01-01
Series: IEEE Access
Subjects: Supervised attention; supervised copy mechanism; grammatical error correction; sequence-to-sequence
Online Access: https://ieeexplore.ieee.org/document/10181308/
Collection: DOAJ (Directory of Open Access Journals)
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2023.3294979
Volume: 11, Pages: 72374-72383
Author Affiliations: Kamal Al-Sabahi (ORCID: 0000-0001-5554-9533), College of Computing and Information Sciences, University of Technology and Applied Sciences-Ibra, Ibra, Oman; Kang Yang, Key Laboratory of Software Engineering for Complex Systems, College of Computer, National University of Defense Technology, Changsha, China

Description
AI has introduced a new reform direction for traditional education, such as automating Grammatical Error Correction (GEC) to reduce teachers’ workload and improve efficiency. However, current GEC models still have flaws: human language is highly variable, and the available labeled datasets are often too small for everything to be learned automatically. A key principle of GEC is to preserve the correct parts of the input text while correcting grammatical errors. However, previous sequence-to-sequence (Seq2Seq) models can be prone to over-correction because they generate corrections from scratch. Over-correction occurs when a grammatically correct sentence is incorrectly flagged as containing errors, leading to unnecessary edits that can change the meaning or structure of the original sentence. This can significantly reduce the accuracy and usefulness of GEC systems, highlighting the need for approaches that reduce over-correction and produce more accurate, natural corrections. Recently, sequence tagging-based models have been used to mitigate this issue by predicting only the edit operations that convert the source sentence into a corrected one. Despite their good performance on datasets with minimal edits, they struggle to restore texts that require drastic changes. This artificially restricts the types of changes that can be made to a sentence and does not reflect the edits needed for native speakers to find sentences fluent or natural sounding. Moreover, sequence tagging-based models are usually conditioned on human-designed, language-specific tagging labels, which hinders generalization and fails to capture the real error distribution produced by diverse learners from different nationalities. In this work, we introduce a novel Seq2Seq-based approach that can handle a wide variety of grammatical errors on a low-fluency dataset. Our approach enhances the Seq2Seq architecture with a novel copy mechanism based on a supervised attention approach. Instead of merely predicting the next token in context, the model predicts additional correctness-related information for each token. This auxiliary objective propagates into the weights of the model during training without requiring extra labels at testing time. Experimental results on benchmark datasets show that our model achieves competitive performance compared to state-of-the-art (SOTA) models.
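
The abstract describes the mechanism only at a high level: a Seq2Seq decoder augmented with a copy mechanism whose attention is additionally supervised with per-token correctness information during training. The PyTorch sketch below illustrates one plausible reading of that idea; the class name SupervisedCopyDecoderStep, the correctness_loss head, and the keep_labels are illustrative assumptions for exposition, not the authors' published architecture.

```python
# Minimal sketch (assumed names, not the paper's code) of one Seq2Seq decoding
# step with a copy mechanism plus a supervised, correctness-related auxiliary
# objective, in the spirit of the abstract above.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SupervisedCopyDecoderStep(nn.Module):
    """Mixes generating from the vocabulary with copying source tokens, and
    exposes an auxiliary per-token correctness head used only in training."""

    def __init__(self, hidden_dim: int, vocab_size: int):
        super().__init__()
        self.attn = nn.Linear(hidden_dim, hidden_dim, bias=False)  # bilinear attention
        self.gen_proj = nn.Linear(hidden_dim, vocab_size)          # generation distribution
        self.copy_gate = nn.Linear(2 * hidden_dim, 1)              # p(copy) vs. p(generate)
        self.correct_head = nn.Linear(hidden_dim, 1)               # auxiliary: is token correct?

    def forward(self, dec_state, enc_states, src_token_ids):
        # dec_state: (B, H); enc_states: (B, S, H); src_token_ids: (B, S) int64
        scores = torch.bmm(enc_states, self.attn(dec_state).unsqueeze(2)).squeeze(2)
        attn = F.softmax(scores, dim=1)                            # (B, S) copy attention
        context = torch.bmm(attn.unsqueeze(1), enc_states).squeeze(1)

        p_vocab = F.softmax(self.gen_proj(dec_state), dim=1)       # (B, V)
        p_copy = torch.sigmoid(self.copy_gate(torch.cat([dec_state, context], dim=1)))

        # Scatter the copy-attention mass onto the vocabulary ids of the
        # source tokens, then mix the two distributions with the copy gate.
        p_copy_vocab = torch.zeros_like(p_vocab).scatter_add(1, src_token_ids, attn)
        p_final = (1.0 - p_copy) * p_vocab + p_copy * p_copy_vocab
        return p_final, attn

    def correctness_loss(self, enc_states, keep_labels):
        # keep_labels: (B, S) floats, 1.0 where a source token is correct and
        # should be preserved, 0.0 where it is erroneous. Such labels can be
        # derived from the source/target alignment of the training pairs, so
        # nothing extra is needed at test time.
        logits = self.correct_head(enc_states).squeeze(2)
        return F.binary_cross_entropy_with_logits(logits, keep_labels)


if __name__ == "__main__":
    B, S, H, V = 2, 5, 16, 100
    step = SupervisedCopyDecoderStep(H, V)
    enc = torch.randn(B, S, H)
    p_final, attn = step(torch.randn(B, H), enc, torch.randint(0, V, (B, S)))
    aux = step.correctness_loss(enc, torch.randint(0, 2, (B, S)).float())
    print(p_final.sum(dim=1), aux.item())  # each row of p_final sums to 1
```

Note that the auxiliary head touches only the encoder states: the correctness signal propagates into the shared weights during training, and the head can simply be ignored at inference, matching the abstract's claim that no extra labels are required at testing time.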