Supervised Copy Mechanism for Grammatical Error Correction
AI has introduced a new reform direction for traditional education, such as automating Grammatical Error Correction (GEC) to reduce teachers' workload and improve efficiency. However, current GEC models still fall short because human language is highly variable and the available labeled datasets are often too small for models to learn everything automatically...
Main Authors: | Kamal Al-Sabahi, Kang Yang |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2023-01-01 |
Series: | IEEE Access |
Subjects: | Supervised attention; supervised copy mechanism; grammatical error correction; sequence-to-sequence |
Online Access: | https://ieeexplore.ieee.org/document/10181308/ |
_version_ | 1797774860913475584 |
---|---|
author | Kamal Al-Sabahi; Kang Yang |
author_facet | Kamal Al-Sabahi; Kang Yang |
author_sort | Kamal Al-Sabahi |
collection | DOAJ |
description | AI has introduced a new reform direction for traditional education, such as automating Grammatical Error Correction (GEC) to reduce teachers' workload and improve efficiency. However, current GEC models still fall short because human language is highly variable and the available labeled datasets are often too small for models to learn everything automatically. A key principle of GEC is to preserve the correct parts of the input text while correcting grammatical errors. However, previous sequence-to-sequence (Seq2Seq) models are prone to over-correction because they generate corrections from scratch. Over-correction occurs when a grammatically correct sentence is incorrectly flagged as containing errors, leading to unnecessary edits that can change the meaning or structure of the original sentence. This can significantly reduce the accuracy and usefulness of GEC systems, highlighting the need for approaches that reduce over-correction and produce more accurate, natural corrections. Recently, sequence tagging-based models have been used to mitigate this issue by predicting only the edit operations that convert the source sentence into a corrected one. Despite their good performance on datasets with minimal edits, they struggle to restore texts that require drastic changes: the tag set artificially restricts the types of changes that can be made to a sentence and does not reflect the edits native speakers need for a sentence to sound fluent and natural. Moreover, sequence tagging-based models are usually conditioned on human-designed, language-specific tagging labels, which hinders generalization and fails to capture the real error distribution produced by diverse learners of different nationalities. In this work, we introduce a novel Seq2Seq-based approach that can handle a wide variety of grammatical errors on a low-fluency dataset. Our approach enhances the Seq2Seq architecture with a novel copy mechanism based on supervised attention: instead of merely predicting the next token in context, the model predicts additional correctness-related information for each token. This auxiliary objective propagates into the model's weights during training and requires no extra labels at test time. Experimental results on benchmark datasets show that our model achieves competitive performance compared to state-of-the-art (SOTA) models. |
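The description sketches the core mechanism: a copy-augmented Seq2Seq decoder whose attention is trained with an auxiliary per-token correctness signal. Below is a minimal PyTorch sketch of how such a decoder step and training objective might be wired; the module and names (`CopyDecoderStep`, `correct_head`), the dimensions, and the loss weighting are all illustrative assumptions inferred from the abstract, not the authors' actual implementation.

```python
# Minimal sketch of a copy-augmented decoder step with an auxiliary
# per-token correctness objective. All names, shapes, and the loss
# weighting are illustrative assumptions, not the paper's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CopyDecoderStep(nn.Module):
    def __init__(self, hidden: int, vocab: int):
        super().__init__()
        self.gen_proj = nn.Linear(hidden, vocab)    # generate-from-vocabulary head
        self.copy_gate = nn.Linear(hidden * 2, 1)   # mixes copy vs. generate
        self.attn_proj = nn.Linear(hidden, hidden)  # bilinear attention projection
        self.correct_head = nn.Linear(hidden, 1)    # per-token "is this correct?" head

    def forward(self, dec_h, enc_h, src_ids):
        # dec_h: (B, H) decoder state; enc_h: (B, S, H) encoder states;
        # src_ids: (B, S) source token ids.
        scores = torch.bmm(enc_h, self.attn_proj(dec_h).unsqueeze(-1)).squeeze(-1)
        attn = F.softmax(scores, dim=-1)                           # (B, S)
        context = torch.bmm(attn.unsqueeze(1), enc_h).squeeze(1)   # (B, H)

        p_gen = F.softmax(self.gen_proj(dec_h + context), dim=-1)  # (B, V)
        gate = torch.sigmoid(self.copy_gate(torch.cat([dec_h, context], dim=-1)))

        # Copy distribution: scatter attention mass onto the source token ids,
        # so correct source tokens can be copied instead of regenerated.
        p_copy = torch.zeros_like(p_gen).scatter_add(1, src_ids, attn)
        p_final = gate * p_gen + (1 - gate) * p_copy               # (B, V)

        # Auxiliary correctness prediction; supervised during training only,
        # so no extra labels are needed at test time.
        correctness = torch.sigmoid(self.correct_head(enc_h)).squeeze(-1)  # (B, S)
        return p_final, correctness

# Illustrative training step: token NLL plus a weighted BCE against
# hypothetical per-token correctness labels (1 = keep, 0 = needs editing).
step = CopyDecoderStep(hidden=256, vocab=1000)
dec_h, enc_h = torch.randn(2, 256), torch.randn(2, 7, 256)
src_ids = torch.randint(0, 1000, (2, 7))
keep_labels = torch.randint(0, 2, (2, 7)).float()
target = torch.randint(0, 1000, (2,))

p_final, correctness = step(dec_h, enc_h, src_ids)
nll = F.nll_loss(torch.log(p_final + 1e-9), target)
aux = F.binary_cross_entropy(correctness, keep_labels)
loss = nll + 0.5 * aux  # 0.5 is an arbitrary weight, not taken from the paper
```

Under these assumptions, the copy path gives the model a cheap way to preserve correct spans verbatim, while the auxiliary head pushes the encoder states, and through them the attention, toward distinguishing tokens that should be kept from those that need editing.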
first_indexed | 2024-03-12T22:27:22Z |
format | Article |
id | doaj.art-1b113e8cb1564bf981e32308baaa950a |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-03-12T22:27:22Z |
publishDate | 2023-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-1b113e8cb1564bf981e32308baaa950a 2023-07-21T23:01:02Z eng IEEE, IEEE Access, ISSN 2169-3536, 2023-01-01, vol. 11, pp. 72374-72383, doi:10.1109/ACCESS.2023.3294979, article 10181308. Supervised Copy Mechanism for Grammatical Error Correction. Kamal Al-Sabahi (https://orcid.org/0000-0001-5554-9533), College of Computing and Information Sciences, University of Technology and Applied Sciences-Ibra, Ibra, Oman; Kang Yang, Key Laboratory of Software Engineering for Complex Systems, College of Computer, National University of Defense Technology, Changsha, China. https://ieeexplore.ieee.org/document/10181308/ Supervised attention; supervised copy mechanism; grammatical error correction; sequence-to-sequence |
spellingShingle | Kamal Al-Sabahi Kang Yang Supervised Copy Mechanism for Grammatical Error Correction IEEE Access Supervised attention supervised copy mechanism grammatical error correction sequence-to-sequence |
title | Supervised Copy Mechanism for Grammatical Error Correction |
title_full | Supervised Copy Mechanism for Grammatical Error Correction |
title_fullStr | Supervised Copy Mechanism for Grammatical Error Correction |
title_full_unstemmed | Supervised Copy Mechanism for Grammatical Error Correction |
title_short | Supervised Copy Mechanism for Grammatical Error Correction |
title_sort | supervised copy mechanism for grammatical error correction |
topic | Supervised attention; supervised copy mechanism; grammatical error correction; sequence-to-sequence |
url | https://ieeexplore.ieee.org/document/10181308/ |
work_keys_str_mv | AT kamalalsabahi supervisedcopymechanismforgrammaticalerrorcorrection AT kangyang supervisedcopymechanismforgrammaticalerrorcorrection |