Automatic construction of real‐world‐based typing‐error test dataset

Abstract: In this study, we aim to automatically construct a test dataset for testing the performance of spelling error correction systems. The Google Web 1T corpus, which includes data on 10 quadrillion phrases, is used for this purpose. Therefore, error words used in the test dataset use error word...

Bibliographic Details
Main Authors: Jung‐Hun Lee, Hyuk‐Chul Kwon
Format: Article
Language: English
Published: Wiley 2022-07-01
Series: Electronics Letters
Online Access: https://doi.org/10.1049/ell2.12515