Automatic construction of real‐world‐based typing‐error test dataset
Abstract: In this study, we aim to automatically construct a test dataset for testing the performance of spelling error correction systems. The Google Web 1T corpus, which includes data on 10 quadrillion phrases, is used for this purpose. Therefore, error words used in the test dataset use error word...
Main Authors: | , |
---|---|
Format: | Article |
Language: | English |
Published: | Wiley, 2022-07-01 |
Series: | Electronics Letters |
Online Access: | https://doi.org/10.1049/ell2.12515 |