Automatic construction of real‐world‐based typing‐error test dataset
Abstract: In this study, we aim to automatically construct a test dataset for evaluating the performance of spelling error correction systems. The Google Web 1T corpus, which includes data on 10 quadrillion phrases, is used for this purpose. Therefore, error words used in the test dataset use error word...
Main Authors: | Jung‐Hun Lee, Hyuk‐Chul Kwon |
---|---|
Format: | Article |
Language: | English |
Published: | Wiley 2022-07-01 |
Series: | Electronics Letters |
Online Access: | https://doi.org/10.1049/ell2.12515 |
Similar Items
- Human–Robot Labeling Framework to Construct Multitype Real-World Datasets
  by: Ahmed Elsharkawy, et al.
  Published: (2022-01-01)
- Real and synthetic Punjabi speech datasets for automatic speech recognition
  by: Satwinder Singh, et al.
  Published: (2024-02-01)
- Automatic Correction of Real-Word Errors in Spanish Clinical Texts
  by: Daniel Bravo-Candel, et al.
  Published: (2021-04-01)
- SNOWED: Automatically Constructed Dataset of Satellite Imagery for Water Edge Measurements
  by: Gregorio Andria, et al.
  Published: (2023-05-01)
- A Comprehensive Real-World Photometric Stereo Dataset for Unsupervised Anomaly Detection
  by: Junyong Jung, et al.
  Published: (2022-01-01)