Toward a Low-Resource Non-Latin-Complete Baseline: An Exploration of Khmer Optical Character Recognition

Many existing text recognition methods rely on the structure of Latin characters and words. Such methods may not be able to deal with non-Latin scripts that have highly complex features, such as character stacking, diacritics, ligatures, non-uniform character widths, and writing without explicit wor...

Bibliographic details
Main authors: Rina Buoy, Masakazu Iwamura, Sovila Srun, Koichi Kise
Format: Article
Language: English
Published: IEEE 2023-01-01
Collection: IEEE Access
Online access: https://ieeexplore.ieee.org/document/10316307/