Toward a Low-Resource Non-Latin-Complete Baseline: An Exploration of Khmer Optical Character Recognition

Many existing text recognition methods rely on the structure of Latin characters and words. Such methods may not be able to deal with non-Latin scripts that have highly complex features, such as character stacking, diacritics, ligatures, non-uniform character widths, and writing without explicit wor...

Bibliographic Details
Main Authors:	Rina Buoy, Masakazu Iwamura, Sovila Srun, Koichi Kise
Format:	Article
Language:	English
Published:	IEEE 2023-01-01
Collection:	IEEE Access
Subjects:	Khmer script; non-Latin scripts; character stacking; no explicit word boundaries; text recognition; image chunking
Online Access:	https://ieeexplore.ieee.org/document/10316307/
