Toward a Low-Resource Non-Latin-Complete Baseline: An Exploration of Khmer Optical Character Recognition
Many existing text recognition methods rely on the structure of Latin characters and words. Such methods may not be able to deal with non-Latin scripts that have highly complex features, such as character stacking, diacritics, ligatures, non-uniform character widths, and writing without explicit wor...
Main Authors: | |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2023-01-01 |
Collection: | IEEE Access |
Subjects: | |
Online Access: | https://ieeexplore.ieee.org/document/10316307/ |