Exploiting Script Similarities to Compensate for the Large Amount of Data in Training Tesseract LSTM: Towards Kurdish OCR

Applications based on Long Short-Term Memory (LSTM) require large amounts of training data. Tesseract LSTM is a popular Optical Character Recognition (OCR) engine that has been trained and used for various languages. However, its training becomes obstructed when the target language is not r...

Bibliographic Details
Main Authors: Saman Idrees, Hossein Hassani
Format: Article
Language: English
Published: MDPI AG 2021-10-01
Series: Applied Sciences
Online Access: https://www.mdpi.com/2076-3417/11/20/9752