End-to-End Historical Handwritten Ethiopic Text Recognition Using Deep Learning

Recognizing handwritten text is a challenging task, especially for scripts with numerous alphabets and symbols. The Ethiopic script has a vast character set and is used for historical documents in typewritten, handwritten, and hand-printed forms. However, despite its importance as an ancient script,...

Bibliographic Details
Main Authors: Ruchika Malhotra, Maru Tesfaye Addis
Format: Article
Language: English
Published: IEEE 2023-01-01
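Series: IEEE Access
Subjects: Deep learning, end-to-end learning, Ethiopic script, handwritten text recognition, pattern recognition
Online Access: https://ieeexplore.ieee.org/document/10247013/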
author Ruchika Malhotra
Maru Tesfaye Addis
collection DOAJ
description Recognizing handwritten text is a challenging task, especially for scripts with numerous alphabets and symbols. The Ethiopic script has a vast character set and is used for historical documents in typewritten, handwritten, and hand-printed forms. However, despite its importance as an ancient script, optical character recognition research has not given enough attention to Ethiopic text recognition. In recent years, deep learning (DL) has emerged as a powerful technique for recognizing patterns. In this study, a DL approach is used to recognize historical Ethiopic handwritten texts. The recognition model employs an end-to-end strategy enabling sequential feature extraction and efficient recognition. An attention mechanism coupled with a connectionist temporal classification architecture is the core of this recognition model architecture. In addition, there are seven convolutional neural networks and two recurrent neural networks. We increase the training data using data augmentation techniques to address the data scarcity common in deep learning applications. The experiments include an original training dataset of 79,684 historical handwritten images and an augmented dataset of 10,000 images containing Ethiopic texts. The model used for recognition showed promising results. For “Test Set I” which had 6,150 samples, the character error rate (CER) was 17.95%, and for “Test Set II” which had 15,935 samples, the CER was 29.95%. These outcomes indicate that this approach has the potential to improve the recognition of historical handwritten Ethiopic text.
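The description reports results as a character error rate (CER). As a minimal sketch of how such a figure is commonly computed, the Python below takes CER to be the total Levenshtein edit distance between reference and predicted transcriptions, divided by the total number of reference characters. The function names and the corpus-level aggregation are assumptions made for illustration; the record does not say how the authors aggregated errors across samples.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    # prev[j] holds the distance between the first i-1 chars of ref
    # and the first j chars of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r
                            curr[j - 1] + 1,      # insert h
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(references: Sequence[str],
                         hypotheses: Sequence[str]) -> float:
    """Corpus-level CER (assumed aggregation): total edits / total reference chars."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(edit_distance(r, h)
                      for r, h in zip(references, hypotheses))
    return total_edits / total_chars
```

Each Ethiopic syllable character is a single Unicode code point, so Python's `len` and per-character comparison operate on whole characters here. Multiplying the returned fraction by 100 gives a percentage in the same form as the figures in the description.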
first_indexed 2024-03-11T23:36:53Z
format Article
id doaj.art-653405bb9a2c43f2b38651f526aeac5d
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-03-11T23:36:53Z
publishDate 2023-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-653405bb9a2c43f2b38651f526aeac5d
2023-09-19T23:01:30Z
eng
IEEE
IEEE Access
2169-3536
2023-01-01
11
99535
99545
10.1109/ACCESS.2023.3314334
10247013
End-to-End Historical Handwritten Ethiopic Text Recognition Using Deep Learning
Ruchika Malhotra (0)
Maru Tesfaye Addis (1)
https://orcid.org/0000-0002-9416-9063
(0) Department of Software Engineering, Delhi Technological University, Delhi, India
(1) Department of Computer Science and Engineering, Delhi Technological University, Delhi, India
https://ieeexplore.ieee.org/document/10247013/
Deep learning
end-to-end learning
Ethiopic script
handwritten text recognition
pattern recognition
title End-to-End Historical Handwritten Ethiopic Text Recognition Using Deep Learning
topic Deep learning
end-to-end learning
Ethiopic script
handwritten text recognition
pattern recognition
url https://ieeexplore.ieee.org/document/10247013/