End-to-End Historical Handwritten Ethiopic Text Recognition Using Deep Learning
Recognizing handwritten text is a challenging task, especially for scripts with numerous alphabets and symbols. The Ethiopic script has a vast character set and is used for historical documents in typewritten, handwritten, and hand-printed forms. However, despite its importance as an ancient script,...
Main Authors: | Ruchika Malhotra, Maru Tesfaye Addis |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2023-01-01 |
Series: | IEEE Access |
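Subjects: | Deep learning; end-to-end learning; Ethiopic script; handwritten text recognition; pattern recognition |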
Online Access: | https://ieeexplore.ieee.org/document/10247013/ |
_version_ | 1797680894824153088 |
---|---|
author | Ruchika Malhotra; Maru Tesfaye Addis |
author_sort | Ruchika Malhotra |
collection | DOAJ |
description | Recognizing handwritten text is a challenging task, especially for scripts with numerous alphabets and symbols. The Ethiopic script has a vast character set and is used for historical documents in typewritten, handwritten, and hand-printed forms. However, despite its importance as an ancient script, optical character recognition research has not given enough attention to Ethiopic text recognition. In recent years, deep learning (DL) has emerged as a powerful technique for recognizing patterns. In this study, a DL approach is used to recognize historical Ethiopic handwritten texts. The recognition model employs an end-to-end strategy enabling sequential feature extraction and efficient recognition. An attention mechanism coupled with a connectionist temporal classification architecture is the core of this recognition model architecture. In addition, there are seven convolutional neural networks and two recurrent neural networks. We increase the training data using data augmentation techniques to address the data scarcity common in deep learning applications. The experiments include an original training dataset of 79,684 historical handwritten images and an augmented dataset of 10,000 images containing Ethiopic texts. The model used for recognition showed promising results. For “Test Set I” which had 6,150 samples, the character error rate (CER) was 17.95%, and for “Test Set II” which had 15,935 samples, the CER was 29.95%. These outcomes indicate that this approach has the potential to improve the recognition of historical handwritten Ethiopic text. |
first_indexed | 2024-03-11T23:36:53Z |
format | Article |
id | doaj.art-653405bb9a2c43f2b38651f526aeac5d |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-03-11T23:36:53Z |
publishDate | 2023-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
doi | 10.1109/ACCESS.2023.3314334 |
volume | 11 |
pages | 99535-99545 |
orcid | https://orcid.org/0000-0002-9416-9063 |
affiliation | Department of Software Engineering, Delhi Technological University, Delhi, India; Department of Computer Science and Engineering, Delhi Technological University, Delhi, India |
title | End-to-End Historical Handwritten Ethiopic Text Recognition Using Deep Learning |
topic | Deep learning; end-to-end learning; Ethiopic script; handwritten text recognition; pattern recognition |
url | https://ieeexplore.ieee.org/document/10247013/ |
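
The description reports results as a character error rate (CER). As a rough illustration of how such a figure is commonly computed, the sketch below divides the total Levenshtein edit distance between reference and predicted transcriptions by the total number of reference characters. The function names `edit_distance` and `character_error_rate` are hypothetical, and pooling edits over the whole test set (rather than averaging per line) is an assumption; the record does not say which convention the authors used.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted per Unicode code point."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference characters into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def character_error_rate(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """CER in percent, pooled over all lines (an assumed convention)."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same length")
    total_chars = sum(len(r) for r in refs)
    if total_chars == 0:
        raise ValueError("reference text is empty")
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return 100.0 * total_edits / total_chars


if __name__ == "__main__":
    # Illustrative strings only; not taken from the paper's test sets.
    references = ["ሰላም", "ኢትዮጵያ"]
    predictions = ["ሰላም", "ኢትዮጲያ"]
    print(f"CER: {character_error_rate(references, predictions):.2f}%")
```

Each Ethiopic syllable is a single Unicode code point, so comparing Python strings character by character matches the usual notion of a character error for this script.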