Persian Optical Character Recognition Using Deep Bidirectional Long Short-Term Memory

Optical Character Recognition (OCR) converts images that contain text into editable text and is applied to various languages such as English, Arabic, and Persian. While these languages have similarities, their fundamental differences can create unique challenges. In Persian, continuit...


Bibliographic Details
Main Authors: Zohreh Khosrobeigi, Hadi Veisi, Ehsan Hoseinzade, Hanieh Shabanian
Format: Article
Language: English
Published: MDPI AG 2022-11-01
Series: Applied Sciences
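Subjects: Optical Character Recognition (OCR); Long Short-Term Memory (LSTM); Bidirectional LSTM (BLSTM); Convolution Neural Network (CNN); Persian language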
Online Access: https://www.mdpi.com/2076-3417/12/22/11760
_version_ 1797465904094642176
author Zohreh Khosrobeigi
Hadi Veisi
Ehsan Hoseinzade
Hanieh Shabanian
collection DOAJ
description Optical Character Recognition (OCR) converts images that contain text into editable text and is applied to various languages such as English, Arabic, and Persian. While these languages have similarities, their fundamental differences can create unique challenges. In Persian, continuity between characters, the existence of semicircles, dots, oblique, and left-to-right characters such as English words in the context are some of the most important challenges in designing Persian OCR systems. Our proposed framework, Bina, is designed to address the issue of continuity by combining a Convolution Neural Network (CNN) with a deep bidirectional Long Short-Term Memory (BLSTM) network, a type of LSTM network that has access to both past and future context. A large and diverse dataset of about 2M samples in both Persian and English contexts, covering various fonts and sizes, is also generated to train and test the proposed model. Various configurations are tested to find the optimal structure of the CNN and BLSTM. The results show that Bina outperformed a state-of-the-art baseline algorithm, achieving about 96% accuracy in the Persian context and 88% accuracy in the Persian and English context.
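The phrase "access to both past and future context" in the description can be illustrated with a minimal sketch. This is not the Bina model: it swaps the LSTM cells for a plain scalar tanh recurrence, and the function names, the weights `w` and `u`, and the per-column feature values are all hypothetical, chosen only to show the bidirectional idea.

```python
import math

def recurrent_pass(xs, w=0.5, u=0.8):
    """Run a toy scalar tanh recurrence over xs and return the state at each step.

    Assumption: a plain recurrence stands in for an LSTM cell here.
    """
    states, h = [], 0.0
    for x in xs:
        h = math.tanh(w * x + u * h)
        states.append(h)
    return states

def bidirectional_pass(xs):
    """Pair each position's forward state (past context) with its backward state (future context)."""
    forward = recurrent_pass(xs)
    # Process the reversed sequence, then flip the states back into the original order.
    backward = recurrent_pass(xs[::-1])[::-1]
    return list(zip(forward, backward))

# Hypothetical features, one value per column of a text-line image.
columns = [0.1, 0.9, 0.4, 0.7]
changed = columns[:-1] + [-0.7]  # alter only the final column

before = bidirectional_pass(columns)
after = bidirectional_pass(changed)

# At position 0 the forward state cannot see the altered column; the backward state can.
print(before[0][0] == after[0][0])  # True
print(before[0][1] != after[0][1])  # True
```

In connected scripts the shape of a character can depend on its neighbours on both sides, which is presumably why a representation that sees later columns as well as earlier ones is useful for the continuity issue the description mentions.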
first_indexed 2024-03-09T18:29:06Z
format Article
id doaj.art-30b5f8998b6e41b3857760fd50e76722
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-09T18:29:06Z
publishDate 2022-11-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-30b5f8998b6e41b3857760fd50e76722
2023-11-24T07:40:30Z
eng
MDPI AG
Applied Sciences
2076-3417
2022-11-01
12(22), 11760
10.3390/app122211760
Persian Optical Character Recognition Using Deep Bidirectional Long Short-Term Memory
Zohreh Khosrobeigi: School of Computer Science and Statistics, Trinity College Dublin, D02 YY50 Dublin, Ireland
Hadi Veisi: Faculty of New Sciences and Technologies, University of Tehran, Tehran P.O. Box 14399-56191, Iran
Ehsan Hoseinzade: School of Computing Science, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
Hanieh Shabanian: Computer Science Department, School of Computing and Analytics, Northern Kentucky University, Highland Heights, KY 41076, USA
title Persian Optical Character Recognition Using Deep Bidirectional Long Short-Term Memory
topic Optical Character Recognition (OCR)
Long Short-Term Memory (LSTM)
Bidirectional LSTM (BLSTM)
Convolution Neural Network (CNN)
Persian language
url https://www.mdpi.com/2076-3417/12/22/11760