Persian Optical Character Recognition Using Deep Bidirectional Long Short-Term Memory

Optical Character Recognition (OCR) is a system for converting images that contain text into editable text, and it is applied to various languages such as English, Arabic, and Persian. While these languages have similarities, their fundamental differences can create unique challenges. In Persian, continuit...

Bibliographic Details
Main Authors:	Zohreh Khosrobeigi, Hadi Veisi, Ehsan Hoseinzade, Hanieh Shabanian
Format:	Article
Language:	English
Published:	MDPI AG 2022-11-01
Series:	Applied Sciences
Subjects:	Optical Character Recognition (OCR); Long Short-Term Memory (LSTM); Bidirectional LSTM (BLSTM); Convolution Neural Network (CNN); Persian language
Online Access:	https://www.mdpi.com/2076-3417/12/22/11760

author	Zohreh Khosrobeigi; Hadi Veisi; Ehsan Hoseinzade; Hanieh Shabanian
collection	DOAJ
description	Optical Character Recognition (OCR) is a system for converting images that contain text into editable text, and it is applied to various languages such as English, Arabic, and Persian. While these languages have similarities, their fundamental differences can create unique challenges. In Persian, continuity between characters, the existence of semicircles, dots, and oblique shapes, and the presence of left-to-right text such as English words in context are some of the most important challenges in designing Persian OCR systems. Our proposed framework, Bina, is designed to address the issue of continuity by combining a Convolution Neural Network (CNN) with a deep bidirectional Long Short-Term Memory (BLSTM) network, a type of LSTM network that has access to both past and future context. A large and diverse dataset of about 2M samples, covering both Persian and English contexts and various fonts and sizes, is also generated to train the proposed model and test its performance. Various configurations are tested to find the optimal structure of the CNN and BLSTM. The results show that Bina outperformed a state-of-the-art baseline algorithm, achieving about 96% accuracy in the Persian context and 88% accuracy in the mixed Persian and English context.
first_indexed	2024-03-09T18:29:06Z
format	Article
id	doaj.art-30b5f8998b6e41b3857760fd50e76722
institution	Directory of Open Access Journals
issn	2076-3417
language	English
last_indexed	2024-03-09T18:29:06Z
publishDate	2022-11-01
publisher	MDPI AG
record_format	Article
series	Applied Sciences
spelling	doaj.art-30b5f8998b6e41b3857760fd50e76722; 2023-11-24T07:40:30Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2022-11-01; 12; 22; 11760; 10.3390/app122211760; Persian Optical Character Recognition Using Deep Bidirectional Long Short-Term Memory; Zohreh Khosrobeigi (0), Hadi Veisi (1), Ehsan Hoseinzade (2), Hanieh Shabanian (3); (0) School of Computer Science and Statistics, Trinity College Dublin, D02 YY50 Dublin, Ireland; (1) Faculty of New Sciences and Technologies, University of Tehran, Tehran P.O. Box 14399-56191, Iran; (2) School of Computing Science, Simon Fraser University, Burnaby, BC V5A 1S6, Canada; (3) Computer Science Department, School of Computing and Analytics, Northern Kentucky University, Highland Heights, KY 41076, USA; https://www.mdpi.com/2076-3417/12/22/11760; Optical Character Recognition (OCR); Long Short-Term Memory (LSTM); Bidirectional LSTM (BLSTM); Convolution Neural Network (CNN); Persian language
title	Persian Optical Character Recognition Using Deep Bidirectional Long Short-Term Memory
topic	Optical Character Recognition (OCR); Long Short-Term Memory (LSTM); Bidirectional LSTM (BLSTM); Convolution Neural Network (CNN); Persian language
url	https://www.mdpi.com/2076-3417/12/22/11760

Similar Items