Attention-Based CNN-RNN Arabic Text Recognition from Natural Scene Images

Bibliographic Details
Main Authors: Hanan Butt, Muhammad Raheel Raza, Muhammad Javed Ramzan, Muhammad Junaid Ali, Muhammad Haris
Format: Article
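Language: English
Published: MDPI AG 2021-07-01
Series: Forecasting
Subjects: image text recognition; deep learning; recurrent neural networks (RNNs); convolutional neural networks (CNNs); bidirectional RNN; attention mechanism
Online Access: https://www.mdpi.com/2571-9394/3/3/33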
author Hanan Butt
Muhammad Raheel Raza
Muhammad Javed Ramzan
Muhammad Junaid Ali
Muhammad Haris
author_sort Hanan Butt
collection DOAJ
description According to statistics, there are 422 million speakers of the Arabic language. Islam is the second-largest religion in the world, and its followers constitute approximately 25% of the world’s population. Since the Holy Quran is in Arabic, nearly all Muslims understand the Arabic language, according to some analyses. Arabic is also the native and official language of many countries. In recent years, the number of Arabic-speaking internet users has increased, but there has been very little work on the language owing to some complications. It is challenging to build a robust recognition system (RS) for cursive languages such as Arabic. The challenge grows when a dataset contains variations in text size, font, color, orientation, lighting conditions, noise, and so on. Deep learning models help with these variations: they show notable results in data modeling and can handle large datasets. Convolutional neural networks (CNNs) and recurrent neural networks (RNNs) can, respectively, select good features and learn from sequential data. These two kinds of network offer impressive results in many research areas, such as text recognition, voice recognition, and several Natural Language Processing (NLP) tasks. This paper presents a CNN-RNN model with an attention mechanism for Arabic image text recognition. The model takes an input image and generates feature sequences through a CNN. These sequences are passed to a bidirectional RNN, which produces ordered feature sequences. The bidirectional RNN can miss some preprocessing of text segmentation. Therefore, a bidirectional RNN with an attention mechanism is used to generate the output, enabling the model to select relevant information from the feature sequences. The attention mechanism allows end-to-end training with a standard backpropagation algorithm.
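As a rough illustration of the attention step described in the abstract (selecting relevant information from a feature sequence), the sketch below computes a weighted summary of a few feature vectors. It assumes dot-product scoring, which the record does not specify; the names `attend`, `features`, and `query` and all numeric values are hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, features):
    """Dot-product attention (an assumed scoring rule, not the paper's).

    Scores each feature vector against the query, normalizes the scores
    with softmax, and returns the weights plus the weighted sum (context).
    """
    scores = [sum(q * f for q, f in zip(query, feat)) for feat in features]
    weights = softmax(scores)
    dim = len(features[0])
    context = [
        sum(w * feat[i] for w, feat in zip(weights, features))
        for i in range(dim)
    ]
    return weights, context


# Hypothetical stand-in for bidirectional RNN outputs: 3 time steps, 2 dims.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Hypothetical decoder state used as the query.
query = [0.0, 2.0]

weights, context = attend(query, features)
print(weights)
print(context)
```

Every operation here (dot product, softmax, weighted sum) is differentiable, which is consistent with the abstract's statement that the model can be trained end to end with standard backpropagation.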
first_indexed 2024-03-10T07:40:39Z
format Article
id doaj.art-6cbfa48e9cd145d1aae0bf7c1bfb1204
institution Directory of Open Access Journals
issn 2571-9394
language English
last_indexed 2024-03-10T07:40:39Z
publishDate 2021-07-01
publisher MDPI AG
record_format Article
series Forecasting
spelling doaj.art-6cbfa48e9cd145d1aae0bf7c1bfb1204 2023-11-22T13:06:29Z eng
MDPI AG, Forecasting, ISSN 2571-9394, 2021-07-01, vol. 3, no. 3, pp. 520-540, doi:10.3390/forecast3030033
Attention-Based CNN-RNN Arabic Text Recognition from Natural Scene Images
Hanan Butt (Department of Computer Science, COMSATS University Islamabad, Islamabad 45550, Pakistan)
Muhammad Raheel Raza (Department of Software Engineering, College of Technology, Firat University, 23000 Elazig, Turkey)
Muhammad Javed Ramzan (Department of Computer Science, COMSATS University Islamabad, Islamabad 45550, Pakistan)
Muhammad Junaid Ali (Department of Computer Science, COMSATS University Islamabad, Islamabad 45550, Pakistan)
Muhammad Haris (Department of Computer Science, COMSATS University Islamabad, Islamabad 45550, Pakistan)
https://www.mdpi.com/2571-9394/3/3/33
image text recognition; deep learning; recurrent neural networks (RNNs); convolutional neural networks (CNNs); bidirectional RNN; attention mechanism
title Attention-Based CNN-RNN Arabic Text Recognition from Natural Scene Images
topic image text recognition
deep learning
recurrent neural networks (RNNs)
convolutional neural networks (CNNs)
bidirectional RNN
attention mechanism
url https://www.mdpi.com/2571-9394/3/3/33