Multi‐dimensional long short‐term memory networks for artificial Arabic text recognition in news video
This study presents a novel approach for Arabic video text recognition based on recurrent neural networks. In fact, embedded texts in videos represent a rich source of information for indexing and automatically annotating multimedia documents. However, video text recognition is a non‐trivial task du...
Main Authors: | Oussama Zayene, Sameh Masmoudi Touj, Jean Hennebert, Rolf Ingold, Najoua Essoukri Ben Amara |
---|---|
Format: | Article |
Language: | English |
Published: | Wiley, 2018-08-01 |
Series: | IET Computer Vision |
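Subjects: | multidimensional long short-term memory networks; artificial Arabic video text recognition; news video; recurrent neural networks; embedded texts; multimedia document annotation |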
Online Access: | https://doi.org/10.1049/iet-cvi.2017.0468 |
_version_ | 1797684822066331648 |
---|---|
author | Oussama Zayene; Sameh Masmoudi Touj; Jean Hennebert; Rolf Ingold; Najoua Essoukri Ben Amara |
author_facet | Oussama Zayene; Sameh Masmoudi Touj; Jean Hennebert; Rolf Ingold; Najoua Essoukri Ben Amara |
author_sort | Oussama Zayene |
collection | DOAJ |
description | This study presents a novel approach for Arabic video text recognition based on recurrent neural networks. In fact, embedded texts in videos represent a rich source of information for indexing and automatically annotating multimedia documents. However, video text recognition is a non‐trivial task due to many challenges like the variability of text patterns and the complexity of backgrounds. In the case of Arabic, the presence of diacritic marks, the cursive nature of the script and the non‐uniform intra/inter word distances may introduce many additional challenges. The proposed system presents a segmentation‐free method that relies specifically on a multi‐dimensional long short‐term memory coupled with a connectionist temporal classification layer. It is shown that using an efficient pre‐processing step and a compact representation of Arabic character models brings robust performance and yields a lower error rate than other recently published methods. The authors’ system is trained and evaluated using the public AcTiV‐R dataset under different evaluation protocols. The obtained results are very interesting. They also outperform current state‐of‐the‐art approaches on the public dataset ALIF in terms of recognition rates at both character and line levels. |
first_indexed | 2024-03-12T00:35:17Z |
format | Article |
id | doaj.art-9015363f8abf4b7f86c6536de31da16e |
institution | Directory Open Access Journal |
issn | 1751-9632 1751-9640 |
language | English |
last_indexed | 2024-03-12T00:35:17Z |
publishDate | 2018-08-01 |
publisher | Wiley |
record_format | Article |
series | IET Computer Vision |
spelling | doaj.art-9015363f8abf4b7f86c6536de31da16e; 2023-09-15T09:48:11Z; eng; Wiley; IET Computer Vision; 1751-9632; 1751-9640; 2018-08-01; vol. 12, no. 5, pp. 710-719; 10.1049/iet-cvi.2017.0468; Multi‐dimensional long short‐term memory networks for artificial Arabic text recognition in news video; Oussama Zayene (LATIS Laboratory, National Engineering School of Sousse, University of Sousse, Tunisia); Sameh Masmoudi Touj (LATIS Laboratory, National Engineering School of Sousse, University of Sousse, Tunisia); Jean Hennebert (Institute of Complex Systems, HES‐SO, University of Applied Science Western Switzerland, Switzerland); Rolf Ingold (DIVA Group, University of Fribourg, Switzerland); Najoua Essoukri Ben Amara (LATIS Laboratory, National Engineering School of Sousse, University of Sousse, Tunisia); This study presents a novel approach for Arabic video text recognition based on recurrent neural networks. In fact, embedded texts in videos represent a rich source of information for indexing and automatically annotating multimedia documents. However, video text recognition is a non‐trivial task due to many challenges like the variability of text patterns and the complexity of backgrounds. In the case of Arabic, the presence of diacritic marks, the cursive nature of the script and the non‐uniform intra/inter word distances may introduce many additional challenges. The proposed system presents a segmentation‐free method that relies specifically on a multi‐dimensional long short‐term memory coupled with a connectionist temporal classification layer. It is shown that using an efficient pre‐processing step and a compact representation of Arabic character models brings robust performance and yields a lower error rate than other recently published methods. The authors’ system is trained and evaluated using the public AcTiV‐R dataset under different evaluation protocols. The obtained results are very interesting. They also outperform current state‐of‐the‐art approaches on the public dataset ALIF in terms of recognition rates at both character and line levels.; https://doi.org/10.1049/iet-cvi.2017.0468; multidimensional long short-term memory networks; artificial Arabic video text recognition; news video; recurrent neural networks; embedded texts; multimedia document annotation |
spellingShingle | Oussama Zayene; Sameh Masmoudi Touj; Jean Hennebert; Rolf Ingold; Najoua Essoukri Ben Amara; Multi‐dimensional long short‐term memory networks for artificial Arabic text recognition in news video; IET Computer Vision; multidimensional long short-term memory networks; artificial Arabic video text recognition; news video; recurrent neural networks; embedded texts; multimedia document annotation |
title | Multi‐dimensional long short‐term memory networks for artificial Arabic text recognition in news video |
title_full | Multi‐dimensional long short‐term memory networks for artificial Arabic text recognition in news video |
title_fullStr | Multi‐dimensional long short‐term memory networks for artificial Arabic text recognition in news video |
title_full_unstemmed | Multi‐dimensional long short‐term memory networks for artificial Arabic text recognition in news video |
title_short | Multi‐dimensional long short‐term memory networks for artificial Arabic text recognition in news video |
title_sort | multi dimensional long short term memory networks for artificial arabic text recognition in news video |
topic | multidimensional long short-term memory networks; artificial Arabic video text recognition; news video; recurrent neural networks; embedded texts; multimedia document annotation |
url | https://doi.org/10.1049/iet-cvi.2017.0468 |
work_keys_str_mv | AT oussamazayene multidimensionallongshorttermmemorynetworksforartificialarabictextrecognitioninnewsvideo AT samehmasmouditouj multidimensionallongshorttermmemorynetworksforartificialarabictextrecognitioninnewsvideo AT jeanhennebert multidimensionallongshorttermmemorynetworksforartificialarabictextrecognitioninnewsvideo AT rolfingold multidimensionallongshorttermmemorynetworksforartificialarabictextrecognitioninnewsvideo AT najouaessoukribenamara multidimensionallongshorttermmemorynetworksforartificialarabictextrecognitioninnewsvideo |