Improving Scene Text Recognition for Indian Languages with Transfer Learning and Font Diversity

Reading Indian scene texts is complex due to the use of regional vocabulary, multiple fonts/scripts, and text size. This work investigates the significant differences in Indian and Latin Scene Text Recognition (STR) systems. Recent STR works rely on synthetic generators that involve diverse fonts to...

Bibliographic Details
Main Authors: Sanjana Gunna, Rohit Saluja, Cheerakkuzhi Veluthemana Jawahar
Format: Article
Language: English
Published: MDPI AG 2022-03-01
Series: Journal of Imaging
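Subjects: scene text recognition; transfer learning; photo OCR; multi-lingual OCR; Indian languages; indic OCR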
Online Access: https://www.mdpi.com/2313-433X/8/4/86
_version_ 1827619638593191936
author Sanjana Gunna
Rohit Saluja
Cheerakkuzhi Veluthemana Jawahar
collection DOAJ
description Reading Indian scene texts is complex due to the use of regional vocabulary, multiple fonts/scripts, and text size. This work investigates the significant differences between Indian and Latin Scene Text Recognition (STR) systems. Recent STR works rely on synthetic generators that involve diverse fonts to ensure robust reading solutions. We propose using non-Unicode fonts in addition to the commonly employed Unicode fonts to cover font diversity in such synthesizers for Indian languages. We also perform transfer learning experiments among six different Indian languages. Our transfer learning experiments on synthetic images with common backgrounds show that Indian scripts benefit more from each other than from the extensive English datasets. In evaluations on real data, we achieve significant improvements over previous methods on four Indian languages from standard datasets such as IIIT-ILST and MLT-17, and on a new dataset (which we release) containing 440 scene images with 500 Gujarati and 2535 Tamil words. Further enriching the synthetic dataset with non-Unicode fonts and multiple augmentations yields a Word Recognition Rate gain of over 33% on the IIIT-ILST Hindi dataset. We also present the results of lexicon-based transcription approaches for all six languages.
first_indexed 2024-03-09T10:32:54Z
format Article
id doaj.art-45989a9a86664e9988278ad7dc0d8e07
institution Directory Open Access Journal
issn 2313-433X
language English
last_indexed 2024-03-09T10:32:54Z
publishDate 2022-03-01
publisher MDPI AG
record_format Article
series Journal of Imaging
spelling doaj.art-45989a9a86664e9988278ad7dc0d8e07
2023-12-01T21:07:32Z
eng
MDPI AG
Journal of Imaging
2313-433X
2022-03-01
Volume 8, Issue 4, Article 86
DOI 10.3390/jimaging8040086
Improving Scene Text Recognition for Indian Languages with Transfer Learning and Font Diversity
Sanjana Gunna, Rohit Saluja, Cheerakkuzhi Veluthemana Jawahar (all: Centre for Vision Information Technology, International Institute of Information Technology, Hyderabad 500032, India)
title Improving Scene Text Recognition for Indian Languages with Transfer Learning and Font Diversity
topic scene text recognition
transfer learning
photo OCR
multi-lingual OCR
Indian languages
indic OCR
url https://www.mdpi.com/2313-433X/8/4/86