Scene text recognition by learning co‐occurrence of strokes based on spatiality embedded dictionary

Bibliographic Details
Main Authors: Song Gao, Chunheng Wang, Baihua Xiao, Cunzhao Shi, Wen Zhou, Zhong Zhang
Format: Article
Language: English
Published: Wiley 2015-02-01
Series: IET Computer Vision
Online Access: https://doi.org/10.1049/iet-cvi.2014.0022
Description
Summary: Text information contained in scene images is very helpful for high‐level image understanding. In this study, the authors propose to learn the co‐occurrence of local strokes for scene text recognition by using a spatiality embedded dictionary (SED). Unlike the spatial pyramid, which partitions images into grids to incorporate spatial information, the authors' SED associates every codeword with a particular response region and introduces more precise spatial information for robust character recognition. After localised soft coding and max pooling in the first layer, a sparse dictionary is learned to model the co‐occurrence of several local strokes, which further improves classification performance. Experimental results on two scene character recognition datasets, ICDAR2003 and Chars74K, demonstrate that their character recognition method outperforms state‐of‐the‐art methods. In addition, competitive word recognition results are reported for four benchmark word recognition datasets, ICDAR2003, ICDAR2011, ICDAR2013 and Street View Text, when their character recognition method is combined with a conditional random field language model.
ISSN: 1751-9632, 1751-9640
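
Illustrative note: the summary mentions "localised soft coding and max pooling" in the first layer. The sketch below shows one common form of that step, assuming NumPy, a Gaussian-style weighting over the k nearest codewords, and image-level max pooling. The function names and the parameters `k` and `beta` are hypothetical and are not taken from the article. The response regions that the SED attaches to each codeword are not modelled here.

```python
import numpy as np


def localized_soft_coding(descriptors, codebook, k=5, beta=10.0):
    """Encode each descriptor using only its k nearest codewords.

    descriptors: array of shape (n_descriptors, dim)
    codebook:    array of shape (n_codewords, dim)
    Returns an array of shape (n_descriptors, n_codewords) whose rows sum to 1.
    k and beta are illustrative defaults, not values from the article.
    """
    descriptors = np.asarray(descriptors, dtype=float)
    codebook = np.asarray(codebook, dtype=float)

    # Squared Euclidean distance between every descriptor and every codeword.
    diff = descriptors[:, None, :] - codebook[None, :, :]
    d2 = (diff ** 2).sum(axis=2)

    # Indices of the k nearest codewords for each descriptor.
    k = min(k, codebook.shape[0])
    nearest = np.argsort(d2, axis=1)[:, :k]
    rows = np.arange(descriptors.shape[0])[:, None]
    local_d2 = d2[rows, nearest]

    # Soft weights over the local neighbourhood only; subtracting the row
    # minimum keeps the exponent stable without changing the normalised result.
    weights = np.exp(-beta * (local_d2 - local_d2.min(axis=1, keepdims=True)))
    weights /= weights.sum(axis=1, keepdims=True)

    codes = np.zeros_like(d2)
    codes[rows, nearest] = weights
    return codes


def max_pool(codes):
    """Keep the strongest response of each codeword over all descriptors."""
    return codes.max(axis=0)
```

Calling `max_pool(localized_soft_coding(descriptors, codebook))` yields one vector with a single entry per codeword. In the pipeline the summary describes, a representation of this kind would feed the second-layer sparse dictionary, which is outside the scope of this sketch.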