A Fast Scene Text Detector Using Knowledge Distillation

Bibliographic Details
Main Authors: Peng Yang, Fanlong Zhang, Guowei Yang
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/8626192/
Description
Summary: Incidental scene text detection is a challenging problem because of the arbitrary orientations, low resolution, perspective distortion, and varying aspect ratios of text in natural images. In this paper, we present an end-to-end trainable deep model that can effectively and efficiently locate multi-oriented scene text. Our detector comprises a student network and a teacher network, which inherit the complex VGGNet and the lightweight PVANet architectures, respectively. When training the detector, the teacher network guides the student via knowledge distillation so as to balance accuracy and efficiency. We evaluated the proposed detector on three popular benchmarks; it achieves F-measures of 83.7%, 57.27%, and 90% on ICDAR2015 Incidental Scene Text, COCO-Text, and ICDAR2013, respectively, outperforming most state-of-the-art methods.
ISSN: 2169-3536
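
The abstract above names the two backbones and the teacher–student distillation idea but gives no loss formulation or training details. The following is a minimal sketch of score-map distillation under stated assumptions: an EAST-style per-pixel text score map, an MSE soft-target term against the teacher's predictions, and tiny stand-in convolutional networks in place of the VGGNet and PVANet backbones. `TinyDetector`, `distillation_step`, and the `alpha` weighting are hypothetical names introduced here for illustration, not the paper's API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyDetector(nn.Module):
    """Toy fully convolutional detector standing in for the paper's backbones
    (assumption: the real model predicts a per-pixel text/non-text score map)."""

    def __init__(self, width):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, width, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(width, width, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.score_head = nn.Conv2d(width, 1, 1)  # text confidence per location

    def forward(self, x):
        return torch.sigmoid(self.score_head(self.features(x)))


def distillation_step(student, teacher, images, gt_score, optimizer, alpha=0.5):
    """One training step: supervised detection loss on ground-truth score maps
    plus a soft-target term pulling the student toward the teacher's output."""
    teacher.eval()
    with torch.no_grad():
        t_score = teacher(images)                            # teacher targets, fixed
    s_score = student(images)                                # student predictions
    hard_loss = F.binary_cross_entropy(s_score, gt_score)    # ground-truth term
    soft_loss = F.mse_loss(s_score, t_score)                 # distillation term (assumed MSE)
    loss = (1 - alpha) * hard_loss + alpha * soft_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    teacher = TinyDetector(width=64)   # assumed pre-trained and frozen
    student = TinyDetector(width=16)   # lightweight network to be trained
    opt = torch.optim.Adam(student.parameters(), lr=1e-3)
    images = torch.rand(2, 3, 128, 128)
    gt_score = torch.randint(0, 2, (2, 1, 64, 64)).float()   # dummy text masks
    print("loss:", distillation_step(student, teacher, images, gt_score, opt))
```

In this sketch the teacher is assumed to be trained beforehand and kept frozen while only the student's parameters are updated; which backbone plays which role, and the exact distillation targets and loss weights, should be taken from the paper itself rather than from this illustration.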