Towards Accurate Scene Text Detection with Bidirectional Feature Pyramid Network

Scene text detection, the task of detecting text in real images, is an active research topic in the machine vision community. Most current methods are anchor-based, which makes them complex to design and time-consuming to train. In this paper, we propose a new Fully Convolutio...

Bibliographic Details
Main Authors: Dongping Cao, Jiachen Dang, Yong Zhong
Format: Article
Language: English
Published: MDPI AG 2021-03-01
Series: Symmetry
Subjects: scene text detection; multioriented text; convolutional neural networks
Online Access: https://www.mdpi.com/2073-8994/13/3/486
author Dongping Cao
Jiachen Dang
Yong Zhong
collection DOAJ
description Scene text detection, the task of detecting text in real images, is an active research topic in the machine vision community. Most current methods are anchor-based, which makes them complex to design and time-consuming to train. In this paper, we propose a new text detection method based on Fully Convolutional One-Stage Object Detection (FCOS) that robustly detects multioriented and multilingual text in natural scene images using per-pixel prediction. Unlike most state-of-the-art text detectors, our detector is anchor-free and does not rely on predefined anchor boxes. To enhance the feature representation ability of FCOS for text detection, we apply the Bidirectional Feature Pyramid Network (BiFPN) as the backbone network, which increases the model's learning capacity and receptive field. We evaluate our method on multioriented (ICDAR-2015, ICDAR-2017 MLT) and horizontal (ICDAR-2013) text detection benchmarks, where it achieves f-measures of 88.65 on ICDAR-2013, 86.32 on ICDAR-2015, and 80.75 on ICDAR-2017 MLT.
first_indexed 2024-03-10T13:10:59Z
format Article
id doaj.art-99bfa921bf2140a6b5e7018d04170bd5
institution Directory Open Access Journal
issn 2073-8994
language English
last_indexed 2024-03-10T13:10:59Z
publishDate 2021-03-01
publisher MDPI AG
record_format Article
series Symmetry
spelling doaj.art-99bfa921bf2140a6b5e7018d04170bd5 | 2023-11-21T10:46:14Z | eng | MDPI AG | Symmetry | 2073-8994 | 2021-03-01 | vol. 13, no. 3, art. 486 | 10.3390/sym13030486 | Towards Accurate Scene Text Detection with Bidirectional Feature Pyramid Network | Dongping Cao, Jiachen Dang, Yong Zhong (all: Chengdu Institute of Computer Applications, Chinese Academy of Sciences, Chengdu 610041, China) | https://www.mdpi.com/2073-8994/13/3/486 | scene text detection; multioriented text; convolutional neural networks
title Towards Accurate Scene Text Detection with Bidirectional Feature Pyramid Network
topic scene text detection
multioriented text
convolutional neural networks
url https://www.mdpi.com/2073-8994/13/3/486