Towards Accurate Scene Text Detection with Bidirectional Feature Pyramid Network
Scene text detection, the task of detecting text in real-world images, is an active research topic in the machine vision community. Most current methods are anchor-based; they are complex in model design and time-consuming to train. In this paper, we propose a new Fully Convolutio...
Main Authors: | Dongping Cao, Jiachen Dang, Yong Zhong |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG 2021-03-01 |
Series: | Symmetry |
Subjects: | scene text detection, multioriented text, convolutional neural networks |
Online Access: | https://www.mdpi.com/2073-8994/13/3/486 |
_version_ | 1797541129048031232 |
---|---|
author | Dongping Cao Jiachen Dang Yong Zhong |
author_facet | Dongping Cao Jiachen Dang Yong Zhong |
author_sort | Dongping Cao |
collection | DOAJ |
description | Scene text detection, the task of detecting text in real-world images, is an active research topic in the machine vision community. Most current methods are anchor-based; they are complex in model design and time-consuming to train. In this paper, we propose a new Fully Convolutional One-Stage Object Detection (FCOS)-based text detection method that robustly detects multioriented and multilingual text in natural scene images using per-pixel prediction. Unlike state-of-the-art text detectors that rely on predefined anchor boxes, our detector is anchor-free. To enhance the feature representation ability of FCOS for text detection, we apply the Bidirectional Feature Pyramid Network (BiFPN) as the backbone network, which enhances the model's learning capacity and enlarges the receptive field. We demonstrate the superior performance of our method on multioriented (ICDAR-2015, ICDAR-2017 MLT) and horizontal (ICDAR-2013) text detection benchmarks: it achieves an F-measure of 88.65 on ICDAR 2013, 86.32 on ICDAR 2015, and 80.75 on ICDAR-2017 MLT. |
first_indexed | 2024-03-10T13:10:59Z |
format | Article |
id | doaj.art-99bfa921bf2140a6b5e7018d04170bd5 |
institution | Directory of Open Access Journals |
issn | 2073-8994 |
language | English |
last_indexed | 2024-03-10T13:10:59Z |
publishDate | 2021-03-01 |
publisher | MDPI AG |
record_format | Article |
series | Symmetry |
spelling | doaj.art-99bfa921bf2140a6b5e7018d04170bd5; 2023-11-21T10:46:14Z; eng; MDPI AG; Symmetry; 2073-8994; 2021-03-01; vol. 13, no. 3, art. 486; 10.3390/sym13030486; Towards Accurate Scene Text Detection with Bidirectional Feature Pyramid Network; Dongping Cao (0), Jiachen Dang (1), Yong Zhong (2); (0, 1, 2) Chengdu Institute of Computer Applications, Chinese Academy of Sciences, Chengdu 610041, China; Scene text detection, the task of detecting text in real-world images, is an active research topic in the machine vision community. Most current methods are anchor-based; they are complex in model design and time-consuming to train. In this paper, we propose a new Fully Convolutional One-Stage Object Detection (FCOS)-based text detection method that robustly detects multioriented and multilingual text in natural scene images using per-pixel prediction. Unlike state-of-the-art text detectors that rely on predefined anchor boxes, our detector is anchor-free. To enhance the feature representation ability of FCOS for text detection, we apply the Bidirectional Feature Pyramid Network (BiFPN) as the backbone network, which enhances the model's learning capacity and enlarges the receptive field. We demonstrate the superior performance of our method on multioriented (ICDAR-2015, ICDAR-2017 MLT) and horizontal (ICDAR-2013) text detection benchmarks: it achieves an F-measure of 88.65 on ICDAR 2013, 86.32 on ICDAR 2015, and 80.75 on ICDAR-2017 MLT.; https://www.mdpi.com/2073-8994/13/3/486; scene text detection; multioriented text; convolutional neural networks |
spellingShingle | Dongping Cao Jiachen Dang Yong Zhong Towards Accurate Scene Text Detection with Bidirectional Feature Pyramid Network Symmetry scene text detection multioriented text convolutional neural networks |
title | Towards Accurate Scene Text Detection with Bidirectional Feature Pyramid Network |
title_full | Towards Accurate Scene Text Detection with Bidirectional Feature Pyramid Network |
title_fullStr | Towards Accurate Scene Text Detection with Bidirectional Feature Pyramid Network |
title_full_unstemmed | Towards Accurate Scene Text Detection with Bidirectional Feature Pyramid Network |
title_short | Towards Accurate Scene Text Detection with Bidirectional Feature Pyramid Network |
title_sort | towards accurate scene text detection with bidirectional feature pyramid network |
topic | scene text detection multioriented text convolutional neural networks |
url | https://www.mdpi.com/2073-8994/13/3/486 |
work_keys_str_mv | AT dongpingcao towardsaccuratescenetextdetectionwithbidirectionalfeaturepyramidnetwork AT jiachendang towardsaccuratescenetextdetectionwithbidirectionalfeaturepyramidnetwork AT yongzhong towardsaccuratescenetextdetectionwithbidirectionalfeaturepyramidnetwork |