STELA: A Real-Time Scene Text Detector With Learned Anchor
To achieve high coverage of target boxes, a normal strategy of conventional one-stage anchor-based detectors is to utilize multiple priors at each spatial position, especially in scene text detection tasks. In this work, we present a simple and intuitive method for multi-oriented text detection wher...
Main Authors: | Linjie Deng, Yanxiang Gong, Xinchen Lu, Yi Lin, Zheng Ma, Mei Xie |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2019-01-01 |
Series: | IEEE Access |
Subjects: | Scene text detection, real-time detector, learned anchor |
Online Access: | https://ieeexplore.ieee.org/document/8877714/ |
_version_ | 1818349432054218752 |
---|---|
author | Linjie Deng Yanxiang Gong Xinchen Lu Yi Lin Zheng Ma Mei Xie |
author_sort | Linjie Deng |
collection | DOAJ |
description | To achieve high coverage of target boxes, conventional one-stage anchor-based detectors commonly place multiple priors at each spatial position, especially in scene text detection tasks. In this work, we present a simple and intuitive method for multi-oriented text detection in which each location of the feature maps is associated with only one reference box. The idea is inspired by the two-stage R-CNN framework, which can estimate the location of objects of any shape by using learned proposals. Our method integrates this mechanism into a one-stage detector: a learned anchor, obtained through a regression operation, replaces the original anchor in the final predictions. Based on RetinaNet, our method achieves competitive performance on several public benchmarks at real-time speed (26.5 fps at 800p), which surpasses all anchor-based scene text detectors. In addition, because it requires less attention to anchor design, we believe our method can be easily applied to other analogous detection tasks. The code is publicly available at https://github.com/xhzdeng/stela. |
first_indexed | 2024-12-13T18:05:51Z |
format | Article |
id | doaj.art-e5b9626a63904fb38437ca428b4488a4 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-13T18:05:51Z |
publishDate | 2019-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-e5b9626a63904fb38437ca428b4488a4; 2022-12-21T23:36:04Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2019-01-01; vol. 7, pp. 153400-153407; doi:10.1109/ACCESS.2019.2948405; article 8877714; STELA: A Real-Time Scene Text Detector With Learned Anchor; Linjie Deng [0] (https://orcid.org/0000-0002-4921-8639), Yanxiang Gong [1], Xinchen Lu [2], Yi Lin [3] (https://orcid.org/0000-0002-7194-5023), Zheng Ma [4], Mei Xie [5]; [0], [1], [2], [4], [5] School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, China; [3] National Key Laboratory of Fundamental Science on Synthetic Vision, Sichuan University, Chengdu, China; https://ieeexplore.ieee.org/document/8877714/; Scene text detection; real-time detector; learned anchor |
title | STELA: A Real-Time Scene Text Detector With Learned Anchor |
topic | Scene text detection real-time detector learned anchor |
url | https://ieeexplore.ieee.org/document/8877714/ |
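
The description above says that a learned anchor, obtained through a regression operation, replaces the original anchor in the final predictions. The following is a minimal sketch of that two-step idea, not the authors' implementation: it assumes axis-aligned boxes (the paper targets multi-oriented text), the common R-CNN style offset parametrization, and hypothetical names (`Box`, `decode`, `predict`) that do not come from the record or the linked repository.

```python
import math
from typing import NamedTuple, Tuple


class Box(NamedTuple):
    """Axis-aligned box given by center, width, height (rotation omitted; an assumption)."""
    cx: float
    cy: float
    w: float
    h: float


Deltas = Tuple[float, float, float, float]


def decode(ref: Box, deltas: Deltas) -> Box:
    """Apply R-CNN style offsets (dx, dy, dw, dh) to a reference box."""
    dx, dy, dw, dh = deltas
    return Box(
        cx=ref.cx + dx * ref.w,   # shift center, scaled by reference width
        cy=ref.cy + dy * ref.h,   # shift center, scaled by reference height
        w=ref.w * math.exp(dw),   # log-space width scaling
        h=ref.h * math.exp(dh),   # log-space height scaling
    )


def predict(prior: Box, anchor_deltas: Deltas, box_deltas: Deltas) -> Tuple[Box, Box]:
    """Regress a learned anchor from the single prior, then regress the final box from it."""
    learned_anchor = decode(prior, anchor_deltas)
    final_box = decode(learned_anchor, box_deltas)
    return learned_anchor, final_box


# One square prior per location; the first regression stretches it toward a wide text line.
prior = Box(cx=100.0, cy=100.0, w=32.0, h=32.0)
learned, final = predict(
    prior,
    anchor_deltas=(0.5, 0.0, math.log(4.0), 0.0),
    box_deltas=(0.0, 0.25, 0.0, 0.0),
)
print(learned)
print(final)
```

In this sketch the second set of offsets is interpreted relative to the learned anchor instead of the hand-designed prior, which is why a single prior per location can still reach elongated boxes.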