TextDC: Exploring Multidimensional Text Detection via a New Benchmark and Solution
Text detection has been significantly boosted by the development of deep neural networks, but most existing methods focus on a single kind of text instance (i.e., overlaid text, layered text, or scene text). In this paper, we expand the text detection task from a single dimension to multiple dimensions,...
Main Authors: | Fenfen Zhou, Yuanqiang Cai, Yingjie Tian |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-12-01 |
Series: | Electronics |
Subjects: | text detection and classification; multiple types; Text3C dataset |
Online Access: | https://www.mdpi.com/2079-9292/12/1/159 |
_version_ | 1797625959047757824 |
---|---|
author | Fenfen Zhou; Yuanqiang Cai; Yingjie Tian |
author_facet | Fenfen Zhou; Yuanqiang Cai; Yingjie Tian |
author_sort | Fenfen Zhou |
collection | DOAJ |
description | Text detection has been significantly boosted by the development of deep neural networks, but most existing methods focus on a single kind of text instance (i.e., overlaid text, layered text, or scene text). In this paper, we expand the text detection task from a single dimension to multiple dimensions, thus providing multi-type text descriptions for the scene and content analysis of videos. Specifically, we establish a new task, termed TextDC, to detect and classify text instances simultaneously. As far as we know, existing benchmarks cannot meet the requirements of the proposed task. To this end, we collect a large-scale text detection and classification dataset, named Text3C, which is annotated with multilingual labels, location information, and text categories. Together with the collected dataset, we introduce a strict multi-stage evaluation metric that penalizes detection approaches for missed text instances, false positive detections, inaccurate location boxes, and incorrect text categories, establishing a new benchmark for the proposed TextDC task. In addition, we extend several state-of-the-art detectors by modifying the prediction head to solve the new task. Then, a generalized text detection and classification framework is designed and formulated. Extensive experiments using the updated methods are conducted on the established benchmark to verify the solvability of the proposed task, the challenges of the dataset, and the effectiveness of the solution. |
first_indexed | 2024-03-11T10:03:51Z |
format | Article |
id | doaj.art-5b7f8a5692cc4a589fb71238ca22ac51 |
institution | Directory of Open Access Journals |
issn | 2079-9292 |
language | English |
last_indexed | 2024-03-11T10:03:51Z |
publishDate | 2022-12-01 |
publisher | MDPI AG |
record_format | Article |
series | Electronics |
spelling | doaj.art-5b7f8a5692cc4a589fb71238ca22ac51; 2023-11-16T15:11:55Z; eng; MDPI AG; Electronics; 2079-9292; 2022-12-01; 12; 1; 159; 10.3390/electronics12010159; TextDC: Exploring Multidimensional Text Detection via a New Benchmark and Solution; Fenfen Zhou 0; Yuanqiang Cai 1; Yingjie Tian 2; School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100190, China; School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China; School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China; Text detection has been significantly boosted by the development of deep neural networks, but most existing methods focus on a single kind of text instance (i.e., overlaid text, layered text, or scene text). In this paper, we expand the text detection task from a single dimension to multiple dimensions, thus providing multi-type text descriptions for the scene and content analysis of videos. Specifically, we establish a new task, termed TextDC, to detect and classify text instances simultaneously. As far as we know, existing benchmarks cannot meet the requirements of the proposed task. To this end, we collect a large-scale text detection and classification dataset, named Text3C, which is annotated with multilingual labels, location information, and text categories. Together with the collected dataset, we introduce a strict multi-stage evaluation metric that penalizes detection approaches for missed text instances, false positive detections, inaccurate location boxes, and incorrect text categories, establishing a new benchmark for the proposed TextDC task. In addition, we extend several state-of-the-art detectors by modifying the prediction head to solve the new task. Then, a generalized text detection and classification framework is designed and formulated. Extensive experiments using the updated methods are conducted on the established benchmark to verify the solvability of the proposed task, the challenges of the dataset, and the effectiveness of the solution.; https://www.mdpi.com/2079-9292/12/1/159; text detection and classification; multiple types; Text3C dataset |
spellingShingle | Fenfen Zhou; Yuanqiang Cai; Yingjie Tian; TextDC: Exploring Multidimensional Text Detection via a New Benchmark and Solution; Electronics; text detection and classification; multiple types; Text3C dataset |
title | TextDC: Exploring Multidimensional Text Detection via a New Benchmark and Solution |
title_full | TextDC: Exploring Multidimensional Text Detection via a New Benchmark and Solution |
title_fullStr | TextDC: Exploring Multidimensional Text Detection via a New Benchmark and Solution |
title_full_unstemmed | TextDC: Exploring Multidimensional Text Detection via a New Benchmark and Solution |
title_short | TextDC: Exploring Multidimensional Text Detection via a New Benchmark and Solution |
title_sort | textdc exploring multidimensional text detection via a new benchmark and solution |
topic | text detection and classification; multiple types; Text3C dataset |
url | https://www.mdpi.com/2079-9292/12/1/159 |
work_keys_str_mv | AT fenfenzhou textdcexploringmultidimensionaltextdetectionviaanewbenchmarkandsolution AT yuanqiangcai textdcexploringmultidimensionaltextdetectionviaanewbenchmarkandsolution AT yingjietian textdcexploringmultidimensionaltextdetectionviaanewbenchmarkandsolution |