TextDC: Exploring Multidimensional Text Detection via a New Benchmark and Solution

Text detection has been significantly boosted by the development of deep neural networks, but most existing methods focus on a single kind of text instance (i.e., overlaid text, layered text, or scene text). In this paper, we expand the text detection task from a single dimension to multiple dimensions,...

Bibliographic Details
Main Authors: Fenfen Zhou, Yuanqiang Cai, Yingjie Tian
Format: Article
Language: English
Published: MDPI AG 2022-12-01
Series: Electronics
Subjects: text detection and classification; multiple types; Text3C dataset
Online Access: https://www.mdpi.com/2079-9292/12/1/159
author Fenfen Zhou
Yuanqiang Cai
Yingjie Tian
collection DOAJ
description Text detection has been significantly boosted by the development of deep neural networks, but most existing methods focus on a single kind of text instance (i.e., overlaid text, layered text, or scene text). In this paper, we expand the text detection task from a single dimension to multiple dimensions, thus providing multi-type text descriptions for the scene and content analysis of videos. Specifically, we establish a new task, termed TextDC, that detects and classifies text instances simultaneously. As far as we know, existing benchmarks cannot meet the requirements of the proposed task. To this end, we collect a large-scale text detection and classification dataset, named Text3C, which is annotated with multilingual labels, location information, and text categories. Together with the collected dataset, we introduce a strict multi-stage evaluation metric that penalizes detection approaches for missed text instances, false-positive detections, inaccurate location boxes, and incorrect text categories, thereby establishing a new benchmark for the proposed TextDC task. In addition, we extend several state-of-the-art detectors by modifying the prediction head to solve the new task. Then, a generalized text detection and classification framework is designed and formulated. Extensive experiments using the updated methods are conducted on the established benchmark to verify the solvability of the proposed task, the challenges of the dataset, and the effectiveness of the solution.
first_indexed 2024-03-11T10:03:51Z
format Article
id doaj.art-5b7f8a5692cc4a589fb71238ca22ac51
institution Directory Open Access Journal
issn 2079-9292
language English
last_indexed 2024-03-11T10:03:51Z
publishDate 2022-12-01
publisher MDPI AG
record_format Article
series Electronics
doi 10.3390/electronics12010159
volume 12
issue 1
article_number 159
affiliation Fenfen Zhou: School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100190, China
Yuanqiang Cai: School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China
Yingjie Tian: School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China
topic text detection and classification
multiple types
Text3C dataset
url https://www.mdpi.com/2079-9292/12/1/159