Detection and rectification of arbitrary shaped scene texts by using text keypoints and links

Detection and recognition of scene texts of arbitrary shapes remain a grand challenge due to the super-rich text shape variation in text line orientations, lengths, curvatures, etc. This paper presents a mask-guided multi-task network that detects and rectifies scene texts of arbitrary shapes reliab...

Bibliographic Details
Main Authors: Xue, Chuhui, Lu, Shijian, Hoi, Steven
Other Authors: School of Computer Science and Engineering
Format: Journal Article
Language: English
Published: 2022
Subjects: Engineering::Computer science and engineering; Scene Text Recognition; Deep Learning
Online Access: https://hdl.handle.net/10356/161795
_version_ 1811680889523929088
author Xue, Chuhui
Lu, Shijian
Hoi, Steven
author2 School of Computer Science and Engineering
author_facet School of Computer Science and Engineering
Xue, Chuhui
Lu, Shijian
Hoi, Steven
author_sort Xue, Chuhui
collection NTU
description Detection and recognition of scene texts of arbitrary shapes remain a grand challenge due to the super-rich text shape variation in text line orientations, lengths, curvatures, etc. This paper presents a mask-guided multi-task network that detects and rectifies scene texts of arbitrary shapes reliably. Three types of keypoints are detected which specify the centre line and so the shape of text instances accurately. In addition, four types of keypoint links are detected of which the horizontal links associate the detected keypoints of each text instance and the vertical links predict a pair of landmark points (for each keypoint) along the upper and lower text boundary, respectively. Scene texts can be located and rectified by linking up the associated landmark points (giving localization polygon boxes) and transforming the polygon boxes via thin plate spline, respectively. Extensive experiments over several public datasets show that the use of text keypoints is tolerant to the variation in text orientations, lengths, and curvatures, and it achieves competitive scene text detection and rectification performance as compared with state-of-the-art methods.
first_indexed 2024-10-01T03:32:13Z
format Journal Article
id ntu-10356/161795
institution Nanyang Technological University
language English
last_indexed 2024-10-01T03:32:13Z
publishDate 2022
record_format dspace
spelling ntu-10356/1617952022-09-20T05:36:48Z Detection and rectification of arbitrary shaped scene texts by using text keypoints and links Xue, Chuhui Lu, Shijian Hoi, Steven School of Computer Science and Engineering Engineering::Computer science and engineering Scene Text Recognition Deep Learning Detection and recognition of scene texts of arbitrary shapes remain a grand challenge due to the super-rich text shape variation in text line orientations, lengths, curvatures, etc. This paper presents a mask-guided multi-task network that detects and rectifies scene texts of arbitrary shapes reliably. Three types of keypoints are detected which specify the centre line and so the shape of text instances accurately. In addition, four types of keypoint links are detected of which the horizontal links associate the detected keypoints of each text instance and the vertical links predict a pair of landmark points (for each keypoint) along the upper and lower text boundary, respectively. Scene texts can be located and rectified by linking up the associated landmark points (giving localization polygon boxes) and transforming the polygon boxes via thin plate spline, respectively. Extensive experiments over several public datasets show that the use of text keypoints is tolerant to the variation in text orientations, lengths, and curvatures, and it achieves competitive scene text detection and rectification performance as compared with state-of-the-art methods. 2022-09-20T05:36:48Z 2022-09-20T05:36:48Z 2022 Journal Article Xue, C., Lu, S. & Hoi, S. (2022). Detection and rectification of arbitrary shaped scene texts by using text keypoints and links. Pattern Recognition, 124, 108494-. https://dx.doi.org/10.1016/j.patcog.2021.108494 0031-3203 https://hdl.handle.net/10356/161795 10.1016/j.patcog.2021.108494 2-s2.0-85122476297 124 108494 en Pattern Recognition © 2021 Elsevier Ltd. All rights reserved.
spellingShingle Engineering::Computer science and engineering
Scene Text Recognition
Deep Learning
Xue, Chuhui
Lu, Shijian
Hoi, Steven
Detection and rectification of arbitrary shaped scene texts by using text keypoints and links
title Detection and rectification of arbitrary shaped scene texts by using text keypoints and links
title_full Detection and rectification of arbitrary shaped scene texts by using text keypoints and links
title_fullStr Detection and rectification of arbitrary shaped scene texts by using text keypoints and links
title_full_unstemmed Detection and rectification of arbitrary shaped scene texts by using text keypoints and links
title_short Detection and rectification of arbitrary shaped scene texts by using text keypoints and links
title_sort detection and rectification of arbitrary shaped scene texts by using text keypoints and links
topic Engineering::Computer science and engineering
Scene Text Recognition
Deep Learning
url https://hdl.handle.net/10356/161795
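
The description in this record says that text is localized by linking up the landmark points predicted along the upper and lower text boundaries, giving a polygon box. The sketch below illustrates only that linking step, under stated assumptions: the landmark pairs are taken as already detected and ordered along the centre line, and the names `polygon_from_landmarks` and `polygon_area` are hypothetical helpers, not part of the paper's code. The keypoint and link detection network and the thin plate spline rectification are not reproduced here.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_from_landmarks(upper: List[Point], lower: List[Point]) -> List[Point]:
    """Link matched upper/lower landmark points into one polygon outline.

    Assumption: upper[i] and lower[i] belong to the same keypoint, and both
    lists are ordered along the text centre line.
    """
    if len(upper) != len(lower) or len(upper) < 2:
        raise ValueError("need at least two matched landmark pairs")
    # Walk along the upper boundary, then back along the lower boundary,
    # so consecutive vertices are neighbours on the text outline.
    return list(upper) + list(reversed(lower))


def polygon_area(polygon: List[Point]) -> float:
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


if __name__ == "__main__":
    # Toy landmarks for a straight, horizontal text line of height 1.
    upper = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    lower = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
    box = polygon_from_landmarks(upper, lower)
    print(box)
    print(polygon_area(box))
```

For curved text the same ordering applies: the landmark pairs follow the curve, so the polygon hugs the text outline instead of an axis-aligned box. The matched pairs could then serve as control points for a thin plate spline warp, which is the rectification step the description mentions.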