Language matters: a weakly supervised vision-language pre-training approach for scene text detection and spotting
Recently, Vision-Language Pre-training (VLP) techniques have greatly benefited various vision-language tasks by jointly learning visual and textual representations, which intuitively helps in Optical Character Recognition (OCR) tasks due to the rich visual and textual information in scene text image...
Main Authors: | Xue, C, Hao, Y, Lu, S, Torr, P, Bai, S |
---|---|
Format: | Conference item |
Language: | English |
Published: | Springer, 2022 |
Similar Items
- Migratable urban street scene sensing method based on vision language pre-trained model
  by: Yan Zhang, et al.
  Published: (2022-09-01)
- Benchmarking robustness of adaptation methods on pre-trained vision-language models
  by: Chen, S, et al.
  Published: (2024)
- TLWSR: Weakly supervised real‐world scene text image super‐resolution using text label
  by: Qin Shi, et al.
  Published: (2023-07-01)
- RS-CLIP: Zero shot remote sensing scene classification via contrastive vision-language supervision
  by: Xiang Li, et al.
  Published: (2023-11-01)
- Text data augmentation and pre-trained Language Model for enhancing text classification of low-resource languages
  by: Atabay Ziyaden, et al.
  Published: (2024-03-01)