Language matters: a weakly supervised vision-language pre-training approach for scene text detection and spotting

Recently, Vision-Language Pre-training (VLP) techniques have greatly benefited various vision-language tasks by jointly learning visual and textual representations, which intuitively helps in Optical Character Recognition (OCR) tasks due to the rich visual and textual information in scene text image...

Bibliographic Details
Main Authors:	Xue, C, Hao, Y, Lu, S, Torr, P, Bai, S
Format:	Conference item
Language:	English
Published:	Springer 2022

Similar Items

Migratable urban street scene sensing method based on vision language pre-trained model
by: Yan Zhang, et al.
Published: (2022-09-01)

Benchmarking robustness of adaptation methods on pre-trained vision-language models
by: Chen, S, et al.
Published: (2024)

TLWSR: Weakly supervised real‐world scene text image super‐resolution using text label
by: Qin Shi, et al.
Published: (2023-07-01)

RS-CLIP: Zero shot remote sensing scene classification via contrastive vision-language supervision
by: Xiang Li, et al.
Published: (2023-11-01)

Text data augmentation and pre-trained Language Model for enhancing text classification of low-resource languages
by: Atabay Ziyaden, et al.
Published: (2024-03-01)

Mobile application on a scene text spotting
by: Chua, Kah Yong
Published: (2020)

Mobile application on a scene text spotting
by: Nguyen Doan Hoang Lam
Published: (2021)

Satellite and instrument entity recognition using a pre-trained language model with distant supervision
by: Ming Lin, et al.
Published: (2022-12-01)

Weakly-supervised fingerspelling recognition in British Sign Language videos
by: Prajwal, KR, et al.
Published: (2022)

CPT: Colorful Prompt Tuning for pre-trained vision-language models
by: Yuan Yao, et al.
Published: (2024-01-01)

Leveraging Pre-Trained Language Model for Summary Generation on Short Text
by: Shuai Zhao, et al.
Published: (2020-01-01)

Weakly- and semi-supervised panoptic segmentation
by: Li, Q, et al.
Published: (2018)

Abstractive text summarization using Pre-Trained Language Model "Text-to-Text Transfer Transformer (T5)"
by: Qurrota A’yuna Itsnaini, et al.
Published: (2023-04-01)

The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models
by: Chen, Tianlong, et al.
Published: (2022)

Crowd-supervised training of spoken language systems
by: McGraw, Ian C. (Ian Carmichael)
Published: (2012)

CommuSpotter: Scene Text Spotting with Multi-Task Communication
by: Liang Zhao, et al.
Published: (2023-11-01)

A survey on methods, datasets and implementations for scene text spotting
by: Pablo Blanco‐Medina, et al.
Published: (2022-11-01)

Improving Scene Text Recognition for Indian Languages with Transfer Learning and Font Diversity
by: Sanjana Gunna, et al.
Published: (2022-03-01)

Multimodal detection of hateful memes by applying a vision-language pre-training model
by: Yuyang Chen, et al.
Published: (2022-01-01)

Affect Analysis in Arabic Text: Further Pre-Training Language Models for Sentiment and Emotion
by: Wafa Alshehri, et al.
Published: (2023-05-01)

Improving text mining in plant health domain with GAN and/or pre-trained language model
by: Shufan Jiang, et al.
Published: (2023-02-01)

Electric Power Audit Text Classification With Multi-Grained Pre-Trained Language Model
by: Qinglin Meng, et al.
Published: (2023-01-01)

A Systematic Review of Transformer-Based Pre-Trained Language Models through Self-Supervised Learning
by: Evans Kotei, et al.
Published: (2023-03-01)

Pedestrian detection algorithm in traffic scene based on weakly supervised hierarchical deep model
by: Yingfeng Cai, et al.
Published: (2016-02-01)

Discovering class-specific pixels for weakly-supervised semantic segmentation
by: Chaudhry, A, et al.
Published: (2017)

Weakly Supervised Building Semantic Segmentation Based on Spot-Seeds and Refinement Process
by: Khaled Moghalles, et al.
Published: (2022-05-01)

Classification and localization of maize leaf spot disease based on weakly supervised learning
by: Shuai Yang, et al.
Published: (2023-05-01)

Language-aware vision transformer for referring segmentation
by: Yang, Z, et al.
Published: (2024)

Adapting vs. Pre-training Language Models for Historical Languages
by: Enrique Manjavacas, et al.
Published: (2022-06-01)

Pre-Trained Transformer-Based Models for Text Classification Using Low-Resourced Ewe Language
by: Victor Kwaku Agbesi, et al.
Published: (2023-12-01)

Investigating Prompt Learning for Chinese Few-Shot Text Classification with Pre-Trained Language Models
by: Chengyu Song, et al.
Published: (2022-11-01)

Fine-Grained Sentiment-Controlled Text Generation Approach Based on Pre-Trained Language Model
by: Linan Zhu, et al.
Published: (2022-12-01)

Scene Reconstruction Algorithm for Unstructured Weak-Texture Regions Based on Stereo Vision
by: Mingju Chen, et al.
Published: (2023-05-01)

Pre-Trained Language Models and Their Applications
by: Haifeng Wang, et al.
Published: (2023-06-01)

Training Pre-Service Language Teachers
by: Janez Skela
Published: (2004-12-01)

The English language in Indian media scene as an element of language policies
by: V V Matvienko
Published: (2012-03-01)

Weakly-supervised cross-domain road scene segmentation via multi-level curriculum adaptation
by: Lv, Fengmao, et al.
Published: (2022)

Weakly supervised skin lesion segmentation based on spot‐seeds guided optimal regions
by: Zaid Al‐Huda, et al.
Published: (2023-01-01)

Inducing high energy-latency of large vision-language models with verbose images
by: Gao, K, et al.
Published: (2024)

Language of vision
by: Kepes, Gyorgy
Published: (1959)