Language-aware vision transformer for referring segmentation

Referring segmentation is a fundamental vision-language task that aims to segment out an object from an image or video in accordance with a natural language description. One of the key challenges behind this task is leveraging the referring expression for highlighting relevant positions in the image...

Bibliographic details
Main authors:	Yang, Z, Wang, J, Ye, X, Tang, Y, Chen, K, Zhao, H, Torr, PHS
Format:	Journal article
Language:	English
Published:	IEEE 2024

Similar items

LAVT: Language-Aware Vision Transformer for referring image segmentation
by: Yang, Z, et al.
Published: (2022)

Semantics-aware dynamic localization and refinement for referring image segmentation
by: Yang, Z, et al.
Published: (2023)

Vision transformers: from semantic segmentation to dense prediction
by: Zhang, L, et al.
Published: (2024)

Hierarchical interaction network for video object segmentation from referring expressions
by: Yang, Z, et al.
Published: (2021)

Behind every domain there is a shift: adapting distortion-aware vision transformers for panoramic semantic segmentation
by: Zhang, J, et al.
Published: (2024)

LUNA: language as continuing anchors for referring expression comprehension
by: Liang, Y, et al.
Published: (2023)

An empirical study of detection-based video instance segmentation
by: Wang, Q, et al.
Published: (2020)

Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers
by: Zheng, S, et al.
Published: (2021)

Geometric motion segmentation and model selection
by: Torr, PHS
Published: (1998)

Reference-aware language models
by: Yang, Z, et al.
Published: (2017)

Unifying training and inference for panoptic segmentation
by: Li, Q, et al.
Published: (2020)

Dynamic graph cuts and their applications in computer vision
by: Kohli, P, et al.
Published: (2010)

Outlier detection and motion segmentation
by: Torr, PHS, et al.
Published: (1993)

Patch-based separable transformer for visual recognition
by: Sun, S, et al.
Published: (2022)

Semantic-aware auto-encoders for self-supervised representation learning
by: Wang, G, et al.
Published: (2022)

Vision transformer with progressive sampling
by: Yue, X, et al.
Published: (2022)

Improving few-shot learning by spatially-aware matching and crosstransformer
by: Zhang, H, et al.
Published: (2023)

Concerning Bayesian motion segmentation, model averaging, matching and the trifocal tensor
by: Torr, PHS, et al.
Published: (2006)

Bottom-up Instance Segmentation using Deep Higher-Order CRFs
by: Arnab, A, et al.
Published: (2016)

An object category specific mrf for segmentation
by: Kumar, MP, et al.
Published: (2007)

Learning layered motion segmentations of video
by: Kumar, MP, et al.
Published: (2005)

On the robustness of semantic segmentation models to adversarial attacks
by: Arnab, A, et al.
Published: (2019)

SegPGD: an effective and efficient adversarial attack for evaluating and boosting segmentation robustness
by: Gu, J, et al.
Published: (2022)

Target identity-aware network flow for online multiple target tracking
by: Dehghan, A, et al.
Published: (2015)

Learning layered motion segmentations of video
by: Pawan Kumar, M, et al.
Published: (2007)

Urban 3D semantic modelling using stereo vision
by: Sengupta, S, et al.
Published: (2013)

Object-aware vision and language navigation for domestic robots
by: Zhao, Weiyi
Published: (2022)

Discovering class-specific pixels for weakly-supervised semantic segmentation
by: Chaudhry, A, et al.
Published: (2017)

OBJCUT: efficient segmentation using top-down and bottom-up cues
by: Kumar, MP, et al.
Published: (2009)

Practical Techniques for Vision-Language Segmentation Model in Remote Sensing
by: Y. Lin, et al.
Published: (2024-06-01)

Benchmarking robustness of adaptation methods on pre-trained vision-language models
by: Chen, S, et al.
Published: (2024)

Occluded video instance segmentation: A benchmark
by: Qi, J, et al.
Published: (2022)

GeoNet++: Iterative geometric neural network with edge-aware refinement for joint depth and surface normal estimation
by: Qi, X, et al.
Published: (2020)

Scalable cascade inference for semantic image segmentation
by: Sturgess, P, et al.
Published: (2012)

POSECUT: simultaneous segmentation and 3d pose estimation of humans using dynamic graph-cuts
by: Bray, M, et al.
Published: (2006)

Deep FusionNet for point cloud semantic segmentation
by: Zhang, F, et al.
Published: (2020)

Associative hierarchical CRFs for object class image segmentation
by: Ladický, L, et al.
Published: (2009)

Prompting a pretrained transformer can be a universal approximator
by: Petrov, A, et al.
Published: (2024)

Spatio-temporal action instance segmentation and localisation
by: Saha, S, et al.
Published: (2020)