Language-aware vision transformer for referring segmentation

Referring segmentation is a fundamental vision-language task that aims to segment out an object from an image or video in accordance with a natural language description. One of the key challenges behind this task is leveraging the referring expression for highlighting relevant positions in the image...

Бүрэн тодорхойлолт

Номзүйн дэлгэрэнгүй
Үндсэн зохиолчид: Yang, Z, Wang, J, Ye, X, Tang, Y, Chen, K, Zhao, H, Torr, PHS
Формат: Journal article
Хэл сонгох:English
Хэвлэсэн: IEEE 2024

Ижил төстэй зүйлс