Hierarchical interaction network for video object segmentation from referring expressions

In this paper, we investigate the problem of video object segmentation from referring expressions (VOSRE). Conventional methods typically perform multi-modal fusion based on linguistic features and the visual features extracted from the top layer of the visual encoder, which limits these models'...

Bibliographic details
Main Authors:	Yang, Z, Tang, Y, Bertinetto, L, Zhao, H, Torr, PHS
Format:	Conference item
Language:	English
Published:	British Machine Vision Association 2021
