End‐to‐end visual grounding via region proposal networks and bilinear pooling
Phrase‐based visual grounding aims to localise the object in an image that is referred to by a textual query phrase. Most existing approaches adopt a two‐stage mechanism to address this problem: first, an off‐the‐shelf proposal generation model is used to extract region‐based visual features, and then a de...
Main Authors: | Chenchao Xiang, Zhou Yu, Suguo Zhu, Jun Yu, Xiaokang Yang |
---|---|
Format: | Article |
Language: | English |
Published: | Wiley, 2019-03-01 |
Series: | IET Computer Vision |
Subjects: | |
Online Access: | https://doi.org/10.1049/iet-cvi.2018.5104 |
Similar Items
- Japanese Neural Incremental Text-to-Speech Synthesis Framework With an Accent Phrase Input
  by: Tomoya Yanagita, et al.
  Published: (2023-01-01)
- Type shifting and the number system in Persian
  by: Amirmohammad Shirzad, et al.
  Published: (2023-03-01)
- Determiner Phrase in Persian
  by: Reza Sahraei
  Published: (2010-11-01)
- Phrase mineure et phrase passive topoké
  by: Collard LIMBOMBE LIANDJA
  Published: (2023-02-01)
- Pride and Discretion
  by: Jean-Jacques Lecercle
  Published: (2017-09-01)