Object sequences: encoding categorical and spatial information for a yes/no visual question answering task

The task of visual question answering (VQA) has gained wide popularity in recent times. Effectively solving the VQA task requires the understanding of both the visual content in the image and the language information associated with the text‐based question. In this study, the authors propose a novel...

Bibliographic Details
Main Authors:	Shivam Garg, Rajeev Srivastava
Format:	Article
Language:	English
Published:	Wiley 2018-12-01
Series:	IET Computer Vision
Subjects:	object sequences; spatial object information encoding; categorical object information encoding; yes-no visual question answering task; VQA task; language information
Online Access:	https://doi.org/10.1049/iet-cvi.2018.5226

Similar Items