Learning deep networks for image classification

Visual Question Answering stands at the intersection of computer vision and natural language processing, bridging the semantic gap between visual information and textual queries. The dominant approach for this complex task, end-to-end models, do not demonstrate the difference between visual processi...

Full description

Bibliographic Details
Main Author: Zhou, Yixuan
Other Authors: Hanwang Zhang
Format: Final Year Project (FYP)
Language:English
Published: Nanyang Technological University 2024
Subjects:
Online Access:https://hdl.handle.net/10356/175074