Debiasing visual question answering with answer preference


Bibliographic Details
Main Author: Zhang, Xinye
Other Authors: Zhang, Hanwang
Format: Final Year Project (FYP)
Language: English
Published: Nanyang Technological University, 2020
Subjects:
Online Access:https://hdl.handle.net/10356/137906
Description
Summary: Visual Question Answering (VQA) requires models to generate a reasonable answer given an image and a corresponding question. This demands strong reasoning over two kinds of input features, namely the image and the question. However, most state-of-the-art results rely heavily on superficial correlations in the dataset, since delicately balancing the dataset is almost impossible. In this paper, we propose a simple method that uses answer preference to reduce the impact of data bias and improve the robustness of VQA models against prior changes. Two pipelines for using answer preference, one at the training stage and one at the inference stage, are experimented with, and both achieve genuine improvements on the VQA-CP dataset, which is designed to test the performance of VQA models under domain shift.
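The record does not include implementation details, so the following is only a minimal illustrative sketch of the general idea behind inference-stage answer-preference debiasing. It assumes (hypothetically) that the "answer preference" is the answer prior estimated from training-set answer frequencies, and that model predictions are re-weighted against that prior; the function names, the `alpha` parameter, and the toy data are all invented for illustration and are not taken from the thesis.

```python
# Hypothetical sketch of inference-stage debiasing by answer preference.
# Not the thesis's actual method: we assume the "preference" is the
# training-set answer prior, and divide model scores by it so that
# answers the dataset over-prefers are down-weighted.
from collections import Counter


def answer_prior(train_answers):
    """Estimate P(answer) from training-set answers (hypothetical helper)."""
    counts = Counter(train_answers)
    total = sum(counts.values())
    return {a: c / total for a, c in counts.items()}


def debias_scores(model_scores, prior, alpha=1.0):
    """Re-weight model probabilities against the answer prior.

    model_scores: dict mapping answer -> model probability
    alpha: strength of the prior correction (invented parameter)
    """
    adjusted = {a: s / (prior.get(a, 1e-8) ** alpha)
                for a, s in model_scores.items()}
    norm = sum(adjusted.values())
    return {a: v / norm for a, v in adjusted.items()}


# Toy example: training answers are biased toward "yes".
train_answers = ["yes", "yes", "yes", "no"]
prior = answer_prior(train_answers)        # P(yes)=0.75, P(no)=0.25
scores = {"yes": 0.6, "no": 0.4}           # raw model output
print(debias_scores(scores, prior))        # "no" gains relative weight
```

Under a changed test-time prior (as in VQA-CP), dividing out the training prior of this kind prevents the model from defaulting to the majority training answer; the training-stage pipeline mentioned in the summary would instead fold such a correction into the loss, which this sketch does not attempt.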