Counterfactual samples synthesizing and training for robust visual question answering

Today's VQA models still tend to capture superficial linguistic correlations in the training set and fail to generalize to the test set with different QA distributions. To reduce these language biases, recent VQA works introduce an auxiliary question-only model to regularize the training of tar...

Full description

Bibliographic Details
Main Authors: Chen, Long, Zheng, Yuhang, Niu, Yulei, Zhang, Hanwang, Xiao, Jun
Other Authors: School of Computer Science and Engineering
Format: Journal Article
Language:English
Published: 2023
Subjects:
Online Access:https://hdl.handle.net/10356/171830