Visual Question Answering Method Based on Counterfactual Thinking

Visual question answering(VQA) is a multi-modal task that combines computer vision and natural language proces-sing,which is extremely challenging.However,the current VQA model is often misled by the apparent correlation in the data,and the output of the model is directly guided by language bias.Man...

Full description

Bibliographic Details
Main Author: YUAN De-sen, LIU Xiu-jing, WU Qing-bo, LI Hong-liang, MENG Fan-man, NGAN King-ngi, XU Lin-feng
Format: Article
Language:zho
Published: Editorial office of Computer Science 2022-12-01
Series:Jisuanji kexue
Subjects:
Online Access:https://www.jsjkx.com/fileup/1002-137X/PDF/1002-137X-2022-49-12-229.pdf