Transformer Module Networks for Systematic Generalization in Visual Question Answering

Transformer-based models achieve great performance on Visual Question Answering (VQA). However, when we evaluate them on systematic generalization, i.e., handling novel combinations of known concepts, their performance degrades. Neural Module Networks (NMNs) are a promising approach for systematic...

Bibliographic Details
Main Authors: Yamada, Moyuru; D'Amario, Vanessa; Takemoto, Kentaro; Boix, Xavier; Sasaki, Tomotake
Format: Article
Published: Center for Brains, Minds and Machines (CBMM) 2022
Online Access: https://hdl.handle.net/1721.1/139843