Transformer Module Networks for Systematic Generalization in Visual Question Answering
Transformer-based models achieve great performance on Visual Question Answering (VQA). However, when we evaluate them on systematic generalization, i.e., handling novel combinations of known concepts, their performance degrades. Neural Module Networks (NMNs) are a promising approach for systematic...
Main Authors: | , , , , |
---|---|
Format: | Article |
Published: | Center for Brains, Minds and Machines (CBMM), 2022 |
Online Access: | https://hdl.handle.net/1721.1/139843 |