Path-Wise Attention Memory Network for Visual Question Answering
Visual question answering (VQA) is regarded as a multi-modal fine-grained feature fusion task, which requires the construction of multi-level and omnidirectional relations between nodes. One main solution is the composite attention model which is composed of co-attention (CA) and self-attention(SA)....
Main Authors: | , , , , , |
---|---|
Format: | Article |
Language: | English |
Published: |
MDPI AG
2022-09-01
|
Series: | Mathematics |
Subjects: | |
Online Access: | https://www.mdpi.com/2227-7390/10/18/3244 |