Review of Visual Question Answering Technology
Visual question answering (VQA) is a popular cross-modal task that combines natural language processing and computer vision techniques. The main objective of this task is to enable computers to intelligently recognize and retrieve visual content and provide accurate answers. VQA involves the integr...
Main Author: | WANG Yu, SUN Haichun |
---|---|
Format: | Article |
Language: | zho |
Published: | Journal of Computer Engineering and Applications Beijing Co., Ltd., Science Press, 2023-07-01 |
Series: | Jisuanji kexue yu tansuo |
Subjects: | |
Online Access: | http://fcst.ceaj.org/fileup/1673-9418/PDF/2303025.pdf |
Similar Items
- A Comprehensive Review and Open Challenges on Visual Question Answering Models
  by: Fasi Ahamad Shaik, et al.
  Published: (2023-09-01)
- Improving visual question answering for remote sensing via alternate-guided attention and combined loss
  by: Jiangfan Feng, et al.
  Published: (2023-08-01)
- The multi-modal fusion in visual question answering: a review of attention mechanisms
  by: Siyu Lu, et al.
  Published: (2023-05-01)
- SBVQA 2.0: Robust End-to-End Speech-Based Visual Question Answering for Open-Ended Questions
  by: Faris Alasmary, et al.
  Published: (2023-01-01)
- A multi-scale contextual attention network for remote sensing visual question answering
  by: Jiangfan Feng, et al.
  Published: (2024-02-01)