A Video Question Answering Model Based on Knowledge Distillation
Video question answering (QA) is a cross-modal task that requires understanding the video content to answer questions. Current techniques address this challenge by employing stacked modules, such as attention mechanisms and graph convolutional networks. These methods reason about the semantics of vi...
Main Authors: | Zhuang Shao, Jiahui Wan, Linlin Zong |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-06-01 |
Series: | Information |
Subjects: | |
Online Access: | https://www.mdpi.com/2078-2489/14/6/328 |
Similar Items
- TASTA: Text‐Assisted Spatial and Temporal Attention Network for Video Question Answering
  by: Tian Wang, et al.
  Published: (2023-04-01)
- Language Bias-Driven Self-Knowledge Distillation with Generalization Uncertainty for Reducing Language Bias in Visual Question Answering
  by: Desen Yuan, et al.
  Published: (2022-07-01)
- Advancements in Complex Knowledge Graph Question Answering: A Survey
  by: Yiqing Song, et al.
  Published: (2023-10-01)
- Survey of Question Answering Based on Knowledge Graph Reasoning
  by: SA Rina, LI Yanling, LIN Min
  Published: (2022-08-01)
- Survey of Multimodal Medical Question Answering
  by: Hilmi Demirhan, et al.
  Published: (2023-12-01)