Enhancing performance in video grounding tasks through the use of captions
This report explores enhancing video grounding tasks by utilizing generated captions, addressing the challenge posed by sparse annotations in video datasets. We took inspiration from the PCNet model, which uses caption-guided attention to fuse the captions generated by Parallel Dynamic Video Captioni...
Main Author: Liu, Xinran
Other Authors: Sun, Aixin
Format: Final Year Project (FYP)
Language: English
Published: Nanyang Technological University, 2024
Online Access: https://hdl.handle.net/10356/175356
Similar Items
- Enhancing performance in video grounding tasks through the use of attention module
  by: Do Duc Anh
  Published: (2024)
- Neural image and video captioning
  by: Lam, Ting En
  Published: (2024)
- Grounded semantic parsing using captioned videos
  by: Ross, Candace Cheronda
  Published: (2018)
- Caption-Guided Interpretable Video Anomaly Detection Based on Memory Similarity
  by: Yuzhi Shi, et al.
  Published: (2024-01-01)
- Neural tracking of phrases in spoken language comprehension is automatic and task-dependent
  by: Sanne ten Oever, et al.
  Published: (2022-07-01)