Dense video captioning based on local attention

Abstract: Dense video captioning aims to locate multiple events in an untrimmed video and generate captions for each event. Previous methods experienced difficulties in establishing the multimodal feature relationship between frames and captions, resulting in low accuracy of the generated captions. T...

Bibliographic Details
Main Authors:	Yong Qian, Yingchi Mao, Zhihao Chen, Chang Li, Olano Teah Bloh, Qian Huang
Format:	Article
Language:	English
Published:	Wiley 2023-07-01
Series:	IET Image Processing
Subjects:	2D temporal differential CNN; dense video captioning; event proposal; feature extraction; local attention
Online Access:	https://doi.org/10.1049/ipr2.12819

Similar Items