Parallel Dense Video Caption Generation with Multi-Modal Features

The task of dense video captioning is to generate detailed natural-language descriptions for an original video, which requires deep analysis and mining of semantic captions to identify events in the video. Existing methods typically follow a localisation-then-captioning sequence within given frame s...

Bibliographic Details
Main Authors:	Xuefei Huang, Ka-Hou Chan, Wei Ke, Hao Sheng
Format:	Article
Language:	English
Published:	MDPI AG 2023-08-01
Series:	Mathematics
Subjects:	dense video caption; video captioning; multimodal feature fusion; feature extraction; neural network
Online Access:	https://www.mdpi.com/2227-7390/11/17/3685
