Enhancing semantics with multi‐objective reinforcement learning for video description

Abstract Video description is challenging due to the high complexity of translating visual content into language. In most popular attention‐based pipelines for this task, visual features and previously generated words are usually concatenated as a vector to predict the current word. However, the err...

Full description

Bibliographic Details
Main Authors:	Qinyu Li, Longyu Yang, Pengjie Tang, Hanli Wang
Format:	Article
Language:	English
Published:	Wiley 2021-12-01
Series:	Electronics Letters
Subjects:	Image recognition Computer vision and image processing techniques Video signal processing Document processing and analysis techniques Natural language processing Neural nets
Online Access:	https://doi.org/10.1049/ell2.12334

Internet

https://doi.org/10.1049/ell2.12334

Enhancing semantics with multi‐objective reinforcement learning for video description

Internet

Similar Items