Stack-VS : stacked visual-semantic attention for image caption generation

Recently, automatic image caption generation has been an important focus of work on multimodal translation tasks. Existing approaches can be roughly categorized into two classes, top-down and bottom-up: the former transfers the image information (called visual-level features) directly into a ca...

Bibliographic Details
Main Authors:	Cheng, Ling; Wei, Wei; Mao, Xianling; Liu, Yong; Miao, Chunyan
Other Authors:	School of Computer Science and Engineering
Format:	Journal Article
Language:	English
Published:	2021
Subjects:	Engineering::Computer science and engineering; Image Captioning; Recurrent Neural Network
Online Access:	https://hdl.handle.net/10356/148460
