Stack-VS: stacked visual-semantic attention for image caption generation
Recently, automatic image caption generation has become an important focus of work on the multimodal translation task. Existing approaches can be roughly categorized into two classes, top-down and bottom-up: the former transfers the image information (called the visual-level feature) directly into a ca...
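The abstract's idea of attending over both visual-level features and semantic-level attributes, stacked so one attention conditions the next, can be illustrated with a toy sketch. This is not the paper's actual Stack-VS architecture; the shapes, the dot-product scoring, and the concatenation fusion below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def attend(query, keys):
    """Soft attention: weight each key by the softmax of its dot product with the query."""
    scores = keys @ query                       # one score per key, shape (n,)
    weights = np.exp(scores - scores.max())     # numerically stable softmax
    weights /= weights.sum()
    return weights @ keys                       # context vector, shape (d,)

d = 8
hidden = rng.normal(size=d)                     # decoder hidden state (assumed)
visual = rng.normal(size=(5, d))                # 5 region features (visual level)
semantic = rng.normal(size=(3, d))              # 3 attribute embeddings (semantic level)

# Stack the two attentions: attend over visual features first, then use the
# resulting visual context to re-query the semantic attributes.
v_ctx = attend(hidden, visual)
s_ctx = attend(v_ctx, semantic)
fused = np.concatenate([v_ctx, s_ctx])          # fed to the word decoder at each step
print(fused.shape)                              # (16,)
```

A real captioner would use learned projection matrices and an LSTM decoder; the sketch only shows how a visual attention result can serve as the query for a second, semantic attention.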
Main Authors: Cheng, Ling; Wei, Wei; Mao, Xianling; Liu, Yong; Miao, Chunyan
Other Authors: School of Computer Science and Engineering
Format: Journal Article
Language: English
Published: 2021
Online Access: https://hdl.handle.net/10356/148460
Similar Items
- Stack-VS: Stacked Visual-Semantic Attention for Image Caption Generation
  by: Ling Cheng, et al. Published: (2020-01-01)
- Video captioning with stacked attention and semantic hard pull
  by: Md. Mushfiqur Rahman, et al. Published: (2021-08-01)
- Video Captioning Based on Channel Soft Attention and Semantic Reconstructor
  by: Zhou Lei, et al. Published: (2021-02-01)
- Novel Object Captioning with Semantic Match from External Knowledge
  by: Sen Du, et al. Published: (2023-07-01)
- VAA: Visual Aligning Attention Model for Remote Sensing Image Captioning
  by: Zhengyuan Zhang, et al. Published: (2019-01-01)