Spatial Encoding and Multi-layer Joint Encoding Enhanced Transformer for Image Captioning

Image captioning is an active research topic in computer vision. It is a cross-media data analysis task that combines computer vision and natural language processing. It describes an image by understanding its content and generating captions that are both semantically...

Bibliographic Details
Main Authors:	FANG Zhong-jun, ZHANG Jing, LI Dong-dong
Format:	Article
Language:	Chinese (zho)
Published:	Editorial office of Computer Science 2022-10-01
Series:	Jisuanji kexue
Subjects:	image captioning | transformer | spatial encoding mechanism | multi-level joint encoding mechanism | attention mechanism
Online Access:	https://www.jsjkx.com/fileup/1002-137X/PDF/1002-137X-2022-49-10-151.pdf

Similar Items

Exploring Spatial-Based Position Encoding for Image Captioning
by: Xiaobao Yang, et al.
Published: (2023-11-01)

Review of Image Captioning Methods Based on Encoding-Decoding Technology
by: GENG Yaogang, MEI Hongyan, ZHANG Xing, LI Xiaohui
Published: (2022-10-01)

Switching Text-Based Image Encoders for Captioning Images With Text
by: Arisa Ueda, et al.
Published: (2023-01-01)

Multimodal Abstractive Summarization using bidirectional encoder representations from transformers with attention mechanism
by: Dakshata Argade, et al.
Published: (2024-02-01)

Multi-Source Interactive Stair Attention for Remote Sensing Image Captioning
by: Xiangrong Zhang, et al.
Published: (2023-01-01)

An image caption model based on attention mechanism and deep reinforcement learning
by: Tong Bai, et al.
Published: (2023-10-01)

Image-Caption Model Based on Fusion Feature
by: Yaogang Geng, et al.
Published: (2022-09-01)

An attention-based hybrid deep learning approach for Bengali video captioning
by: Md. Shahir Zaoad, et al.
Published: (2023-01-01)

Privacy-Preserving Image Captioning with Deep Learning and Double Random Phase Encoding
by: Antoinette Deborah Martin, et al.
Published: (2022-08-01)

Full-Memory Transformer for Image Captioning
by: Tongwei Lu, et al.
Published: (2023-01-01)

Sequential Recommendation through Graph Neural Networks and Transformer Encoder with Degree Encoding
by: Shuli Wang, et al.
Published: (2021-08-01)

Attentive Generative Adversarial Network with Dual Encoder-Decoder for Shadow Removal
by: He Wang, et al.
Published: (2022-08-01)

Using Random Scrambling in Multi Media Encoding
by: Ghada Tahir Qasim
Published: (2013-03-01)

A Context Semantic Auxiliary Network for Image Captioning
by: Jianying Li, et al.
Published: (2023-07-01)

An Image Captioning Algorithm Based on Combination Attention Mechanism
by: Jinlong Liu, et al.
Published: (2022-04-01)

Spatially encoded polarization transfer for improving the quantitative aspect of 1H–13C HSQC
by: Bikash Baishya, et al.
Published: (2022-12-01)

Fine-Grained Image Recognition by Means of Integrating Transformer Encoder Blocks in a Robust Single-Stage Object Detector
by: Usman Ali, et al.
Published: (2023-06-01)

Cross Encoder-Decoder Transformer with Global-Local Visual Extractor for Medical Image Captioning
by: Hojun Lee, et al.
Published: (2022-02-01)

Voltage Sag Causes Recognition with Fusion of Sparse Auto-Encoder and Attention Unet
by: Rui Fan, et al.
Published: (2022-09-01)

Coastal Land Cover Classification of High-Resolution Remote Sensing Images Using Attention-Driven Context Encoding Network
by: Jifa Chen, et al.
Published: (2020-12-01)

Stylized Image Captioning Model Based on Disentangle-Retrieve-Generate
by: CHEN Zhang-hui, XIONG Yun
Published: (2022-06-01)

Modeling of Hyperparameter Tuned Deep Learning Model for Automated Image Captioning
by: Mohamed Omri, et al.
Published: (2022-01-01)

A Systematic Literature Review on Using the Encoder-Decoder Models for Image Captioning in English and Arabic Languages
by: Ashwaq Alsayed, et al.
Published: (2023-09-01)

Multi-Task Video Captioning with a Stepwise Multimodal Encoder
by: Zihao Liu, et al.
Published: (2022-08-01)

Enformer: Encoder-Based Sparse Periodic Self-Attention Time-Series Forecasting
by: Na Wang, et al.
Published: (2023-01-01)

Dual-Modal Transformer with Enhanced Inter- and Intra-Modality Interactions for Image Captioning
by: Deepika Kumar, et al.
Published: (2022-07-01)

Encoding and storage of information in mechanical metamaterials
by: Meng, Zhiqiang, et al.
Published: (2023)

Attention Guided Encoder-Decoder Network With Multi-Scale Context Aggregation for Land Cover Segmentation
by: Shuyang Wang, et al.
Published: (2020-01-01)

From Plane to Hierarchy: Deformable Transformer for Remote Sensing Image Captioning
by: Runyan Du, et al.
Published: (2023-01-01)

Sentence-CROBI: A Simple Cross-Bi-Encoder-Based Neural Network Architecture for Paraphrase Identification
by: Jesus-German Ortiz-Barajas, et al.
Published: (2022-09-01)

Age-related differences in neural activity for novelty and relational encoding of scenes
by: Leow, Wei Yang Dayton
Published: (2015)

Cross-Modal Retrieval and Semantic Refinement for Remote Sensing Image Captioning
by: Zhengxin Li, et al.
Published: (2024-01-01)

A 19-Bit Small Absolute Matrix Encoder
by: Liming Geng, et al.
Published: (2024-02-01)

PointMM: Point Cloud Semantic Segmentation CNN under Multi-Spatial Feature Encoding and Multi-Head Attention Pooling
by: Ruixing Chen, et al.
Published: (2024-03-01)

Multi-Attention Bottleneck for Gated Convolutional Encoder-Decoder-Based Speech Enhancement
by: Nasir Saleem, et al.
Published: (2023-01-01)

JSCC-Cast: A Joint Source Channel Coding Video Encoding and Transmission System with Limited Digital Metadata
by: Jose Balsa, et al.
Published: (2021-09-01)

Automated audio captioning: an overview of recent progress and new challenges
by: Xinhao Mei, et al.
Published: (2022-10-01)

Encoder–decoder-based image transformation approach for integrating multiple spatial forecasts
by: Hirotaka Hachiya, et al.
Published: (2023-06-01)

Insights into Object Semantics: Leveraging Transformer Networks for Advanced Image Captioning
by: Deema Abdal Hafeth, et al.
Published: (2024-03-01)

Using DNA to Encode Text Files
by: Yassin Ismail, et al.
Published: (2014-07-01)