Full-Memory Transformer for Image Captioning
Transformer-based approaches represent the state of the art in image captioning. However, existing studies have shown that the Transformer suffers from a problem in which irrelevant tokens with overlapping neighbors incorrectly attend to each other with relatively large attention scores. We believe that this limitatio...
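The issue described in the abstract concerns standard scaled dot-product self-attention, where softmax normalization guarantees that every token receives a nonzero weight, so weakly related tokens can still obtain sizable attention scores. Below is a minimal NumPy sketch of that vanilla mechanism for illustration only; the function name and toy data are assumptions and do not reproduce the paper's full-memory model.

```python
# Minimal sketch of vanilla scaled dot-product self-attention (illustrative,
# not the Full-Memory Transformer from the article).
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Return outputs and attention weights for a token sequence x (n, d)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])            # pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax rows sum to 1
    return weights @ v, weights

rng = np.random.default_rng(0)
d = 8
x = rng.normal(size=(5, d))                            # 5 toy tokens
_, attn = self_attention(x, *(rng.normal(size=(d, d)) for _ in range(3)))
# Every entry is strictly positive: even unrelated token pairs end up
# attending to each other, which is the behavior the abstract criticizes.
print(attn.round(2))
```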
| Main Authors | Tongwei Lu, Jiarong Wang, Fen Min |
|---|---|
| Format | Article |
| Language | English |
| Published | MDPI AG, 2023-01-01 |
| Series | Symmetry |
| Subjects | |
| Online Access | https://www.mdpi.com/2073-8994/15/1/190 |
Similar Items
- Multi-Gate Attention Network for Image Captioning, by Weitao Jiang, et al. Published: (2021-01-01)
- From Plane to Hierarchy: Deformable Transformer for Remote Sensing Image Captioning, by Runyan Du, et al. Published: (2023-01-01)
- Insights into Object Semantics: Leveraging Transformer Networks for Advanced Image Captioning, by Deema Abdal Hafeth, et al. Published: (2024-03-01)
- An Attentive Fourier-Augmented Image-Captioning Transformer, by Raymond Ian Osolo, et al. Published: (2021-09-01)
- Hybrid Attention Distribution and Factorized Embedding Matrix in Image Captioning, by Jian Wang, et al. Published: (2020-01-01)