Insights into Object Semantics: Leveraging Transformer Networks for Advanced Image Captioning
Image captioning is a technique used to generate descriptive captions for images. Typically, it involves employing a Convolutional Neural Network (CNN) as the encoder to extract visual features, and a decoder model, often based on Recurrent Neural Networks (RNNs), to generate the captions. Recently,...
Main Authors: | Deema Abdal Hafeth, Stefanos Kollias |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2024-03-01 |
Series: | Sensors |
Subjects: | |
Online Access: | https://www.mdpi.com/1424-8220/24/6/1796 |
Similar Items
- Semantic Representations With Attention Networks for Boosting Image Captioning
  by: Deema Abdal Hafeth, et al.
  Published: (2023-01-01)
- An Analysis of the Use of Feed-Forward Sub-Modules for Transformer-Based Image Captioning Tasks
  by: Raymond Ian Osolo, et al.
  Published: (2021-12-01)
- A Context Semantic Auxiliary Network for Image Captioning
  by: Jianying Li, et al.
  Published: (2023-07-01)
- An Attentive Fourier-Augmented Image-Captioning Transformer
  by: Raymond Ian Osolo, et al.
  Published: (2021-09-01)
- Structure Preserving Convolutional Attention for Image Captioning
  by: Shichen Lu, et al.
  Published: (2019-07-01)