Image caption generation using Visual Attention Prediction and Contextual Spatial Relation Extraction

Abstract: Automatic caption generation with attention mechanisms aims at generating more descriptive captions containing coarser to finer semantic contents in the image. In this work, we use an encoder-decoder framework employing Wavelet transform based Convolutional Neural Network (WCNN) with two le...

Bibliographic Details
Main Authors: Reshmi Sasibhooshan, Suresh Kumaraswamy, Santhoshkumar Sasidharan
Format: Article
Language: English
Published: SpringerOpen 2023-02-01
Series: Journal of Big Data
Online Access: https://doi.org/10.1186/s40537-023-00693-9