A Multimodal Framework for Video Caption Generation

Video captioning is a challenging computer vision task: automatically describing video clips in natural language sentences that reflect a clear understanding of the embedded semantics. In this work, a video caption generation framework consisting of discrete wavelet convolutional neural arc...

Bibliographic Details
Main Authors: Reshmi S. Bhooshan, Suresh K.
Format: Article
Language: English
Published: IEEE 2022-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/9869626/